Posts

Latest Post

Day 30 - Context Window in LLMs: How Much Can AI Remember?

Context: After learning about Tokenizers in my previous post, I wondered: does an AI model remember the entire conversation in a chat? The answer is no. It has a memory limit called the Context Window.

What I Learned:
- Context Window: the maximum number of tokens a model can process at once. It acts as the model’s short-term memory for a single interaction.
- What fits in the context window: the input prompt, the model’s response, and the chat history.
- Fixed size per model:
  - GPT-4o: 128,000 tokens
  - GPT-4-Turbo: 128,000 tokens
  - Claude 3 Opus/Sonnet: 200,000 tokens
  - Older models (e.g., GPT-3.5): 4,096 tokens

Pro Tip: Calculate cost from tokens (a small code sketch follows this post). Example:
- Total tokens: 470
- GPT-4-Turbo input price: $10 per 1M tokens
- Cost = (470 / 1,000,000) * $10 = $0.0047

The chat cost about half a cent!

Why It Matters for QA / AI Testing:
- Context window limits affect how much history the model can consider when generating responses.
- Testers need to design prompts and workflows that fit within token limits for consistent results...
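To make the cost arithmetic reusable, here is a minimal Python sketch of the same calculation. The $10-per-1M-token rate mirrors the example above; for a real estimate, substitute your provider’s current input pricing.

```python
def input_cost_usd(token_count: int, price_per_million_tokens: float) -> float:
    """Estimate the input cost of a request from its token count."""
    return (token_count / 1_000_000) * price_per_million_tokens

# The example from the post: 470 tokens at $10 per 1M input tokens.
print(f"${input_cost_usd(470, 10.0):.4f}")  # $0.0047 -- about half a cent
```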

Day 29 - How LLMs See Text: The Role of Tokenization

Context: I always wondered how AI models like ChatGPT “read” text. Turns out, they don’t read words like humans do; they see numbers.

What I Learned:
- Tokenizer: a core component of every LLM. It converts words into tokens (pieces of words) that the model can process as numbers.
- Why it matters: every LLM API cost is based on tokens, not words. More tokens = higher cost. A prompt that costs $0.01 with one model might cost $0.015 with another due to tokenization differences.
- Efficiency tip: writing shorter, well-structured prompts reduces token count and cost.
- Resource: OpenAI provides a great tool to visualize tokenization: Tokenizer Tool (a code sketch follows this post).

Why It Matters for QA / AI Testing:
- Tokenization impacts cost and performance in AI-driven testing workflows.
- Testers need to optimize prompts for efficiency without losing clarity.
- Understanding tokenization helps predict API usage and budget planning.

My Takeaway: LLMs don’t see words; they see tokens. Efficient prompt design saves cost and improves ...
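To see tokenization in action programmatically, here is a small sketch using OpenAI’s open-source tiktoken library, which exposes the same encodings the Tokenizer Tool visualizes. The cl100k_base encoding is assumed for illustration; match the encoding to the model you actually call.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by GPT-4-era models (assumed here).
enc = tiktoken.get_encoding("cl100k_base")

text = "LLMs don't see words; they see tokens."
tokens = enc.encode(text)

print(tokens)       # the integer token IDs the model actually "sees"
print(len(tokens))  # token count -- this, not the word count, drives API cost
```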

Day 28 - Unlock AI’s Potential Without Changing Code: The Power of Prompt Engineering

Context: We’ve discussed Fine-Tuning and RAG for improving AI responses. But there’s a third, simpler way: Prompt Engineering, which is mastering how we ask questions.

What I Learned:
- Prompt Engineering is like giving the model a detailed map instead of just a destination.
- It guides the model’s internal attention mechanisms to focus on the most relevant patterns learned during training.

Example (a runnable sketch of sending the engineered prompt follows this post):
❌ Weak Prompt: Write test cases for a login page.
✅ Engineered Prompt: Act as an expert QA engineer with 10 years of experience in security and usability testing. Your task is to generate a comprehensive test suite for a standard web login screen with the following fields: Username, Password, a 'Remember Me' checkbox, and a 'Forgot Password?' link.

Why Use Prompt Engineering?
✅ No infrastructure changes.
✅ Iterate and get output in seconds.
✅ No additional training or data.

Challenge: Requires creativity and multiple iterations to get the best answer.

Why It Matters for QA / AI Test...
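As a sketch of how the engineered prompt above might be sent in practice, here is a minimal example using the official openai Python client. The model name and the split of the persona into a system message are assumptions for illustration, not part of the original post.

```python
# pip install openai  (assumes OPENAI_API_KEY is set in the environment)
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # model choice is an assumption
    messages=[
        # The persona ("act as...") goes in the system message.
        {
            "role": "system",
            "content": "Act as an expert QA engineer with 10 years of "
                       "experience in security and usability testing.",
        },
        # The concrete task goes in the user message.
        {
            "role": "user",
            "content": "Generate a comprehensive test suite for a standard web "
                       "login screen with the following fields: Username, Password, "
                       "a 'Remember Me' checkbox, and a 'Forgot Password?' link.",
        },
    ],
)

print(response.choices[0].message.content)
```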

Day 27 - Unlocking Specialized AI Models with Fine-Tuning

Context: When we have a powerful general-purpose LLM like GPT-4 but need deep expertise in a specific domain, the solution is Fine-Tuning. I explored how this technique works and why it matters.

What I Learned:
- Fine-Tuning: retrains a base model to make it an expert in a specific field by adding training on a focused, specialized dataset.
- How it works: the base model has broad knowledge; fine-tuning adjusts its internal weights using supervised learning with input-output pairs (a sketch of such pairs follows this post).
- Example: to build a legal assistant, train the model on thousands of legal documents.
- Advantage over RAG: the model gains real domain-specific knowledge without maintaining a separate vector database.

Challenges:
- Requires large labeled datasets.
- Needs more GPUs and compute resources.
- May require full retraining and ongoing maintenance.
- Risk of losing general capabilities while specializing.

Why It Matters for QA / AI Testing:
- Fine-tuned models can deliver highly accurate results for domain-specific testing.
- Testers must...
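To make “supervised learning with input-output pairs” concrete, here is a minimal sketch of how such pairs are commonly prepared as a JSONL file, following the chat-style format that OpenAI’s fine-tuning API accepts. The example pair and the file name are hypothetical.

```python
import json

# Hypothetical input-output pairs for a legal-assistant fine-tune.
training_pairs = [
    {
        "messages": [
            {"role": "system", "content": "You are a legal research assistant."},
            {"role": "user", "content": "What does 'force majeure' mean in a contract?"},
            {
                "role": "assistant",
                "content": "A force majeure clause excuses a party from performing "
                           "its obligations when extraordinary events beyond its "
                           "control make performance impossible.",
            },
        ]
    },
    # ...a real dataset would contain thousands of such domain-specific examples.
]

# Fine-tuning services typically expect one JSON object per line (JSONL).
with open("legal_assistant_train.jsonl", "w") as f:
    for pair in training_pairs:
        f.write(json.dumps(pair) + "\n")
```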

Day 26 - Making AI Models Smarter with RAG (Retrieval-Augmented Generation)

Context: I was curious about what happens when you ask an LLM, “Who is Srinivas Kadiyala?” The responses vary across models because they depend on training data and knowledge cutoff. This made me wonder: how can we make AI models smarter, more accurate, and up-to-date?

What I Learned:
- LLM responses are limited by their training data and knowledge cutoff dates.
- To overcome this, we can use RAG (Retrieval-Augmented Generation).

RAG Workflow (a minimal code sketch follows this post):
- Retrieves real-time information from external sources (e.g., web search, APIs).
- Augments the model’s static knowledge with fresh data.
- Generates responses that are more reliable and relevant.

RAG doesn’t just rely on what the model was trained on; it combines retrieval + generation for better accuracy.

Why It Matters for QA / AI Testing:
- Ensures AI-driven testing tools provide up-to-date results.
- Reduces hallucinations caused by outdated or incomplete knowledge.
- Critical for scenarios where real-time data validation is required.

My Takeaway: RA...
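Here is a minimal sketch of the retrieve, augment, generate loop described above. The retrieve() function is a hypothetical stand-in for whatever search backend you use (web search, an internal API, or a vector store), and the model name is assumed for illustration.

```python
from openai import OpenAI

client = OpenAI()

def retrieve(query: str) -> list[str]:
    """Hypothetical retrieval step: swap in web search, an internal API,
    or a vector-store lookup in a real system."""
    return ["<fresh document text relevant to the query>"]

def rag_answer(question: str) -> str:
    # 1. Retrieve: fetch up-to-date, external context for the question.
    context = "\n".join(retrieve(question))
    # 2. Augment: combine the fresh context with the user's question.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # 3. Generate: let the model produce a grounded response.
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model for illustration
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(rag_answer("Who is Srinivas Kadiyala?"))
```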