Day 30 - Context Window in LLMs: How Much Can AI Remember?
Context:
After learning about Tokenizers in my previous post, I wondered: does an AI model remember the entire conversation in a chat? The answer is no. It has a memory limit called the Context Window.

What I Learned:
Context Window: the maximum number of tokens a model can process at once. It acts as the model's short-term memory for a single interaction.

What Fits in the Context Window:
- Input prompt
- Model's response
- Chat history

Fixed Size per Model:
- GPT-4o: 128,000 tokens
- GPT-4-Turbo: 128,000 tokens
- Claude 3 Opus/Sonnet: 200,000 tokens
- Older models (e.g., GPT-3.5): 4,096 tokens

Pro Tip: Calculate cost from tokens.
Example:
- Total tokens: 470
- GPT-4-Turbo input price: $10 per 1M tokens
- Cost = (470 / 1,000,000) * $10 = $0.0047
The chat cost about half a cent!

Why It Matters for QA / AI Testing:
- Context window limits affect how much history the model can consider when generating responses.
- Testers need to design prompts and workflows that fit within token limits for consistent results.
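The cost arithmetic above is easy to wrap in a tiny helper. A minimal sketch (the function name and the example figures are just the ones from this post, not an official API):

```python
def prompt_cost(tokens: int, price_per_million: float) -> float:
    """Cost in dollars for a given token count at a per-1M-token price."""
    return tokens / 1_000_000 * price_per_million

# Worked example from above: 470 tokens at GPT-4-Turbo's $10 / 1M input rate.
print(f"${prompt_cost(470, 10):.4f}")  # → $0.0047
```

Note that input and output tokens are usually priced differently, so a real cost report would apply this helper twice, once per rate.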
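One common way to keep a long chat within the token budget is to drop the oldest messages first. Here is a minimal sketch, assuming a pluggable token counter; the word-count proxy below is only an approximation (a real tokenizer such as tiktoken would be more accurate):

```python
def trim_history(messages, max_tokens, count_tokens):
    """Keep the most recent messages whose combined token count fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest to oldest
        t = count_tokens(msg)
        if total + t > max_tokens:
            break  # budget exhausted; older messages are dropped
        kept.append(msg)
        total += t
    return list(reversed(kept))  # restore chronological order

# Rough proxy: ~1 token per word (an assumption, not a real tokenizer).
approx = lambda text: len(text.split())

history = ["hello there", "how are you", "tell me about context windows"]
print(trim_history(history, 8, approx))
# → ['how are you', 'tell me about context windows']
```

For QA workflows, a trim step like this makes test runs deterministic: the model always sees the same bounded slice of history instead of silently truncating somewhere in the middle.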