Day 25 - Why Trying Different AI Models Matters: Lessons from a Real Task
Context:
Initially, I thought all LLMs were pretty much the same. But after testing them on a real-world task, I realized the choice of model can significantly impact the quality of results.
What I Learned:
The Task:
Summarize the YouTube video:
“Software Testing A Web Application - Case Study - complete course video by EvilTester.”
I used the same prompt across different AI models:
- OpenAI:
  - Searched the web, found the YouTube link, and provided a solid summary including:
    - Course overview
    - Why the case study matters
    - Actionable next steps
- DeepSeek:
  - Used its web search feature and returned a detailed summary with:
    - Key objectives
    - Course structure
    - Testing techniques
    - Additional resources
- Microsoft Copilot:
  - Pulled from various sources and provided:
    - Course overview
    - Key concepts
    - Learning outcomes
- Gemini 2.5 Flash:
  - Accessed the YouTube video directly (Google owns YouTube) and gave:
    - Course structure
    - Key concepts
  - Surprisingly, its summary was weaker than the others'.
- Llama-3-70b-Groq:
  - Provided a summary from the video with:
    - Overview
    - Module-wise takeaways
Why It Matters for QA / AI Testing:
- Different models handle the same task differently: accuracy, depth, and use of context all vary.
- Choosing the right model isn’t just technical; it’s strategic for productivity and quality.
- Tools like poe.com make it easy to compare responses across multiple models.
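
For anyone who wants to repeat this kind of comparison outside a chat UI, here is a minimal sketch of the idea: send the same prompt to several models and check which expected topics each summary mentions. Everything in it is an assumption for illustration. The `stub_model_a` and `stub_model_b` functions are hypothetical stand-ins for real model clients, and keyword matching is only a crude proxy for summary quality, not how I scored the results above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real model clients (assumption: each one
# takes a prompt string and returns the model's text response).
def stub_model_a(prompt: str) -> str:
    return "Course overview. Key concepts covered. Actionable next steps."

def stub_model_b(prompt: str) -> str:
    return "Course structure and key concepts."

def run_prompt(prompt: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send the same prompt to every model and collect the responses."""
    return {name: ask(prompt) for name, ask in models.items()}

def topic_coverage(response: str, topics: List[str]) -> float:
    """Fraction of expected topics mentioned in the response (case-insensitive)."""
    if not topics:
        return 0.0
    text = response.lower()
    hits = [topic for topic in topics if topic.lower() in text]
    return len(hits) / len(topics)

def rank_models(
    prompt: str,
    models: Dict[str, Callable[[str], str]],
    topics: List[str],
) -> List[Tuple[str, float]]:
    """Rank models by topic coverage, highest first."""
    responses = run_prompt(prompt, models)
    scores = {name: topic_coverage(text, topics) for name, text in responses.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    prompt = "Summarize the YouTube video: Software Testing A Web Application - Case Study"
    models = {"model_a": stub_model_a, "model_b": stub_model_b}
    topics = ["course overview", "key concepts", "next steps"]
    for name, score in rank_models(prompt, models, topics):
        print(f"{name}: {score:.2f}")
```

A score like this only tells you which summaries touch the topics you expected. You still need to read the responses yourself to judge accuracy and depth.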
My Takeaway:
Model selection matters. Test before you trust — the right AI model can make or break your workflow.