Day 22 - Why Do LLMs Hallucinate? The Role of Token Bias
Context:
I used to wonder about the term “hallucinations” in AI:
- How do they happen?
- What causes them?
- Is it just how the model works, or is there a deeper reason?
After reading and reflecting, I found two key concepts that explain the behaviour:
What I Learned:
- Token Bias:
- Models tend to choose certain words more often than others based on training patterns, not necessarily logic.
- When given a prompt, the model calculates the most likely next word. Sometimes, it picks a word because it’s “used to” it, not because it’s correct.
- Example: given the prompt “5 of the mangoes are smaller out of 10 mangoes”, the word “smaller” might nudge the model toward “subtract” or “minus,” even if that’s not what was intended (see the sketch after this list).
- Hallucination:
- The visible mistake in the output: the model confidently states something that is wrong or made up.
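To make the token-bias idea concrete, here is a minimal Python sketch. The probability table is invented purely for illustration (a real LLM derives these scores from its training data); it simply shows how greedy decoding always picks the statistically most likely next token, whether or not that token is logically correct.

```python
# Token bias in miniature: greedy decoding always takes the highest-probability
# token. The numbers below are made up for illustration, not from a real model.
next_token_probs = {
    "subtract": 0.46,  # "smaller" co-occurs with subtraction in lots of training text
    "compare": 0.31,   # the operation the prompt actually calls for
    "divide": 0.14,
    "add": 0.09,
}

def greedy_next_token(probs: dict[str, float]) -> str:
    """Return the single most likely next token (greedy decoding)."""
    return max(probs, key=probs.get)

prompt = "5 of the mangoes are smaller out of 10 mangoes, so we should"
print(f"{prompt} ... '{greedy_next_token(next_token_probs)}'")
# -> 'subtract': statistically likely, not necessarily correct
```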
Key Insight:
Statistical Pattern Matching → Token Bias → Hallucinations
Hallucinations are not random; they stem from how models learn and predict.
Why It Matters for QA / AI Testing:
- Understanding token bias helps testers design prompts that reduce ambiguity.
- Knowing the root cause of hallucinations helps when validating AI outputs against a known ground truth (see the check sketched after this list).
- Essential for testing AI in critical workflows where accuracy matters.
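As a small illustration of the validation point above, here is a sketch of a ground-truth check a tester might write. `ask_model` is a hypothetical placeholder for whatever LLM client is under test; the real point is computing the expected answer independently of the model and flagging any mismatch as a possible hallucination.

```python
def ask_model(prompt: str) -> str:
    # Hypothetical placeholder: in a real test this would call the model under test.
    return "5"  # pretend the model answered with a subtraction result

def check_mango_total() -> None:
    prompt = "5 of the mangoes are smaller out of 10 mangoes. How many mangoes are there in total?"
    expected = "10"                     # ground truth computed outside the model
    actual = ask_model(prompt).strip()
    if actual != expected:
        print(f"Possible hallucination: expected {expected}, got {actual}")
    else:
        print("Answer matches ground truth")

check_mango_total()
```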
My Takeaway:
Hallucinations aren’t magic errors — they’re a byproduct of statistical learning and probabilistic prediction.