Day 3 - Understanding Quartiles in Pandas vs Traditional Math

Context:

While learning Pandas, I stumbled upon an interesting difference between how it calculates quartiles and the method we learned in school.


What I Learned:

  • In school, we calculated quartiles by hand: sort the data, split it into a lower and an upper half at the median, then take the median of each half as Q1 and Q3.
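  • In Pandas, describe() reports the quartiles as the 25%, 50% and 75% rows, and it relies on NumPy's percentile logic to compute them.
  • That logic uses linear interpolation by default (interpolation="linear" in Pandas, method="linear" in recent NumPy versions), so the results can differ from the manual method, most noticeably on small datasets.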
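
To make the difference concrete, here is a minimal sketch comparing the two approaches on a nine-value sample. The helper name school_quartiles and the sample data are my own, purely for illustration, and the helper assumes the school convention where the middle value is left out of both halves when the count is odd (some textbooks include it).

```python
import statistics

import pandas as pd


def school_quartiles(values):
    """Median-of-halves quartiles (hypothetical helper for illustration).

    Assumes the convention that the middle value is excluded from both
    halves when the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
    )


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

manual_q1, manual_q2, manual_q3 = school_quartiles(data)

# describe() labels the quartiles "25%", "50%" and "75%"
summary = pd.Series(data).describe()
pandas_q1, pandas_q2, pandas_q3 = summary["25%"], summary["50%"], summary["75%"]

print("school:", manual_q1, manual_q2, manual_q3)  # school: 2.5 5 7.5
print("pandas:", pandas_q1, pandas_q2, pandas_q3)  # pandas: 3.0 5.0 7.0
```

Both methods agree on the median, but Q1 and Q3 land on different values, and neither answer is wrong. They simply follow different conventions.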

Why It Matters for QA / AI Testing:

  • A different quartile convention can shift any threshold built on quartiles, which affects data validation and how data for AI models is interpreted.
  • Knowing how a library computes a metric helps keep tests and reports accurate.
  • It also avoids confusion when a manual calculation does not match the automated output.
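
For a test suite, one option is to state the quartile convention explicitly instead of relying on the default. The sketch below is only an illustration under my own assumptions: linear_quantile is a hypothetical helper that hand-computes linear interpolation at position (n - 1) * q, and the scores list is made-up sample data. Series.quantile and its interpolation parameter are the real Pandas API.

```python
import math

import pandas as pd


def linear_quantile(values, q):
    """Hand-rolled linear interpolation (hypothetical helper).

    Finds the fractional position (n - 1) * q in the sorted data and
    interpolates between the two neighbouring values.
    """
    data = sorted(values)
    pos = (len(data) - 1) * q
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


scores = [10, 20, 30, 40]  # made-up sample data

# Position is 3 * 0.25 = 0.75, so Q1 = 10 + (20 - 10) * 0.75 = 17.5
expected_q1 = linear_quantile(scores, 0.25)

# Name the interpolation explicitly so the test documents the convention
actual_q1 = pd.Series(scores).quantile(0.25, interpolation="linear")

assert math.isclose(actual_q1, expected_q1)
print(actual_q1)  # 17.5
```

The school method would give 15 here (the median of 10 and 20), so an expected value copied from a manual calculation would fail this check even though the data is fine.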

My Takeaway:

Pandas is optimized for large-scale statistics, but knowing these subtle differences is key for accurate analysis.

