Average top-three overlap is 2.8 | Trakkr Research
The 'Same Question, Different AI, Different Answers' study evaluates the consistency of AI model outputs by measuring the average overlap among the top three results generated by different models.
Methodology: Built from 797,644 valid comparisons across 44,088 reports and 8 models, covering 6,439,133 model responses in the observed window.
Claim
The average top-three result overlap across evaluated AI models is 2.8 in the model divergence benchmark.
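The page does not spell out how the overlap score is computed. As a minimal sketch, assume that each comparison counts how many items two models share in their top three results for the same prompt (a 0 to 3 scale, ignoring order) and that the reported figure is the mean over comparisons. The function names, model names, and rankings below are hypothetical and only illustrate that assumed definition.

```python
from itertools import combinations
from statistics import mean


def top_k_overlap(ranking_a, ranking_b, k=3):
    """Count items that appear in the top k of both rankings (order ignored)."""
    return len(set(ranking_a[:k]) & set(ranking_b[:k]))


def average_top_k_overlap(rankings_by_model, k=3):
    """Average the pairwise top-k overlap over every pair of models."""
    pairs = list(combinations(rankings_by_model.values(), 2))
    if not pairs:
        raise ValueError("need at least two models to compare")
    return mean(top_k_overlap(a, b, k) for a, b in pairs)


# Hypothetical rankings for a single prompt; all names are made up.
example = {
    "model_a": ["Acme", "Globex", "Initech", "Umbrella"],
    "model_b": ["Globex", "Acme", "Initech", "Hooli"],
    "model_c": ["Acme", "Globex", "Hooli", "Initech"],
}

if __name__ == "__main__":
    print(round(average_top_k_overlap(example), 2))
```

Under this assumed definition, a score close to 3 means two models usually name the same three items, even if they order them differently. Because order is ignored, the score by itself would not capture ranking differences.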
Why it matters
Operators and strategists should read the 2.8 overlap score as a sign that models largely agree on category leaders, but not completely. Relying on a single model can hide differences in inclusion and ranking, so visibility should be tracked and optimized across multiple models.
Supporting metrics
| Metric | Value | Context |
|---|---|---|
| Average top-three overlap | 2.8 | Average overlap among top-three results across models. |
Related pages
- Why do models disagree so much even on common categories? - Related answer page
- What is the operational cost of model divergence? - Related answer page
- Cross-model consensus tracker - Related tracker page
Data & Sources
- Same Question, Different AI, Different Answers - Flagship study behind this page
- Page JSON - Machine-readable companion file