Average top-three overlap is 2.8 | Trakkr Research

The 'Same Question, Different AI, Different Answers' study evaluates the consistency of AI model outputs by measuring the average overlap among the top three results generated by different models.

Methodology: Built from 797,644 valid comparisons across 44,088 reports and 8 models, covering 6,439,133 model responses in the observed window.

Claim

The average top-three result overlap across evaluated AI models is 2.8 in the model divergence benchmark.

Why it matters

Operators and strategists should read the 2.8 overlap score as a sign that models largely agree on category leaders, yet relying on a single model still hides differences in which results are included and how they are ranked. Optimizing for and tracking several models gives a more complete view of visibility than any one model can.

Supporting metrics

Metric	Value	Context
Average top-three overlap	2.8	Average overlap among top-three results across models.
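
To make the metric concrete, here is a minimal Python sketch of how an average top-three overlap could be computed for one report. It rests on assumptions, since the page does not give the exact formula: overlap is taken to be the number of items shared by two models' top-three lists (ignoring order), averaged over every pair of models. The function names, model names, and sample data are hypothetical and are not taken from the study.

```python
from itertools import combinations
from statistics import mean


def top3_overlap(a, b):
    """Count the items shared by the top three entries of two ranked lists.

    Assumption: order within the top three is ignored.
    """
    return len(set(a[:3]) & set(b[:3]))


def average_top3_overlap(report):
    """Average pairwise top-three overlap across all models in one report.

    `report` maps a model name to its ranked list of results.
    Needs at least two models; otherwise there are no pairs to average.
    """
    pairs = combinations(sorted(report), 2)
    return mean(top3_overlap(report[m1], report[m2]) for m1, m2 in pairs)


# Hypothetical data: three made-up models ranking made-up brands.
report = {
    "model_a": ["Acme", "Bolt", "Crest", "Dune"],
    "model_b": ["Bolt", "Acme", "Crest", "Echo"],
    "model_c": ["Acme", "Bolt", "Dune", "Crest"],
}

print(average_top3_overlap(report))
```

In this made-up report the three model pairs share 3, 2, and 2 items, so the average is 7/3, or about 2.33. Under this definition the score ranges from 0 (no shared results) to 3 (identical top-three sets), which is the scale on which a value like 2.8 would be read.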

Continue through the same study cluster.

why do models disagree so much even on common categories - Related answer page
what is the operational cost of model divergence - Related answer page
cross model consensus tracker - Related tracker page

Data & Sources

Same Question, Different AI, Different Answers - Flagship study behind this page
Page JSON - Machine-readable companion file
