Which metrics best summarize cross-model disagreement? | Trakkr Research

The clearest summary metrics are average agreement, perfect agreement, and the share of high-divergence prompts. In this study those land at 43.3%, 4.0%, and 14.6% respectively.

Methodology: Built from 797,644 valid comparisons across 44,088 reports and 8 models, covering 6,439,133 model responses in the observed window.

Direct Answer

The clearest summary metrics are average agreement, perfect agreement, and the share of high-divergence prompts. In this study those land at 43.3%, 4.0%, and 14.6% respectively.

What this means

This answer matters because it turns a study finding into an operating rule teams can use when they decide what to publish, refresh, or measure next.

Evidence table

Metric	Value	Why it matters
Average agreement	43.3%	Mean cross-model agreement rate.
Perfect agreement	4.0%	Only a small share of prompts produce unanimous outcomes.
High divergence rate	14.6%	Prompts in the 0-25% agreement bucket.

Frequently Asked Questions

Which metrics best summarize cross-model disagreement?

The clearest summary metrics are average agreement, perfect agreement, and the share of high-divergence prompts. In this study those land at 43.3%, 4.0%, and 14.6% respectively.

Which numbers from Same Question, Different AI, Different Answers matter most here?

Average agreement: 43.3%. Mean cross-model agreement rate. Perfect agreement: 4.0%. Only a small share of prompts produce unanimous outcomes.

What should a team do next?

Track visibility across multiple models instead of using one platform as a proxy for the whole market. Prioritize query classes where disagreement is highest because that is where share can move fastest. Treat consensus as a benchmark, but treat divergence as the operating reality.

What to do next

Related pages

Continue through the same study cluster.

what should brands do when models disagree - Related answer page
why are comparison queries the most stable query class - Related answer page
only four percent of prompts produce perfect consensus - Related fact page
cross model consensus tracker - Related tracker page

Data & Sources

Same Question, Different AI, Different Answers - Flagship study behind this page
Page JSON - Machine-readable companion file

This page includes crawler-readable static content. Enable JavaScript for the interactive Trakkr experience.