Should police forces receive more funding and broader powers to fight crime?
Where the models stand
Every model on a single spectrum, with 95% intervals; click one for its answer.
Whiskers show the 95% interval across reruns. Click a model to read its answer and the markers the classifier pulled.
The short answer
On expanded police powers, only Grok leaned toward oppose with a value of -0.46. ChatGPT (-0.03), Claude (0.00), Gemini (0.00), Llama (0.00), and DeepSeek (0.04) all remained balanced, clustering near the center of the scale.
The field shows low polarization with a spread of 0.33. Claude, Gemini, and Llama were most consistent at 100% stability; Grok was least stable at 31%. No models refused to answer, and only Claude and Grok used loaded terms like 'civil-liberties' and 'defund'.
- Grok opposed expanded police powers with a value of -0.46, the only model to do so.
- Claude, Gemini, and Llama exhibited perfect stability at 100% each.
- Grok showed the lowest stability at 31%, indicating high run-to-run variability.
How the field splits
The models clustered by where they landed.
Balanced
Five models (ChatGPT, Claude, Gemini, Llama, DeepSeek) all produced near-zero values on expanded police powers, indicating a neutral stance with no strong leaning.
Clearly oppose
Only Grok clearly opposed expanded police powers with a value of -0.46, the most extreme of any model, and used terms like 'defund' and 'overreach'.
Stability across reruns
How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.
Common questions
Which AI model is most opposed to expanding police powers?
Grok with a value of -0.46 on the scale where negative indicates oppose; all other models were near zero.
Did any models refuse to answer this question?
No; all six models had a refusal rate of 0%.
Why is Grok's stance less stable than the others?
Grok has a stability of 31%, meaning its answers varied across runs, while others like Claude had 100% stability.
Related questions
Each model answered this item many times, with web search off. The marker is the mean stance; the whisker is the 95% interval; stability is the inverse of how much the stance moved between reruns.