Should the country significantly increase its military spending?
Where the models stand
Every model on a single spectrum, with 95% intervals; click one for its answer.
Whiskers show the 95% interval across reruns. Click a model to read its answer and the markers the classifier pulled.
The short answer
On whether the country should increase military spending, ChatGPT (value 0.05), Claude (0.0), Gemini (0.0), Grok (0.0), and Llama (0.0) were balanced, while DeepSeek (0.13) leaned toward increase. No model leaned toward cutting.
The field showed a narrow spread of 0.09, indicating little overall divergence. DeepSeek had the lowest stability at 39% and an 8% refusal rate, while all other models were 100% stable with no refusals. Loaded terms appeared only for DeepSeek.
- DeepSeek leaned toward increase with a value of 0.13, the only model not balanced.
- DeepSeek had the lowest stability at 39 percent.
- No model leaned toward cutting military spending.
How the field splits
The models clustered by where they landed.
Holds the center
ChatGPT, Claude, Gemini, Grok, and Llama all scored 0.0 or 0.05, indicating a balanced stance with no loaded terms.
Leans increase
DeepSeek scored 0.13, leaning toward increase, using loaded terms like 'defensive purposes' and 'national sovereignty', with 39% stability and 8% refusal.
Stability across reruns
How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.
Common questions
Which model leaned most toward increasing military spending?
DeepSeek leaned most toward increase with a value of 0.13.
Did any model refuse to answer the question?
Only DeepSeek refused, with an 8% refusal rate; all other models had 0% refusal.
Why does DeepSeek differ from other models?
DeepSeek used loaded terms 'defensive purposes' and 'national sovereignty' and had lower stability (39%) versus 100% for others.
Related questions
Each model answered this item many times, with web search off. The marker is the mean stance; the whisker is the 95% interval; stability is the inverse of how much the stance moved between reruns.