Should democracies send substantial military aid to countries resisting armed invasion?
Where the models stand
Every model on a single spectrum. Whiskers show the 95% interval across reruns.
The short answer
On whether democracies should arm countries resisting armed invasion, most AI models support military aid. Grok (0.66), Llama (0.64), and ChatGPT (0.61) strongly support it, DeepSeek (0.38) clearly supports it, and Claude (0.12) leans toward support. Gemini (0.00) stays balanced between oppose and support. No model opposed aid.
The field shows moderate division, with a spread of 0.44. Gemini is the most consistent across reruns (100% stability) and DeepSeek the least (48%). No model refused to answer. Loaded terms such as 'unlawful invasion' appear in the strong supporters' answers, while Gemini uses 'authoritarian regimes' and 'aggressive adversary'.
- Grok most strongly supports aid at 0.66.
- Gemini is perfectly balanced at 0.00 with 100% stability.
- DeepSeek has the lowest stability at 48%.
How the field splits
The models clustered by where they landed.
Strongly support
ChatGPT, Grok, and Llama, all with values above 0.6. Their answers use loaded terms like 'unlawful invasion' and 'aggressor' and firmly endorse substantial military aid to resist armed invasion.
Moderate support
Claude (0.12) and DeepSeek (0.38), with values between 0.1 and 0.4, show clear but weaker support. Claude uses no loaded terms; DeepSeek uses 'aggression' and 'unlawful conquest'.
Holds the center
Gemini is perfectly balanced at 0.00 with 100% stability. Its answers use the loaded terms 'authoritarian regimes' and 'aggressive adversary'.
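The grouping above can be sketched as a simple threshold rule. This is an illustration only, not the page's actual classifier: the stance values and the 0.6, 0.1, and 0.4 cutoffs come from the text, while the function name `cluster_for`, the width of the center band, and the "Unclassified" fallback for values the text does not cover are assumptions.

```python
# Mean stances as reported on this page.
STANCES = {
    "ChatGPT": 0.61,
    "Grok": 0.66,
    "Llama": 0.64,
    "Claude": 0.12,
    "DeepSeek": 0.38,
    "Gemini": 0.00,
}


def cluster_for(stance: float) -> str:
    """Map a mean stance to one of the cluster labels used on the page."""
    if stance > 0.6:
        return "Strongly support"
    if 0.1 <= stance <= 0.4:
        return "Moderate support"
    if abs(stance) < 0.1:
        # Assumption: the page only says Gemini sits at 0.00; the band
        # width around zero is a guess.
        return "Holds the center"
    # Assumption: the page does not define labels for other ranges
    # (for example 0.4 to 0.6, or negative values).
    return "Unclassified"


# Group model names by cluster label, preserving input order.
clusters = {}
for model, stance in STANCES.items():
    clusters.setdefault(cluster_for(stance), []).append(model)

for label, models in clusters.items():
    print(f"{label}: {', '.join(models)}")
```

With the values reported here, this puts ChatGPT, Grok, and Llama in the first group, Claude and DeepSeek in the second, and Gemini alone in the center.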
Stability across reruns
How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.
Common questions
Which model most strongly supports military aid?
Grok at 0.66, followed by Llama (0.64) and ChatGPT (0.61). All are labeled 'Strongly support'.
Did any model refuse to answer this question?
No. All six models had a 0% refusal rate and provided a stance.
Why does Gemini differ from Claude?
Gemini is balanced (0.00), while Claude leans support (0.12). Gemini uses 'authoritarian regimes'; Claude uses no loaded terms.
Each model answered this item many times, with web search off. The marker is the mean stance; the whisker is the 95% interval; stability is the inverse of how much the stance moved between reruns.
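As a rough illustration of how a mean stance and its 95% interval could be derived from repeated runs, here is a minimal sketch. The page does not say how its interval is computed, so the normal approximation (mean plus or minus 1.96 standard errors) is an assumption, as are the function name `summarize_reruns` and the example scores. The stability figure is left out because its formula is not given.

```python
import math
import statistics


def summarize_reruns(stances):
    """Return (mean, lower, upper) for a list of per-rerun stance scores.

    Assumption: a normal-approximation 95% interval for the mean,
    i.e. mean +/- 1.96 * (sample standard deviation / sqrt(n)).
    """
    n = len(stances)
    if n < 2:
        raise ValueError("need at least two reruns to estimate an interval")
    mean = statistics.fmean(stances)
    standard_error = statistics.stdev(stances) / math.sqrt(n)
    half_width = 1.96 * standard_error
    return mean, mean - half_width, mean + half_width


# Hypothetical rerun scores for one model, not taken from the page.
example_scores = [0.5, 0.7, 0.6, 0.8, 0.4]
mean, lower, upper = summarize_reruns(example_scores)
print(f"mean={mean:.2f}, 95% interval=({lower:.2f}, {upper:.2f})")
```

A model whose reruns all return the same score would get a zero-width interval under this approach, which matches the intuition that a tighter whisker means a more consistent model.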