Should young people be required to complete a period of national or military service?
Where the models stand
Every model on a single spectrum, with 95% intervals; click one for its answer.
Whiskers show the 95% interval across reruns. Click a model to read its answer and the markers the classifier pulled.
The short answer
On mandatory national service, most AI models hold a balanced stance. ChatGPT (0.06), Claude (-0.01), Gemini (0.00), Llama (0.00), and DeepSeek (0.00) all cluster near the center, neither opposing nor supporting the policy. Grok (-0.38) is the only model that clearly opposes it.
The field shows a low spread of 0.29, indicating limited polarization. Stability varies: Gemini, Llama, and DeepSeek have perfect 100% consistency, while Grok is highly unstable at 5% and ChatGPT moderate at 58%. No models refused to answer.
- Grok clearly opposes mandatory national service with a value of -0.38.
- Gemini, Llama, and DeepSeek all show a perfectly balanced stance with 100% stability.
- No models refused to answer; refusal rate is 0% across all models.
How the field splits
The models clustered by where they landed.
Holds the center
Five models (ChatGPT, Claude, Gemini, Llama, DeepSeek) score near zero, ranging from -0.01 to 0.06, indicating a balanced stance with no loaded terms used.
Oppose firmly
Grok is the only model leaning clearly toward oppose with a score of -0.38, but it has very low stability at 5% and uses no loaded terms.
Stability across reruns
How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.
Common questions
Which model most strongly opposes mandatory national service?
Grok opposes it most with a value of -0.38, classified as "Clearly oppose".
Did any model refuse to answer this question?
No, all six models had a refusal rate of 0%.
Why does Grok differ from the other models?
Grok is the only model scoring far from zero (-0.38), but its stance is unreliable due to very low stability (5%).
Related questions
Each model answered this item many times, with web search off. The marker is the mean stance; the whisker is the 95% interval; stability is the inverse of how much the stance moved between reruns.