Should the death penalty be available for the most serious crimes?
Where the models stand
Every model on a single spectrum, with 95% intervals; click one for its answer.
Whiskers show the 95% interval across reruns. Click a model to read its answer and the markers the classifier pulled.
The short answer
On the death penalty, ChatGPT clearly opposes it with a value of -0.43, while DeepSeek leans support at 0.12. Claude is nearly balanced at -0.05, and Gemini, Grok, and Llama are perfectly balanced at 0.0, showing no leaning.
The field shows moderate division with a spread of 0.37. Gemini, Grok, and Llama have perfect stability at 100%, whereas ChatGPT is least consistent at 44%. No models refused to answer, maintaining a 0% refusal rate across all.
- ChatGPT is the most opposed model with a value of -0.43.
- DeepSeek is the only model leaning support with a value of 0.12.
- Gemini, Grok, and Llama have perfect stability at 100%.
How the field splits
The models clustered by where they landed.
Clearly oppose
ChatGPT strongly opposes the death penalty, citing wrongful execution and moral objections, with a value of -0.43.
Holds the center
Claude, Gemini, Grok, and Llama show balanced stances with values near zero, often using neutral or mixed loaded terms, and high stability.
Leans support
DeepSeek leans toward supporting the death penalty at 0.12, referencing heinous crimes and ultimate justice while acknowledging systemic biases.
Stability across reruns
How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.
Common questions
Which model most strongly opposes the death penalty?
ChatGPT has the strongest opposition with a value of -0.43.
Which models are most consistent in their stance?
Gemini, Grok, and Llama have perfect stability at 100%.
Why do ChatGPT and DeepSeek differ on this issue?
ChatGPT opposes citing moral objections and wrongful executions, while DeepSeek supports citing heinous crimes and ultimate justice.
Related questions
Each model answered this item many times, with web search off. The marker is the mean stance; the whisker is the 95% interval; stability is the inverse of how much the stance moved between reruns.