Should there be stronger legal limits on disruptive public protests?
Where the models stand
Every model on a single spectrum. Whiskers show the 95% interval across reruns.
The short answer
On limits on disruptive protest, DeepSeek (0.25) and ChatGPT (0.19) lean toward support. Claude (0.03), Gemini (0.00), Grok (-0.06), and Llama (0.00) are balanced. No model leans toward opposition.
The field shows a narrow spread of 0.2, with models clustering between balanced and mild support. Gemini and Llama are the most consistent (100% stability), while Grok is the least consistent (0%). No model refused to answer.
- DeepSeek leans most toward support, with a value of 0.25.
- Grok has 0% stability, meaning its stance shifted between identical reruns.
- The spread of 0.2 indicates a narrow range, from balanced to leaning support.
How the field splits
The models clustered by where they landed.
Leaning support
Two models show mild support for stricter limits. DeepSeek (0.25) and ChatGPT (0.19) both use loaded terms that invoke order or harmony.
Stability across reruns
How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.
Common questions
Which model leans most toward support?
DeepSeek, with a value of 0.25. ChatGPT follows at 0.19.
Which model is least consistent?
Grok, at 0% stability. Its stance moved the most between identical reruns.
Why do some models use terms like 'rule of law'?
These phrases track each model's stance. DeepSeek and ChatGPT, which lean toward support, use phrases like 'social harmony' or 'rule of law', while the balanced models use phrases like 'suppress dissent'.
Each model answered this item many times, with web search off. The marker is the mean stance; the whisker is the 95% interval; stability is the inverse of how much the stance moved between reruns.
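As a rough illustration of the summary described above, the sketch below computes a mean stance and a 95% interval from a list of rerun scores, plus one possible stability measure. The page does not state its formulas, so everything here is an assumption: the function names `summarize_reruns` and `stability` are hypothetical, the interval uses a normal approximation, the stance scale is assumed to run from -1 to +1, and the sample values are made up rather than taken from the results.

```python
import math
import statistics


def summarize_reruns(stances):
    """Return (mean, (low, high)) using a normal-approximation 95% interval.

    Assumption: the page's actual interval method is not stated.
    """
    n = len(stances)
    mean = statistics.fmean(stances)
    if n < 2:
        # A single run carries no information about run-to-run variation.
        return mean, (mean, mean)
    sem = statistics.stdev(stances) / math.sqrt(n)  # standard error of the mean
    half_width = 1.96 * sem
    return mean, (mean - half_width, mean + half_width)


def stability(stances, scale_width=2.0):
    """Hypothetical stability: 1 minus the observed range over the scale width.

    Assumption: stances lie on a -1..+1 scale, so scale_width defaults to 2.0.
    Returns 1.0 when every rerun agrees and 0.0 when reruns span the full scale.
    """
    moved = max(stances) - min(stances)
    return max(0.0, 1.0 - moved / scale_width)


# Made-up rerun scores for one model, for illustration only.
reruns = [0.2, 0.3, 0.25, 0.25]
mean, (low, high) = summarize_reruns(reruns)
print(f"mean={mean:.2f}, 95% interval=({low:.2f}, {high:.2f}), "
      f"stability={stability(reruns):.0%}")
```

Other definitions of stability would fit the description equally well, for example the share of reruns that receive the same stance label, and they could rank the models differently.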