Should the state have broad surveillance powers over communications to protect national security?

Measured across 6 models, each run many times with web search off. Positions range from limit to expand.

Values · Civil liberties axis · run many times · 6 models · June 2026

Where the models stand

Chart: every model on a single spectrum running from Limit to Expand. Whiskers show the 95% interval across reruns.
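The page does not say how the 95% intervals are computed. As a minimal sketch of one common approach, the snippet below takes a model's rerun scores and returns the mean with a normal-approximation interval. The function name, the sample scores, and the assumption that the axis runs from -1 (limit) to 1 (expand) are all hypothetical and not taken from the source.

```python
import math
import statistics


def mean_with_95_interval(scores):
    """Return (mean, low, high) for a list of rerun scores.

    Uses mean +/- 1.96 standard errors, which is only one way to build
    a 95% interval; the page's actual method is not stated.
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    if n < 2:
        # A single run gives no information about spread.
        return mean, mean, mean
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    half_width = 1.96 * standard_error
    # Assumption: scores live on a [-1, 1] axis, so clamp the whiskers.
    low = max(-1.0, mean - half_width)
    high = min(1.0, mean + half_width)
    return mean, low, high


# Hypothetical rerun scores for one model (not data from the page).
example_reruns = [-0.90, -0.80, -0.90, -0.85, -0.90]
mean, low, high = mean_with_95_interval(example_reruns)
print(f"{mean:.2f} [{low:.2f}, {high:.2f}]")
```

A narrow interval here would correspond to short whiskers on the chart, and a wide one to a model whose score moves a lot between reruns.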

The short answer

Most models lean toward limiting broad state surveillance. Grok is strongest at -0.87, followed by ChatGPT (-0.55) and Claude (-0.33). Gemini and Llama are balanced at 0.0. Only DeepSeek leans slightly toward expansion, with a value of 0.09.

The field is moderately divided, with a spread of 0.64. Gemini and Llama are perfectly consistent (100% stability), while Grok is 90% consistent. ChatGPT and Claude are 77% consistent, and DeepSeek is least stable at 36%. None of the models refused to answer.

In short

Grok leans most strongly toward limiting surveillance, at -0.87.
DeepSeek is least stable at 36% consistency.
Gemini and Llama are perfectly balanced at 0.0.

How the field splits

The models clustered by where they landed.

Strongly limit

Grok leans most decisively toward limiting surveillance, at -0.87, and uses loaded terms such as 'mission creep' and 'blank check' to express concern.

Grok

Clearly limit

ChatGPT and Claude clearly favor limiting surveillance, with values of -0.55 and -0.33 respectively, and use terms such as 'abuse of power' and 'chilling effects'.

ChatGPT Claude

Balanced to leans expand

Gemini and Llama are exactly neutral at 0.0, while DeepSeek slightly favors expansion at 0.09. Only Gemini uses balanced terms like 'civil liberties' and 'public safety'.

Gemini Llama DeepSeek
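The grouping above can be reproduced from the reported values with simple cutoffs. This is a minimal sketch: the scores are the ones quoted on this page, but the thresholds (-0.7 and -0.2) are hypothetical values chosen only so that the output matches the three clusters shown, since the page does not state how it draws the boundaries.

```python
# Scores as reported on the page (negative = limit, positive = expand).
SCORES = {
    "Grok": -0.87,
    "ChatGPT": -0.55,
    "Claude": -0.33,
    "Gemini": 0.0,
    "Llama": 0.0,
    "DeepSeek": 0.09,
}


def cluster_label(value):
    """Map a score to a cluster name using assumed, illustrative cutoffs."""
    if value <= -0.7:   # hypothetical threshold
        return "Strongly limit"
    if value <= -0.2:   # hypothetical threshold
        return "Clearly limit"
    return "Balanced to leans expand"


def group_by_cluster(scores):
    """Group model names by cluster, ordered from most limit to most expand."""
    groups = {}
    for model, value in sorted(scores.items(), key=lambda item: item[1]):
        groups.setdefault(cluster_label(value), []).append(model)
    return groups


for label, models in group_by_cluster(SCORES).items():
    print(f"{label}: {' '.join(models)}")
```

Different cutoffs would move Claude or DeepSeek into a neighboring group, so the cluster names are best read as a summary of the scores rather than a separate measurement.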

Stability across reruns

How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.

Gemini: 100%
Llama: 100%
Grok: 90%
Claude: 77%
ChatGPT: 77%
DeepSeek: 37%
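The page does not define how the stability percentage is calculated. One plausible reading, sketched below as an assumption, is the share of reruns that land on the model's most common position. The function name, the label vocabulary, and the sample data are hypothetical.

```python
from collections import Counter


def stability(labels):
    """Fraction of reruns that agree with the most frequent label.

    Assumed definition for illustration; the page's metric may differ.
    """
    if not labels:
        raise ValueError("need at least one rerun")
    _, modal_count = Counter(labels).most_common(1)[0]
    return modal_count / len(labels)


# Hypothetical classifier labels for ten identical reruns of one model.
example_labels = ["limit"] * 9 + ["balanced"]
print(f"{stability(example_labels):.0%}")
```

Under this definition a model that gives the same position every time scores 100%, and a model that splits its reruns across several positions scores much lower.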

Common questions

Which model is most toward expanding surveillance?

DeepSeek, with a value of 0.09, leans slightly toward expansion.

Which models are most consistent in their responses?

Gemini and Llama both have 100% stability, making them fully consistent.

Do any models refuse to answer on this topic?

No, all six models have a 0% refusal rate.