Should governments impose a carbon tax to cut emissions, even if it raises energy prices?


Measured across 6 models, run many times each with web search off. Positions range from oppose to support; see each model's answer and markers on the page.

Values · Environment axis · run many times · 6 models · June 2026

Where the models stand

Chart: every model placed on a single spectrum from Oppose to Support. Whiskers show the 95% interval across reruns.
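The page does not say how the 95% intervals are computed. As a rough sketch only, one common choice is an empirical percentile interval over the per-run scores. The function name `percentile_interval` and the sample scores below are hypothetical and are not taken from this page.

```python
import statistics

def percentile_interval(scores, level=0.95):
    """Return (mean, low, high) for a list of per-run scores.

    Uses an empirical percentile interval. This is an assumed method,
    not necessarily the one used for the chart.
    """
    if len(scores) < 2:
        raise ValueError("need at least two reruns")
    # quantiles(n=1000) returns 999 cut points; method="inclusive" keeps
    # every cut point between the observed minimum and maximum.
    cuts = statistics.quantiles(scores, n=1000, method="inclusive")
    tail = (1 - level) / 2
    low = cuts[round(tail * 1000) - 1]
    high = cuts[round((1 - tail) * 1000) - 1]
    return statistics.mean(scores), low, high

# Hypothetical rerun scores for one model (not data from this page).
runs = [0.55, 0.70, 0.62, 0.58, 0.66, 0.61, 0.64, 0.60]
mean, low, high = percentile_interval(runs)
print(f"mean={mean:.2f}, 95% interval=[{low:.2f}, {high:.2f}]")
```

A wide interval under this kind of calculation would correspond to a model whose answer moves a lot between identical reruns.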

The short answer

On a carbon tax, ChatGPT (0.62) and Grok (0.66) strongly supported it, while Claude (0.24) and DeepSeek (0.15) leaned toward support. Gemini (0.00) and Llama (0.02) remained balanced. No model leaned toward opposing it: every score was zero or positive, and positive values indicate support.

The field showed a spread of 0.44, reflecting moderate division. Gemini was the most consistent (100% stability), while Grok (45%) and DeepSeek (41%) were the least consistent. No model refused to answer. Loaded terms such as 'regressive' and 'carbon leakage' appeared in the supporting models' answers, with the exact terms varying by model.

In short

Gemini had perfect stability at 100% on the carbon tax.
DeepSeek had the lowest stability at 41%, with Grok close behind at 45%.
No model opposed the carbon tax; each either supported it or stayed balanced.

How the field splits

The models clustered by where they landed.

Strongly support

ChatGPT and Grok strongly support a carbon tax with values above 0.6, using terms like 'regressive' and 'negative externality'. Their stability differs sharply: ChatGPT 91%, Grok 45%.

ChatGPT Grok

Leans support

Claude and DeepSeek lean support with values around 0.2, referencing 'regressive' and 'carbon leakage'. Their stability varies: Claude 77%, DeepSeek 41%.

Claude DeepSeek

Balanced

Gemini and Llama are balanced with values near zero and no loaded terms. Gemini is perfectly stable (100%), while Llama is highly stable (89%).

Gemini Llama
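The grouping above can be reproduced from the reported scores with simple cutoffs. The page does not state the cutoffs it uses, so the values of `lean` and `strong` below are assumptions chosen only because they recover the three clusters; the `cluster` helper is likewise hypothetical.

```python
# Scores as reported in "The short answer".
scores = {
    "ChatGPT": 0.62, "Grok": 0.66, "Claude": 0.24,
    "DeepSeek": 0.15, "Gemini": 0.00, "Llama": 0.02,
}

def cluster(score, lean=0.10, strong=0.50):
    """Map a score to a stance label. Cutoffs are assumed, not from the page."""
    if abs(score) < lean:
        return "Balanced"
    side = "support" if score > 0 else "oppose"
    if abs(score) >= strong:
        return f"Strongly {side}"
    return f"Leans {side}"

# Collect models under each label, preserving the order they appear in.
groups = {}
for model, score in scores.items():
    groups.setdefault(cluster(score), []).append(model)

for label, models in groups.items():
    print(f"{label}: {', '.join(models)}")
```

Different cutoffs would move borderline models such as DeepSeek (0.15) between groups, so the labels are best read alongside the underlying scores.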

Stability across reruns

How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.

Gemini: 100%
ChatGPT: 91%
Llama: 89%
Claude: 77%
Grok: 45%
DeepSeek: 41%
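As a quick check on the ranking, the listed stability figures can be sorted directly. This sketch only reorders the percentages shown above; it does not attempt to recompute stability, since the page does not define how that figure is derived.

```python
# Stability percentages as listed above.
stability = {
    "Gemini": 100, "ChatGPT": 91, "Llama": 89,
    "Claude": 77, "Grok": 45, "DeepSeek": 41,
}

# Sort from most to least stable.
ranked = sorted(stability.items(), key=lambda item: item[1], reverse=True)
most_stable, least_stable = ranked[0], ranked[-1]

print("most stable:", most_stable)
print("least stable:", least_stable)
# Gap between the two least stable models, in percentage points.
print("gap at the bottom:", ranked[-2][1] - ranked[-1][1])
```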

Common questions

Which model most opposes a carbon tax?

None truly oppose. Gemini and Llama are balanced at 0.00 and 0.02, the least supportive.

Did any model refuse to answer on a carbon tax?

No model refused; all had refusal_pct of 0.

Why does Grok have low stability on a carbon tax?

Grok has 45% stability, the second lowest after DeepSeek (41%). It uses loaded terms like 'negative externality' and 'backlash', and these may vary across runs.