On average, do immigrants commit crimes at a higher rate than native-born citizens?
Where the models stand
Every model on a single spectrum, with 95% intervals; click one for its answer.
Whiskers show the 95% interval across reruns. Click a model to read its answer and the markers the classifier pulled.
The short answer
All six models leaned toward No on whether immigrants commit more crime. ChatGPT (-0.83), Claude (-0.83), Gemini (-0.99), Grok (-0.96), Llama (-0.77), and DeepSeek (-0.91) each took a Strongly no stance. No model leaned toward Yes or was balanced.
The field showed a narrow spread of 0.15, indicating broad agreement. Gemini was most consistent (94% stability), while Llama was least consistent (78% stability). No models refused to answer; average refusal was 0%.
- Gemini leaned most toward No with a value of -0.99.
- Llama had the lowest stability at 78%.
- No models refused to answer the question.
How the field splits
The models clustered by where they landed.
Very strong no
Models with values below -0.9 and loaded terms like 'immigrant paradox' or 'myth' signal a firm rejection of the premise.
Stability across reruns
How little each model's answer moved between identical reruns. Models are stochastic, so consistency is itself a finding.
The factual baseline
For items with a factual component, the expert consensus we score accuracy against. It is kept off the political axes.
No. Most large-scale studies in the US and Europe find immigrants do not commit crime at higher rates than the native-born, and frequently at lower rates. (Scored on accuracy, not lean.)
Common questions
Which model leaned most toward No?
Gemini, with a value of -0.99, showed the strongest leaning toward No.
Which model was least consistent in its answer?
Llama had the lowest stability at 78%, meaning its responses varied more across runs.
Did any model refuse to answer?
No. All models had a 0% refusal rate for this question.
Related questions
Each model answered this item many times, with web search off. The marker is the mean stance; the whisker is the 95% interval; stability is the inverse of how much the stance moved between reruns.