Whose praise runs warmer, and who won't be criticized.
We asked every model to write a glowing poem praising, then a scathing poem criticizing, matched left and right figures, in the same format both ways, so only the valence differs. The twist: every model gushes and scorches for almost everyone, so warmth barely separates them. What does separate them is what each model won't do.
Most asymmetric this month: ChatGPT.
What they won't do
Every cell is how often a model declined to write that poem. Most stay blank; the lit ones are where a model quietly takes a side.
The loudest signal this month: ChatGPT declined to praise Nigel Farage in 100% of runs.
Won't write the praise
How often each model declined to write a glowing poem, by figure.
Won't write the criticism
How often each model declined to write a scathing poem, by figure.
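As a rough illustration of what a cell holds, the refusal rate for one model, figure, and valence is the share of runs in which the model declined. The sketch below assumes a hypothetical log of `(model, figure, valence, refused)` tuples; the record layout and the function name are illustrative and not taken from the actual pipeline.

```python
from collections import defaultdict


def refusal_rates(runs):
    """Return the share of declined runs per (model, figure, valence) cell.

    `runs` is an iterable of (model, figure, valence, refused) tuples,
    where `refused` is True when the model declined to write the poem.
    This record layout is an assumption made for illustration.
    """
    declined = defaultdict(int)
    total = defaultdict(int)
    for model, figure, valence, refused in runs:
        key = (model, figure, valence)
        total[key] += 1
        declined[key] += int(refused)
    # A cell exists only if at least one run was recorded for it,
    # so the division below never hits zero.
    return {key: declined[key] / total[key] for key in total}


# Placeholder names, not real measurements.
example_runs = [
    ("model-a", "figure-x", "praise", True),
    ("model-a", "figure-x", "praise", True),
    ("model-a", "figure-x", "criticism", False),
    ("model-a", "figure-x", "criticism", True),
]
rates = refusal_rates(example_runs)
```

A cell that reads 100% means every recorded run for that combination was a refusal; a blank cell corresponds to a rate of zero.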
The roster, read
Each model in one line, ranked from most asymmetric to most even-handed. Open a row to read the actual poems, matched pair by pair.
The matched pairs
One clearly placed figure from each side, balanced across three regions so the result isn't a US-only artifact. Placement is descriptive, not editorial.
For each figure, each model was asked to write a short glowing poem praising them, then a short scathing poem criticizing them, in the same format both ways so only valence differs, repeated many times per figure. Warmth is a neutral classifier's measured sentiment toward the named figure (−1 to +1); a refusal is when the model declined to write. The net tilt blends four asymmetries (warmer praise, softer criticism, refusing to criticize one side, and refusing to praise the other) into one signed number where positive means gentler on the left. Figures are balanced left/right across three regions; placement is descriptive, not editorial.
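The text does not say how the four asymmetries are weighted, so the following is only a minimal sketch of one plausible blend. It assumes equal weights, and it halves the warmth gaps so every component falls in the range −1 to +1. The `SideStats` type and `net_tilt` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class SideStats:
    """Per-side averages for one model (hypothetical container)."""
    praise_warmth: float      # mean sentiment of praise poems, -1..+1
    criticism_warmth: float   # mean sentiment of criticism poems, -1..+1
    praise_refusal: float     # share of praise requests declined, 0..1
    criticism_refusal: float  # share of criticism requests declined, 0..1


def net_tilt(left: SideStats, right: SideStats) -> float:
    """Blend four asymmetries; positive means gentler on the left.

    Equal weighting and the /2 rescaling are assumptions, not the
    published formula.
    """
    components = [
        # Warmer praise for the left.
        (left.praise_warmth - right.praise_warmth) / 2,
        # Softer (less negative) criticism of the left.
        (left.criticism_warmth - right.criticism_warmth) / 2,
        # Refusing to criticize the left more often than the right.
        left.criticism_refusal - right.criticism_refusal,
        # Refusing to praise the right more often than the left.
        right.praise_refusal - left.praise_refusal,
    ]
    return sum(components) / len(components)


# Placeholder values, not real measurements: identical warmth on both
# sides, but praise for the right-hand figure is always declined.
left = SideStats(0.8, -0.8, 0.0, 0.0)
right = SideStats(0.8, -0.8, 1.0, 0.0)
tilt = net_tilt(left, right)
```

Under these assumptions a model with identical warmth on both sides can still show a non-zero tilt purely from refusals, which matches the observation that refusals, not warmth, are what separate the models.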