The llms.txt Effect
We HTTP-scanned 37,894 AI-cited domains from a corpus of 102,857. Of those, 5,035 - 13.3% - have llms.txt. The citation advantage? Statistically zero.
The Landscape
Based on 37,894 most-cited domains in AI responses
The Inverse Pattern
The most-cited domains in AI responses don't have llms.txt: among the top 50 most-cited sites, only 6% have adopted the standard. Moving down the citation rankings, adoption actually increases - suggesting that llms.txt is being adopted by sites hoping to improve their visibility, not by the sites that already dominate AI citations.
The Verdict
Medians are identical - both exactly 3.0
Nearly Identical Averages
Across 37,894 domains, sites with llms.txt average 6.8 citations while sites without average 6.7 - a difference so small it is indistinguishable from noise. The Mann-Whitney U test gives p=0.85, about as far from statistical significance as a result can get.
The medians confirm the story: both groups land at exactly 3.0 citations. At the full 38K scale the test does turn technically significant (p<0.001), but only because of sheer sample size: the effect size is r=-0.065, well below the 0.1 threshold for even a "small" effect. Statistical significance without practical significance.
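The distinction between a tiny p-value and a tiny effect size can be reproduced with SciPy. The citation arrays below are synthetic placeholders, not the study's data; the effect size r is derived from the U statistic via the normal approximation, r = Z / sqrt(N).

```python
import numpy as np
from scipy.stats import mannwhitneyu, norm

rng = np.random.default_rng(0)
# Placeholder right-skewed citation counts - NOT the study's data.
with_llms = rng.lognormal(mean=1.0, sigma=1.2, size=5_035).round()
without_llms = rng.lognormal(mean=1.0, sigma=1.2, size=32_859).round()

u, p = mannwhitneyu(with_llms, without_llms, alternative="two-sided")

# Effect size via the normal approximation: recover |Z| from the
# two-sided p-value, then r = Z / sqrt(N). With ~38K samples, even a
# "significant" p can coexist with a negligible r.
n = len(with_llms) + len(without_llms)
z = norm.isf(p / 2)          # |Z| from two-sided p
r = z / np.sqrt(n)
print(f"U={u:.0f}  p={p:.3f}  |r|={r:.3f}")
```

With identical underlying distributions, r lands far below the 0.1 "small effect" threshold regardless of what the p-value does - the same pattern the full-scale test shows.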
Who's Adopting
The Tech Echo Chamber
llms.txt adoption is led by SaaS and developer tools at 24% - the exact community that proposed the standard. Government and academic sites sit at just 1.5%, while review sites and reference wikis are at 0%.
This creates a selection bias problem: the sites most likely to adopt llms.txt are already technically sophisticated, well-structured, and API-friendly - qualities that independently correlate with AI visibility.
The Leaderboard
Most-cited domains that have adopted llms.txt
Most-cited domains that have not adopted llms.txt
Brand Visibility
Composite AI visibility score (0-100)
Mean across all brands analyzed
Based on 205 brands with both audit data and visibility reports
Same Scores, Different File
Using Trakkr's multi-dimensional visibility scoring - which combines presence, rank, mentions, and sentiment across multiple AI models - brands with llms.txt post a median visibility score of 23.15, versus 23.55 for brands without.
This 0.4-point difference is well within noise. Whether you look at raw citation counts or composite visibility metrics, the result is the same: llms.txt is not currently a factor in AI recommendation engines.
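A composite score of this kind can be sketched as a weighted blend of the four dimensions. The dimension names follow the text, but the weights, normalization, and `BrandSignals` structure below are illustrative assumptions, not Trakkr's actual formula.

```python
from dataclasses import dataclass

@dataclass
class BrandSignals:
    presence: float   # share of prompts where the brand appears, 0..1
    rank: float       # average position when present (1 = first)
    mentions: float   # mention volume, normalized 0..1 against cohort max
    sentiment: float  # -1 (negative) .. +1 (positive)

def visibility_score(s: BrandSignals, max_rank: int = 10) -> float:
    """Blend four dimensions into a 0-100 score.
    Weights are illustrative guesses, not Trakkr's real formula."""
    rank_component = max(0.0, (max_rank - s.rank + 1) / max_rank)  # 1st place -> 1.0
    sentiment_component = (s.sentiment + 1) / 2                    # map -1..1 to 0..1
    blended = (0.40 * s.presence
               + 0.25 * rank_component
               + 0.20 * s.mentions
               + 0.15 * sentiment_component)
    return round(100 * blended, 2)

score = visibility_score(BrandSignals(presence=0.3, rank=4, mentions=0.2, sentiment=0.1))
```

The point of a composite like this is that no single dimension dominates - which is also why a 0.4-point median gap is noise rather than signal.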
What This Means
What This Actually Means
The data tells a clear story. Here are the four takeaways that matter for anyone building an AI visibility strategy.
llms.txt is a signal, not a lever
Having llms.txt tells AI systems "we care about being understood by LLMs." But current AI models don't actually read or prioritize llms.txt when generating citations. The standard is early - adoption is a bet on the future, not a present-day advantage.
AI citations are driven by training data
AI models cite sources they encountered during training: authoritative domains, frequently linked content, structured data, and topically relevant pages. A text file at /llms.txt doesn't retroactively change what the model learned.
Don't skip it - just don't expect miracles
llms.txt is low-cost to implement and good practice for structured content. As AI models evolve and potentially start using it during retrieval-augmented generation, early adopters may benefit. The cost of adoption is near zero; the potential future upside is real.
Focus on what actually works
The domains that dominate AI citations share common traits: deep, authoritative content, strong backlink profiles, structured data, consistent publishing, and topical expertise. These fundamentals drive AI visibility today - not technical signals.
Methodology
Aggregated citation data from 882 brand snapshots containing 337K+ citations across 102K+ unique domains
Ranked all domains by total AI citation appearances and selected the 37,894 with 2+ appearances for analysis
Async HTTP checks against /llms.txt with content validation to reject HTML error pages and soft 404s
Mann-Whitney U test comparing citation distributions between adopters and non-adopters (non-parametric, suitable for skewed data)
Cross-referenced with Trakkr visibility reports (205 brands) and website audit data to check for confounding factors
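The adoption check in the pipeline above can be approximated with a small validator. The heuristics here - rejecting HTML bodies and requiring a heading or URLs - are our reading of "content validation," not the exact production rules.

```python
import asyncio
import urllib.request

def looks_like_llms_txt(status: int, body: str) -> bool:
    """Reject soft 404s: HTML error pages and login redirects that
    return 200. A plausible llms.txt is plain text with a heading or URLs."""
    if status != 200:
        return False
    head = body.lstrip()[:200].lower()
    if head.startswith(("<!doctype", "<html")):
        return False  # HTML page served at /llms.txt -> soft 404
    return head.startswith("#") or "http" in body

async def has_llms_txt(domain: str) -> bool:
    """Fetch https://<domain>/llms.txt off the event loop and validate it."""
    url = f"https://{domain}/llms.txt"
    def fetch():
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read(65536).decode("utf-8", "replace")
    try:
        status, body = await asyncio.to_thread(fetch)
    except OSError:
        return False  # DNS failure, timeout, TLS error, HTTP error
    return looks_like_llms_txt(status, body)
```

In a real scan, `has_llms_txt` would be fanned out across thousands of domains with `asyncio.gather` plus a semaphore for concurrency limits.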
Non-parametric test: We used Mann-Whitney U rather than a t-test because citation distributions are heavily right-skewed.
Content validation: HTTP 200 responses were validated to exclude HTML error pages, soft 404s, and login redirects that return 200 status.
Confound check: We verified that llms.txt adopters don't systematically differ in website audit scores, controlling for site quality.
File quality: Among adopters, 89% include a title, 98% contain URLs, and 79% score 4/4 on our content quality rubric. These are well-implemented files - they just don't move citations.
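A rubric behind the "title" and "contains URLs" figures can be approximated like this. The four checks are assumptions about what a 4/4 file looks like, based on the markdown shape the llms.txt proposal describes (H1 title, blockquote summary, H2 link sections).

```python
import re

def quality_rubric(text: str) -> dict:
    """Score an llms.txt body on four illustrative checks (assumed rubric)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    checks = {
        "has_title": any(line.startswith("# ") for line in lines),    # H1 title
        "has_summary": any(line.startswith("> ") for line in lines),  # blockquote summary
        "has_urls": bool(re.search(r"https?://\S+", text)),
        "has_sections": any(line.startswith("## ") for line in lines),  # H2 sections
    }
    checks["score"] = sum(checks.values())  # 0..4
    return checks

sample = "# Acme\n> Widgets for AI.\n## Docs\n- [API](https://acme.dev/api)\n"
print(quality_rubric(sample)["score"])  # 4
```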
Data source: Production citation data from the Trakkr platform, representing real-world AI visibility monitoring across 882 brands.
See how your brand performs in AI search