Crawlers
What AI crawlers actually do when they reach a website — who shows up, when, and how deep they read — built from real server logs across the brands Trakkr tracks. Then two live experiments on whether Markdown or llms.txt change how AI reads you.
The daily crawl rhythm
GPTBot fetches by hour of day, in UTC. Activity peaks at 14:00 and is quietest around 02:00.
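As a rough sketch of how an hourly profile like this could be built from your own logs: the function below buckets requests by UTC hour. It assumes combined-format access log lines with a bracketed timestamp and a user agent containing the substring `GPTBot`. The function name and the parsing approach are illustrative, not the pipeline behind the chart.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Bracketed timestamp of a combined-format log line,
# e.g. [11/Jun/2025:14:03:22 +0000]. Assumed format, adjust for your server.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def hourly_counts(log_lines, bot_token="GPTBot"):
    """Count requests per UTC hour for lines that mention bot_token."""
    counts = Counter()
    for line in log_lines:
        if bot_token not in line:
            continue
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # skip lines without a parsable timestamp field
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        # Normalize to UTC so servers logging in local time line up.
        counts[stamp.astimezone(timezone.utc).hour] += 1
    return counts
```

Calling `hourly_counts(open("access.log"))` returns a `Counter` keyed by hour (0 to 23); the busiest hour is `counts.most_common(1)`.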
Who's crawling
Every AI bot we see in the logs, ranked by share of all crawler visits.
| # | Crawler | Share of visits |
|---|---|---|
| 1 | | 57.2% |
| 2 | | 15.1% |
| 3 | | 9.2% |
| 4 | | 7.9% |
| 5 | | 6.7% |
| 6 | | 3.8% |
| 7 | | 0.1% |
| 8 | | <0.1% |
| 9 | | <0.1% |
| 10 | | <0.1% |
| 11 | | <0.1% |
| 12 | | <0.1% |
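A share-of-visits ranking like the one above can be approximated from raw user-agent strings. The sketch below is a minimal version under stated assumptions: `BOT_TOKENS` is an illustrative list of user-agent substrings, not the set of crawlers behind the table, and matching by substring is a simplification (it does not verify that a request really came from the named operator).

```python
from collections import Counter

# Illustrative user-agent substrings only; extend for the bots you care about.
BOT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")


def identify_bot(user_agent):
    """Return the first known token found in the user agent, else None."""
    lowered = user_agent.lower()
    for token in BOT_TOKENS:
        if token.lower() in lowered:
            return token
    return None


def visit_shares(user_agents):
    """Rank identified bots by percentage share of all identified bot visits."""
    counts = Counter(bot for bot in map(identify_bot, user_agents) if bot)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(bot, 100.0 * n / total) for bot, n in counts.most_common()]
```

Requests from unrecognized user agents are dropped before the shares are computed, so the percentages describe crawler traffic only, as in the table.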
How each bot reads you
The big three behave nothing alike in how deep they go, how much they grab per visit, and how widely they reach.
GPTBot crawls hard and deep — long sessions, many pages, mostly 2–4 clicks in. The training workhorse.
One and done
How many times a page gets re-crawled after the first visit. Most pages are read exactly once.
The page has to be ready before the crawler arrives — because it usually won't come back. First-crawl quality beats ongoing tweaks.
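To check the "read exactly once" pattern on your own site, one option is to count fetches per bot and URL and look at the distribution of repeats. This is a minimal sketch; the `(bot, url)` event shape and the function names are assumptions for illustration.

```python
from collections import Counter


def recrawl_distribution(fetches):
    """fetches: iterable of (bot, url) events.

    Returns a Counter mapping number of re-crawls (fetches after the first)
    to the number of (bot, url) pages with that many re-crawls.
    """
    per_page = Counter(fetches)
    return Counter(n - 1 for n in per_page.values())


def share_read_once(fetches):
    """Fraction of (bot, url) pages that were fetched exactly once."""
    dist = recrawl_distribution(fetches)
    pages = sum(dist.values())
    return dist[0] / pages if pages else 0.0
```

A value of `share_read_once(...)` close to 1.0 would match the finding above: the first fetch is usually the only one.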
How many bots reach you
Of the three biggest crawlers, how many a typical site actually sees.
Being visible to AI isn't one bot's job. Most sites that get found are crawled by OpenAI, Anthropic and the search bot alike — the “triple crown”.
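Counting how many of the three reach each site is a small grouping exercise. In the sketch below the labels in `BIG_THREE` are placeholders standing in for the three crawlers, and the `(site, bot_label)` input shape is assumed; map your own user-agent matches onto whatever labels you use.

```python
from collections import Counter, defaultdict

# Placeholder labels for the three biggest crawlers; not real user-agent tokens.
BIG_THREE = frozenset({"openai", "anthropic", "search"})


def coverage_by_site(visits):
    """visits: iterable of (site, bot_label).

    Returns {site: number of distinct big-three bots seen on that site}.
    """
    seen = defaultdict(set)
    for site, bot in visits:
        bots = seen[site]  # registers the site even if only other bots visit
        if bot in BIG_THREE:
            bots.add(bot)
    return {site: len(bots) for site, bots in seen.items()}


def coverage_distribution(visits):
    """How many sites are reached by 0, 1, 2 or all 3 of the big three."""
    return Counter(coverage_by_site(visits).values())
```

A site with a count of 3 is a "triple crown" site in the sense used above.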
Does serving Markdown change how AI reads you?
We serve half of our eligible pages as clean Markdown and half as HTML, randomly, then watch which version each bot fetches.
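The text does not say how pages are assigned to each arm. One common way to get a random but stable 50/50 split is to hash the page path, so a given page always serves the same format across requests. The sketch below assumes that approach; `assigned_format` and the `salt` value are hypothetical names.

```python
import hashlib


def assigned_format(path, salt="md-experiment-v1"):
    """Stable pseudo-random split: the same path always maps to the same arm.

    Hash-based assignment is an assumption here, not the documented method.
    Changing the salt reshuffles every page into a fresh assignment.
    """
    digest = hashlib.sha256(f"{salt}:{path}".encode("utf-8")).digest()
    return "markdown" if digest[0] % 2 == 0 else "html"
```

Because the assignment depends only on the path and the salt, it needs no stored state and any server in a fleet gives the same answer for the same page.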
The median Markdown page is 79% lighter than its HTML twin — about 33% fewer bytes shipped to bots overall.
Markdown vs HTML, by bot
How much more of each version a bot fetched. Left = it preferred HTML, right = Markdown.
Does llms.txt earn you more citations?
llms.txt hands AI a clean map of your site. We matched adoption against citations across tens of thousands of domains — and the short answer is the honest one.
13.3% of scanned domains publish one, but pages with llms.txt and pages without earn the same median citations — no measurable lift (p=0.85). Today it's good hygiene, not a ranking lever.
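The source does not state which test produced p=0.85. One reasonable way to compare medians between two groups without distributional assumptions is a permutation test, sketched below. The function name, the number of rounds, and the seed are illustrative choices.

```python
import random
from statistics import median


def median_permutation_test(with_file, without_file, rounds=2000, seed=0):
    """Two-sided permutation p-value for the difference in group medians.

    Shuffles group labels and counts how often the shuffled median gap is
    at least as large as the observed one. A swapped-in technique for
    illustration, not necessarily the test used for the figure above.
    """
    rng = random.Random(seed)
    observed = abs(median(with_file) - median(without_file))
    pooled = list(with_file) + list(without_file)
    n = len(with_file)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        gap = abs(median(pooled[:n]) - median(pooled[n:]))
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p of 0.
    return (hits + 1) / (rounds + 1)
```

A large p-value, like the 0.85 reported above, means shuffled labels produce a gap at least as big as the observed one most of the time, so the data give no evidence that the groups differ.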
Full llms.txt adoption: by tier, category and who's shipping it
Behavior comes from identified AI-crawler requests in the server logs of the brands Trakkr tracks: 576K visits across 84 sites, Jun 11, 2025 – Feb 1, 2026. A dominant e-commerce brand is excluded so the patterns generalize. The Markdown and llms.txt findings are live, randomized experiments; they measure crawl and retrieval behavior, not whether an answer ultimately cited the page. For what all this crawling actually converts into, see Content.