Trakkr Data

Crawlers

What AI crawlers actually do when they reach a website — who shows up, when, and how deep they read — built from real server logs across the brands Trakkr tracks. Then two live experiments on whether Markdown or llms.txt change how AI reads you.

Updated Feb 1, 2026 · Data window: Jun 11, 2025 – Feb 1, 2026
Crawler visits logged: 576K (identified AI bots)
AI crawlers seen: 12 (distinct user agents)
Brands observed: 84 (across server logs)
Pages crawled: 315K (unique URLs fetched)

The daily crawl rhythm

When GPTBot fetches over a day, in UTC. It peaks at 14:00 and goes quietest around 02:00.

[Chart: GPTBot fetches by hour of day, UTC. x-axis: hour (00–23); y-axis: fetches (10K–16K).]
Source: Trakkr crawler telemetry · hour-of-day, all observed visits · CC BY 4.0
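
An hour-of-day curve like this one can be rebuilt from ordinary access logs. The sketch below is a minimal illustration, not Trakkr's pipeline: it assumes Combined Log Format lines, treats any line containing the substring `GPTBot` as a GPTBot request, and uses synthetic sample lines (documentation IP ranges, made-up user-agent strings). Timestamps are converted to UTC before bucketing, so logs written in a local offset land in the right hour.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Combined Log Format timestamp, e.g. [01/Feb/2026:14:03:22 +0100]
TS_RE = re.compile(r"\[([^\]]+)\]")

def hourly_counts(log_lines, bot_token="GPTBot"):
    """Count requests per UTC hour for lines that mention bot_token."""
    counts = Counter({hour: 0 for hour in range(24)})
    for line in log_lines:
        if bot_token not in line:
            continue
        match = TS_RE.search(line)
        if not match:
            continue
        ts = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        counts[ts.astimezone(timezone.utc).hour] += 1
    return counts

# Synthetic log lines, for illustration only.
sample = [
    '203.0.113.7 - - [01/Feb/2026:15:03:22 +0100] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0; compatible; GPTBot/1.2"',
    '203.0.113.7 - - [01/Feb/2026:14:40:00 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0; compatible; GPTBot/1.2"',
    '198.51.100.9 - - [01/Feb/2026:02:10:00 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0; compatible; ClaudeBot/1.0"',
]
counts = hourly_counts(sample)
peak_hour = max(counts, key=counts.get)
```

In the sample, the first line is logged at 15:03 with a +01:00 offset, so it counts toward the 14:00 UTC bucket alongside the second line.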

Who's crawling

Every AI bot we see in the logs, ranked by share of all crawler visits. Filter by what each one is actually for.

#    Crawler                 Share of visits
1    GPTBot                  57.2%
2    OAI-SearchBot           15.1%
3    Bytespider              9.2%
4    Meta-ExternalFetcher    7.9%
5    Amazonbot               6.7%
6    ClaudeBot               3.8%
7    MistralAI-User          0.1%
8    Claude-Web              <0.1%
9    ChatGPT-User            <0.1%
10   Diffbot                 <0.1%
11   Perplexity-User         <0.1%
12   Claude                  <0.1%

Filters: Training · AI search · Live fetch · Other
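
A share-of-visits ranking of this kind can be approximated by matching crawler names against the User-Agent header. The sketch below assumes simple case-insensitive substring matching on the twelve names listed above; that matching rule is an assumption for illustration, not Trakkr's documented method. Longer names are tried first so that `ClaudeBot` and `Claude-Web` are not counted as plain `Claude`.

```python
from collections import Counter

# Names as they appear in the ranking; substring matching is an assumption.
CRAWLER_TOKENS = [
    "GPTBot", "OAI-SearchBot", "Bytespider", "Meta-ExternalFetcher",
    "Amazonbot", "ClaudeBot", "MistralAI-User", "Claude-Web",
    "ChatGPT-User", "Diffbot", "Perplexity-User", "Claude",
]

def identify(user_agent):
    """Return the matching crawler name, or None for non-AI traffic."""
    lowered = user_agent.lower()
    # Longest token first, so "ClaudeBot" wins over "Claude".
    for token in sorted(CRAWLER_TOKENS, key=len, reverse=True):
        if token.lower() in lowered:
            return token
    return None

def visit_shares(user_agents):
    """Percentage of identified crawler visits per crawler, largest first."""
    hits = Counter(filter(None, map(identify, user_agents)))
    total = sum(hits.values())
    return {name: 100 * n / total for name, n in hits.most_common()}

# Synthetic User-Agent strings, for illustration only.
shares = visit_shares([
    "Mozilla/5.0 (compatible; GPTBot/1.2)",
    "Mozilla/5.0 (compatible; GPTBot/1.2)",
    "ClaudeBot/1.0",
    "Claude/1.0",
    "Mozilla/5.0 Chrome/120.0",
])
```

User-Agent strings can be spoofed, so a production version would likely also verify source IP ranges where the vendor publishes them.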

How each bot reads you

The big three behave nothing alike. Pick a crawler to see how deep it goes, how much it grabs per visit, and how widely it reaches.

Pages per visit: 60.5 (5.4K sessions)
Crawl velocity: 61.4 pages / minute, peak
Brand reach: 70% (59 of 84 sites)
Visits per site: 5.6K (where it shows up)
Click-depth of pages fetched: 67% at depth 3+
D0 3% · D1 10% · D2 20% · D3 52% · D4 12% · D5+ 4%
Weekend vs weekday: 1.29× (29%)
First hit is the homepage: 2.8%
Total visits observed: 330K

GPTBot crawls hard and deep — long sessions, many pages, mostly 2–4 clicks in. The training workhorse.
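
A "pages per visit" figure depends on how requests are grouped into sessions. The sketch below shows one common approach, assuming a session ends after 30 minutes without a request from the same bot on the same site. The 30-minute gap and the function name are hypothetical choices for illustration; the page does not state how Trakkr defines a session.

```python
from datetime import datetime, timedelta

def pages_per_visit(timestamps, gap=timedelta(minutes=30)):
    """Split one bot's request times into sessions; return pages per session."""
    sessions = []
    previous = None
    for ts in sorted(timestamps):
        # Start a new session on the first request or after a long silence.
        if previous is None or ts - previous > gap:
            sessions.append(0)
        sessions[-1] += 1
        previous = ts
    return sessions

# Synthetic request times: three quick fetches, then two more after a 2-hour gap.
base = datetime(2026, 1, 1, 14, 0)
times = [
    base,
    base + timedelta(minutes=1),
    base + timedelta(minutes=2),
    base + timedelta(hours=2),
    base + timedelta(hours=2, minutes=5),
]
session_sizes = pages_per_visit(times)
average_pages = sum(session_sizes) / len(session_sizes)
```

A shorter gap splits long crawls into more, smaller sessions, so the headline number moves with this parameter.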

One and done

How many times a page gets re-crawled after the first visit. Most pages are read exactly once.

88.5% of pages get exactly one visit

1 visit: 88.5%
2 visits: 8.3%
3–5 visits: 2.4%
6–10 visits: 0.4%
10+ visits: 0.3%

The page has to be ready before the crawler arrives — because it usually won't come back. First-crawl quality beats ongoing tweaks.
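
The same breakdown can be computed for a single site by counting fetches per URL and bucketing the counts. This is a minimal sketch with synthetic URLs; the bucket edges mirror the labels above, with the top bucket taken as more than 10 visits, which is an assumption since the published labels overlap at exactly 10.

```python
from collections import Counter

def bucket(visits):
    """Map a per-URL visit count to a bucket label."""
    if visits == 1:
        return "1 visit"
    if visits == 2:
        return "2 visits"
    if visits <= 5:
        return "3-5 visits"
    if visits <= 10:
        return "6-10 visits"
    return "10+ visits"

def revisit_distribution(fetched_urls):
    """Percentage of unique URLs falling into each visit-count bucket."""
    visits_per_url = Counter(fetched_urls)
    buckets = Counter(bucket(n) for n in visits_per_url.values())
    total = len(visits_per_url)
    return {label: 100 * count / total for label, count in buckets.items()}

# Synthetic crawl: /a and /d fetched once, /b twice, /c three times.
distribution = revisit_distribution(["/a", "/b", "/b", "/c", "/c", "/c", "/d"])
```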

How many bots reach you

Of the three biggest crawlers, how many a typical site actually sees.

47% are reached by all three

All three: 47% (35 sites)
Two of three: 33% (25 sites)
Just one: 20% (15 sites)

Being visible to AI isn't one bot's job. Nearly half of the sites that get found are crawled by OpenAI, Anthropic and the search bot alike: the “triple crown”.
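
To check where a given set of sites falls, it is enough to intersect each site's observed bots with the three of interest. The sketch below assumes "the three biggest crawlers" means GPTBot, ClaudeBot and OAI-SearchBot, based on the wording "OpenAI, Anthropic and the search bot"; the page does not name them explicitly. Site names and bot sets are synthetic.

```python
from collections import Counter

# Assumed membership of "the big three"; not stated explicitly in the source.
BIG_THREE = frozenset({"GPTBot", "ClaudeBot", "OAI-SearchBot"})

def reach_breakdown(bots_by_site):
    """Count sites reached by exactly three, two, or one of the big three."""
    tally = Counter(len(BIG_THREE & set(bots)) for bots in bots_by_site.values())
    return {n: tally.get(n, 0) for n in (3, 2, 1)}

# Synthetic observations, for illustration only.
breakdown = reach_breakdown({
    "site-a.example": {"GPTBot", "ClaudeBot", "OAI-SearchBot", "Bytespider"},
    "site-b.example": {"GPTBot", "OAI-SearchBot"},
    "site-c.example": {"GPTBot"},
    "site-d.example": {"GPTBot", "ClaudeBot", "OAI-SearchBot"},
})
```

The published counts (35, 25 and 15) sum to 75 sites, and 35 of 75 is about 47%.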

Live experiment · Updated Mondays at 09:00 UTC

Does serving Markdown change how AI reads you?

We serve half of our eligible pages as clean Markdown and half as HTML, randomly, then watch which version each bot fetches.

79% smaller

The median Markdown page is 79% lighter than its HTML twin — about 33% fewer bytes shipped to bots overall.

The experiment, in numbers
Eligible pages: 9.0K
Split (md / html): 50% / 50%
Crawler events: 57K
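
One way to get a stable 50/50 split is to hash each URL and use the result to pick a format, so a page always serves the same variant across requests. The source says only that pages are assigned randomly; hash-based assignment is a swapped-in technique for illustration, and the function name and salt below are hypothetical.

```python
import hashlib

def assigned_format(url, salt="md-vs-html"):
    """Deterministically assign a URL to the Markdown or HTML arm."""
    digest = hashlib.sha256(f"{salt}:{url}".encode("utf-8")).digest()
    # The first byte of a SHA-256 digest is even about half the time.
    return "markdown" if digest[0] % 2 == 0 else "html"

# Synthetic URLs: the split should land close to 50/50.
urls = [f"/articles/{i}" for i in range(1000)]
markdown_count = sum(assigned_format(u) == "markdown" for u in urls)
```

Changing the salt reshuffles the assignment, which is useful when starting a fresh experiment on the same pages.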

Markdown vs HTML, by bot

How much more of each version a bot fetched, in percentage points (pp). Negative = it preferred HTML, positive = Markdown.

Bot              Markdown vs HTML
GPTBot           -59.1pp
ClaudeBot        -5.4pp
PerplexityBot    -0.3pp
ChatGPT-User     +0.7pp
OAI-SearchBot    +2.1pp
The read: serving Markdown won't get you crawled more. The training crawlers (GPTBot especially) still reach hard for HTML, while the search and live-fetch bots that actually feed answers are format-neutral — so you ship far fewer bytes with no loss where it counts.
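
A percentage-point gap like the ones above can be computed from per-bot fetch counts. The sketch below assumes the metric is the Markdown share of a bot's fetches minus the HTML share; the page does not spell out its exact formula, so treat this as one plausible reading. The counts are toy numbers.

```python
def format_preference_pp(markdown_fetches, html_fetches):
    """Markdown share minus HTML share of fetches, in percentage points."""
    total = markdown_fetches + html_fetches
    if total == 0:
        return 0.0
    return 100 * (markdown_fetches - html_fetches) / total

# Toy counts: 40 Markdown fetches against 60 HTML fetches.
leans_html = format_preference_pp(40, 60)
neutral = format_preference_pp(50, 50)
```

Under this reading, a bot that ignores format sits near 0pp because the pages themselves are split 50/50.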
Open study · 38K domains scanned

Does llms.txt earn you more citations?

llms.txt hands AI a clean map of your site. We matched adoption against citations across tens of thousands of domains — and the short answer is the honest one.

Not yet

13.3% of scanned domains publish one, but pages with llms.txt and pages without earn the same median citations — no measurable lift (p=0.85). Today it's good hygiene, not a ranking lever.
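
A p-value for "same median citations" can be obtained in several ways, and the page does not say which test produced p=0.85. As a hedged illustration, the sketch below runs a two-sided permutation test on the difference in medians: it shuffles the group labels many times and counts how often the shuffled gap is at least as large as the observed one. The function name, round count and inputs are all hypothetical.

```python
import random
from statistics import median

def permutation_p_value(with_file, without_file, rounds=2000, seed=0):
    """Two-sided permutation test on the difference in medians."""
    rng = random.Random(seed)
    observed = abs(median(with_file) - median(without_file))
    pooled = list(with_file) + list(without_file)
    n = len(with_file)
    at_least_as_extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(median(pooled[:n]) - median(pooled[n:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / rounds

# Toy inputs: identical groups versus clearly separated groups.
p_same = permutation_p_value([1, 2, 3, 4], [1, 2, 3, 4])
p_apart = permutation_p_value([100, 101, 102, 103, 104], [0, 1, 2, 3, 4])
```

A high p-value, as with the identical groups here, means the observed gap is the kind that label shuffling produces routinely.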

Full llms.txt adoption: by tier, category, and who's shipping it

Methodology

Behavior comes from identified AI-crawler requests in the server logs of the brands Trakkr tracks: 576K visits across 84 sites, Jun 11, 2025 – Feb 1, 2026. One dominant e-commerce brand is excluded so that the patterns generalize. The Markdown and llms.txt findings are live, randomized experiments; they measure crawl and retrieval behavior, not whether an answer ultimately cited the page. For what all this crawling actually converts into, see Content.

Trakkr crawler telemetry · Citations · Content · Rankings · CC BY 4.0

Common questions

Which AI crawler is most active?

OpenAI’s GPTBot is the most active by a wide margin, with 57.2% of the 576K visits logged here, followed by OAI-SearchBot at 15.1%. Anthropic’s ClaudeBot ranks sixth at 3.8%. Shares shift over time; the dataset ranks every named AI crawler by visit volume.

Does llms.txt change how AI crawls your site?

Trakkr runs a live experiment on exactly this. So far, publishing an llms.txt file shows no statistically significant lift in AI citations — the dataset tracks the result as more data accumulates.

Do AI crawlers read Markdown differently from HTML?

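That is the Markdown experiment on this page: eligible pages are served to bots as either Markdown or HTML, and fetches of each version are compared. So far, Markdown does not attract more crawling. GPTBot fetched the HTML version far more often (-59.1pp), while the search and live-fetch bots were close to format-neutral. Results update as more data comes in.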