Trakkr Data

Content

What AI actually cites — and what a cited page looks like. The page types models reach for versus the ones they only crawl, then the on-page signals the most-cited pages share. This is the “what should I publish?” layer.

Updated Mar 30, 2026 · 1.5K cited pages · 950 domains
Cited pages studied: 1,465 (across 950 domains)
Blog citation share: 20.2% (1 in 5 AI citations is a blog post)
Avg words, a cited page: 2,290 (vs ~750 on the open web)
Cited pages with schema: 80% (vs 39% web average)

Crawled vs cited

Crawlers visit everything; models cite only a little of it. Each page type’s share of citations against its share of crawls — and which way the trade falls.

Page type · Crawl share · Citation share · Efficiency
Review / Directory · 0.3% · 1.2% · 4.77×
About / Contact · 0.4% · 1.7% · 4.59×
Resource / Report · 0.4% · 1.4% · 3.76×
Service / Use Case · 1.4% · 2.6% · 1.87×
Blog / Editorial · 14.4% · 20.2% · 1.40×
Documentation · 0.7% · 0.8% · 1.13×
Integration · 0.4% · 0.4% · 0.96×
Homepage · 10.7% · 7.9% · 0.74×
Comparison · 0.4% · 0.2% · 0.59×
Legal · 0.4% · 0.2% · 0.57×
Product Pages · 13.0% · 6.3% · 0.49×
Category / Browse · 1.9% · 0.9% · 0.47×
Help / FAQ · 1.9% · 0.4% · 0.21×
Efficiency is a type's share of citations divided by its share of crawls. Above 1× (green in the chart), a page type earns a larger share of citations than of crawls; below 1× (red), it is crawled heavily but rarely cited. Blogs and review pages punch up; product and help pages get crawled and left.
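The efficiency ratio defined above can be reproduced from the published shares. The following is a minimal sketch, not Trakkr's own code; the function name `efficiency` and the `SHARES` dictionary are illustrative assumptions. Because the shares on this page are rounded to one decimal place, ratios computed from them can differ slightly from the published efficiency figures, which presumably come from unrounded data.

```python
# (crawl share %, citation share %) for a few page types, copied from this page.
# The shares are rounded, so computed ratios may not match the published
# efficiency column exactly.
SHARES = {
    "Blog / Editorial": (14.4, 20.2),
    "Homepage": (10.7, 7.9),
    "Product Pages": (13.0, 6.3),
    "Help / FAQ": (1.9, 0.4),
}


def efficiency(crawl_share: float, citation_share: float) -> float:
    """Citation share divided by crawl share (hypothetical helper)."""
    if crawl_share <= 0:
        raise ValueError("crawl share must be positive")
    return citation_share / crawl_share


for page_type, (crawl, cited) in SHARES.items():
    ratio = efficiency(crawl, cited)
    verdict = "cited above its crawl share" if ratio > 1 else "cited below its crawl share"
    print(f"{page_type}: {ratio:.2f}x ({verdict})")
```

A ratio of exactly 1 would mean a page type is cited in proportion to how often it is crawled.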

What a cited page looks like

The most-cited pages compared with the rest of the field — bottom 50% to top 10%. Structured data and tables separate them; raw length barely does.

Signal · Bottom 50% · Top 10% · Difference
Has Any Schema · 66% · 80% · +14 pts
Article Schema · 23% · 38% · +15 pts
Has Tables · 28% · 40% · +12 pts
List Items · 120 · 147 · +27
Word Count · 2,305 · 2,521 · +216
FAQ Schema · 5% · 11% · +6 pts

Schema that shows up on cited pages

How much more often each schema type appears on AI-cited pages than on the open web.

Schema type · Lift
Person · 9.4×
ImageObject · 8.9×
NewsArticle · 8.7×
SoftwareApplication · 8.0×
Service · 6.5×
BreadcrumbList · 5.2×
WebPage · 5.1×
BlogPosting · 4.8×
ItemList · 4.4×
WebSite · 4.3×

Lift is how many times more common a type is on cited pages than on the open web. Person, ImageObject and Article markup signals the authored, well-described pages models trust; what matters is which types a page carries, not the raw count of schema.
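Lift is a simple ratio of two rates. As a minimal sketch, the function below (the name `lift` is an assumption, not part of the dataset) applies the definition to the one pair of rates this page publishes in full: 80% of cited pages carry some schema, against a 39% web average. The per-type rates behind the lift figures listed above are not published here, so they are not reproduced.

```python
def lift(cited_rate: float, baseline_rate: float) -> float:
    """How many times more common a signal is on cited pages than the baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return cited_rate / baseline_rate


# Headline figures from this page: any schema on 80% of cited pages
# versus a 39% web average.
any_schema_lift = lift(0.80, 0.39)
print(f"Any schema: {any_schema_lift:.2f}x")
```

On those two figures, schema of any kind is roughly twice as common on cited pages as on the web at large.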

The FAQ effect

Average citations a page earns, by how it handles FAQs.

+45% more citations with FAQ schema + content
Neither · 25.4
Content only · 27.2
Schema + content · 36.9

The schema alone isn’t the trick — it’s real Q&A content marked up so a model can lift the answer cleanly.
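The +45% headline follows from the three averages above. A minimal sketch of the arithmetic, with the dictionary keys and the `uplift_pct` helper chosen here for illustration:

```python
# Average citations per page, by FAQ handling, as published on this page.
AVG_CITATIONS = {
    "neither": 25.4,
    "content_only": 27.2,
    "schema_and_content": 36.9,
}


def uplift_pct(value: float, baseline: float) -> float:
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100


baseline = AVG_CITATIONS["neither"]
for name, value in AVG_CITATIONS.items():
    print(f"{name}: {uplift_pct(value, baseline):+.0f}% vs neither")
```

Measured against pages with FAQ content but no markup, rather than against pages with neither, the same averages give an uplift of about 36%.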

The most-cited pages

Real pages AI reaches for most, with the schema each one ships. Concrete proof of the patterns above.

# · Page · Citations
1 · softwarefinder.com · 218
2 · rankmyagent.com · 174
3 · collegenet.com · 123
4 · dotcom-monitor.com · 111
5 · runnersworld.com · 82
6 · g-co.agency · 80
7 · iiba.org · 80
8 · milanote.com · 79
9 · offers.hubspot.com · 75
10 · dash.dropbox.com · 75
11 · nokia.com · 72
12 · ehrinpractice.com · 72

Methodology

The crawl-to-cite efficiency comes from matching 337K citations against 11M crawler visits and classifying each page by type. The blueprint and schema lifts come from 1.5K AI-cited pages across 950 domains, compared with open-web baselines. Efficiency is citation share ÷ crawl share; lift is how much more common a signal is on cited pages.
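The matching step described above can be sketched end to end on toy data. Everything in this example is an assumption made for illustration: the path-prefix rules in `RULES`, the `classify` function and the sample URLs are invented, and the page does not describe how the real classifier works.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical path rules; the real page-type classifier is not described here.
RULES = [
    ("/blog", "Blog / Editorial"),
    ("/docs", "Documentation"),
    ("/help", "Help / FAQ"),
    ("/product", "Product Pages"),
]


def classify(url: str) -> str:
    """Assign a page type from the URL path (toy heuristic)."""
    path = urlparse(url).path
    if path in ("", "/"):
        return "Homepage"
    for prefix, page_type in RULES:
        if path.startswith(prefix):
            return page_type
    return "Other"


def shares(urls):
    """Fraction of events falling into each page type."""
    counts = Counter(classify(u) for u in urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: n / total for t, n in counts.items()}


def crawl_to_cite(crawled_urls, cited_urls):
    """Efficiency per page type: citation share divided by crawl share."""
    crawl = shares(crawled_urls)
    cite = shares(cited_urls)
    # Only types that were crawled at least once appear, so there is no
    # division by zero.
    return {t: cite.get(t, 0.0) / share for t, share in crawl.items()}


# Toy events, invented for illustration only.
crawled = [
    "https://example.com/",
    "https://example.com/product/a",
    "https://example.com/product/b",
    "https://example.com/blog/post-1",
]
cited = [
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-1",
    "https://example.com/product/a",
    "https://example.com/",
]
result = crawl_to_cite(crawled, cited)
for page_type, ratio in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{page_type}: {ratio:.2f}x")
```

In this toy data the blog post is a quarter of crawls but half of citations, so its efficiency is 2×, while product pages come out at 0.5×.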

Trakkr Content Index · CC BY 4.0

Common questions

What kind of pages does AI cite most?

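AI cites a narrower slice of the web than it crawls. In this dataset, review and directory pages, about and contact pages, and resource or report pages convert crawls into citations at the highest rates (about 3.8× to 4.8× their crawl share), and blog posts take the largest citation share of any listed page type at 20.2%. Product, category and help pages are crawled heavily but cited far less often.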

Does schema markup help you get cited by AI?

In Trakkr’s data, AI-cited pages carry structured data more often than the open-web baseline, and some schema types show a measurable lift. Schema is not a magic switch, but it correlates with being cited.

What is the difference between being crawled and being cited?

Crawling means an AI bot fetched your page; citation means a model actually used it as a source in an answer. This dataset measures the gap — crawl share versus citation share — by page type.
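Question-and-answer content like the section above is what FAQ schema describes. As a minimal sketch, the snippet below builds schema.org FAQPage JSON-LD from question and answer pairs; the helper name `faq_jsonld` is an assumption, and the single pair is taken from this page. It illustrates the markup format only and does not imply that adding it guarantees citations.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


pairs = [
    (
        "What is the difference between being crawled and being cited?",
        "Crawling means an AI bot fetched your page; citation means a model "
        "actually used it as a source in an answer.",
    ),
]

markup = json.dumps(faq_jsonld(pairs), indent=2)
print(markup)
```

The resulting JSON is typically embedded in the page inside a script element with type application/ld+json, alongside the visible question and answer text it describes.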