
Crawler Tracking

Connect Cloudflare, Vercel, Netlify, Next.js, CloudFront, WordPress, Node, Nginx, or webhook-based edge stacks to send AI crawler visits into Trakkr - or install the lightweight tracking pixel.

6 min read · Updated Apr 9, 2026
What you'll learn
  • Pick the right capture path for your site - tracking pixel or server-side connection
  • Connect Cloudflare, Vercel, Netlify, Next.js, AWS CloudFront, WordPress, Node / Express, Nginx / OpenResty, or Akamai / Fastly / other webhook sources
  • Verify the pipeline works in 30 seconds with the synthetic verification ping
  • Backfill historical visits automatically when you connect a server-side platform

This page is the install hub. If you want to understand what crawler tracking is and how to read your data, start with AI Crawlers. If you're here to wire it up, you're in the right place.

There are two paths to get crawler data into Trakkr. You only need one, but you can run both - Trakkr deduplicates events by hash so you won't see double counts.
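
To make the deduplication idea concrete, here is a minimal sketch of hash-based dedup. It is not Trakkr's actual implementation: the `CrawlerVisit` shape, the choice of fields that go into the key, and the FNV-1a-style hash are all assumptions chosen to keep the example dependency-free. It also assumes both sources report the same timestamp for a visit.

```typescript
// Hypothetical event shape; Trakkr's real schema is not documented here.
interface CrawlerVisit {
  bot: string;       // e.g. "GPTBot"
  url: string;       // requested URL
  timestamp: number; // Unix seconds
  source: "pixel" | "server";
}

// 32-bit FNV-1a-style hash over UTF-16 code units (illustrative only).
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// The key deliberately ignores `source`, so the same visit reported by the
// pixel and by a server-side connection collapses into one event.
function dedupKey(visit: CrawlerVisit): string {
  return fnv1a(`${visit.bot}|${visit.url}|${visit.timestamp}`);
}

// Keep the first event seen for each key and drop later duplicates.
function dedupe(visits: CrawlerVisit[]): CrawlerVisit[] {
  const seen = new Set<string>();
  const unique: CrawlerVisit[] = [];
  for (const visit of visits) {
    const key = dedupKey(visit);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(visit);
    }
  }
  return unique;
}
```

A production system would likely use a stronger hash and tolerate small timestamp differences between sources; the sketch only shows why running both paths need not double-count.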


Path 1: Tracking pixel

The fastest setup. A small JavaScript snippet that runs in your visitors' browsers, detects AI bot user agents, and forwards them to Trakkr. Two minutes from copy to verified.

Best for: marketing sites, Webflow, Framer, Carrd, simple WordPress sites, anywhere you can drop a script tag in the <head>.

Install in three steps

  1. Open Crawler in the sidebar
  2. In the empty state, click Install tracking to reveal your unique snippet
  3. Paste the snippet into your site's <head> element and deploy

HTML
<script async src="https://pixel.trakkr.ai/t.js" data-id="YOUR_TRACKING_ID"></script>

The pixel is under 1 KB and asynchronous - it won't affect your page load time.

Verify the install

In Trakkr, click Send Verification. This fetches your homepage with a GPTBot user agent and confirms the detection pipeline is healthy. You'll see a "Verified ✓" event in the Feed within about 30 seconds.

If the verification event doesn't appear:

  • Make sure you deployed your site after pasting the snippet
  • Check that your CDN (Cloudflare, Fastly) isn't stripping the script tag
  • Confirm the snippet is in the <head>, not the <body>

What the pixel can and can't see

The pixel runs in the browser, so it only catches crawlers that execute JavaScript. Most modern AI bots do, but a few don't - and any bot that gets blocked at the WAF before reaching your page will never trigger the pixel.

If you want to catch every visit (including bots that don't run JS, or bots that get rate-limited at the edge), use a server-side connection instead.
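
Both paths depend on recognising AI bots by their user agent. A minimal sketch of that check follows. The token list is illustrative and incomplete, and it is an assumption rather than the list Trakkr actually matches against.

```typescript
// Illustrative, incomplete list of user-agent tokens used by AI crawlers.
const AI_BOT_TOKENS: string[] = [
  "GPTBot",
  "ChatGPT-User",
  "OAI-SearchBot",
  "ClaudeBot",
  "PerplexityBot",
  "CCBot",
  "Bytespider",
];

// Returns the first matching bot token, or null for ordinary traffic.
// Matching is a case-insensitive substring test on the User-Agent header.
function detectAiBot(userAgent: string): string | null {
  const ua = userAgent.toLowerCase();
  for (const token of AI_BOT_TOKENS) {
    if (ua.includes(token.toLowerCase())) {
      return token;
    }
  }
  return null;
}
```

In the browser this kind of check would run against `navigator.userAgent`, which is why a bot that never executes JavaScript never reaches it; on the server the same check runs against the request header for every hit.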


Path 2: Server-side platform connections

Server-side integrations read from your CDN, host, or log drain directly. They see every request that reaches your edge or origin - including non-JS bots, bots blocked at the edge, and 404s on URLs that no longer exist.

Platform                | Auth             | Realtime | Plan requirements
Cloudflare              | API token        | No       | All Cloudflare plans
Vercel                  | OAuth            | Yes      | Vercel Pro or Enterprise
Netlify                 | OAuth            | Yes      | All Netlify plans
Next.js self-hosted     | Webhook          | Yes      | Any self-hosted Next.js deployment
AWS CloudFront          | Webhook          | Yes      | Lambda@Edge on any CloudFront distribution
WordPress               | Existing adapter | No       | Trakkr WordPress plugin
Node / Express          | Webhook          | Yes      | Any Node or Express server
Nginx / OpenResty       | Webhook          | Yes      | OpenResty or nginx with a log shipper
Akamai / Fastly / Other | Webhook          | Yes      | Any CDN or edge stack that can POST visits

Dedicated guides exist for Cloudflare, Vercel, Netlify, and WordPress. The other webhook-based runtimes are configured directly in the in-app setup flow with copy-paste templates:

  • Cloudflare Setup - Create a scoped read-only API token in Cloudflare and paste it into Trakkr
  • Vercel Setup - OAuth into Vercel and let Trakkr install a Log Drain
  • Netlify Setup - OAuth into Netlify and let Trakkr deploy an Edge Function
  • WordPress Setup - Enable crawler tracking on a connected WordPress site through the Trakkr plugin
  • Next.js self-hosted - Copy the Proxy or middleware snippet from the setup flow and redeploy
  • AWS CloudFront - Copy the Lambda@Edge template and attach it to Origin Request in CloudFront
  • Node / Express - Copy the Express middleware snippet and mount it near the top of your app
  • Nginx / OpenResty - Copy the OpenResty log hook or ship JSON access logs into the webhook
  • Akamai / Fastly / Other - Use the webhook examples for Akamai DataStream, Fastly log streaming, or your own edge forwarder

Webhook runtimes and edge forwarders

If you're on Next.js, CloudFront, Express, Nginx / OpenResty, Akamai, Fastly, or a custom server, you can still get server-side tracking via Trakkr's webhook ingest path.

  1. Open Crawler → Connect platform
  2. Choose Next.js self-hosted, AWS CloudFront, Node / Express, Nginx / OpenResty, or Akamai / Fastly / Other
  3. Trakkr generates a unique webhook URL and bearer token for your brand
  4. Configure your runtime, CDN log forwarder, or middleware to POST AI crawler visits to that URL
  5. Use the dry-run validation endpoint or the built-in verification step to test before going live
  6. Once events are flowing, the connection switches to "Active"

Trakkr ships starter templates for Next.js Proxy, Express middleware, OpenResty log hooks, Lambda@Edge, Akamai DataStream, Fastly log streaming, and a generic webhook example directly in the dashboard.

Tip
Treat the bearer token as a secret. Anyone with the token and the webhook URL can post events to your brand's stream. If a token leaks, rotate it from the connection settings.
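
The webhook flow above can be sketched as an Express-style middleware. This is not the template Trakkr ships: the payload fields, the `trakkrMiddleware` and `buildVisitPost` names, and the short bot pattern are hypothetical, so copy the real shape from the dashboard. The only parts taken from this page are the POST to a webhook URL and the bearer token.

```typescript
// Minimal structural types so the sketch compiles without Express installed;
// Express request objects are expected to satisfy them.
interface Req {
  method: string;
  originalUrl: string;
  headers: Record<string, string | string[] | undefined>;
}
type Next = () => void;

interface VisitPost {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the authenticated POST for one request, or null for non-bot traffic.
// Payload field names are assumptions, not Trakkr's documented schema.
function buildVisitPost(webhookUrl: string, token: string, req: Req, now: Date = new Date()): VisitPost | null {
  const rawUa = req.headers["user-agent"];
  const ua = Array.isArray(rawUa) ? rawUa.join(" ") : rawUa ?? "";
  if (!/GPTBot|ClaudeBot|PerplexityBot/i.test(ua)) return null;
  return {
    url: webhookUrl,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${token}` },
      body: JSON.stringify({
        path: req.originalUrl,
        method: req.method,
        userAgent: ua,
        timestamp: now.toISOString(),
      }),
    },
  };
}

// `send` is injected so the middleware never waits on the network and so the
// token stays in server-side configuration rather than in source code.
function trakkrMiddleware(webhookUrl: string, token: string, send: (post: VisitPost) => void) {
  return (req: Req, _res: unknown, next: Next): void => {
    const post = buildVisitPost(webhookUrl, token, req);
    if (post) send(post);
    next(); // always continue to the real route handlers
  };
}
```

In a real app, `send` could wrap `fetch(post.url, post.init)` with its errors swallowed, and the middleware would be mounted near the top of the app with the URL and token read from environment variables.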

Choosing your path

If you...                                           | Use
Want the absolute fastest setup                     | Tracking pixel
Run a server-rendered or SSG site                   | Server-side connection
Are behind Cloudflare with no other constraints     | Cloudflare server-side
Use Vercel or Netlify hosting                       | Their respective OAuth flow
Self-host on Next.js, Node, or Nginx                | The matching first-class webhook runtime
Run on Akamai, Fastly, or an unsupported edge stack | Akamai / Fastly / Other
Want the most accurate data                         | Any server-side connection
Have a JavaScript-rendered SPA without SSR          | Server-side connection (the pixel can still miss bots that don't run JS)

You can connect more than one. Trakkr deduplicates events at ingest, so running the pixel and a Cloudflare connection on the same site is safe.


Verifying your setup

Whichever path you chose, verify it the same way:

  1. Open Crawler in the sidebar
  2. Click Send Verification in the header
  3. Wait ~30 seconds and refresh the Feed

You should see a "Verified ✓" event appear with GPTBot as the bot name. This confirms the entire pipeline (your site → Trakkr's ingest → BigQuery → the dashboard) is working.

If the synthetic event arrives but real crawler events are still empty after 24 hours, the issue is upstream of Trakkr - usually a robots.txt block, a WAF rule, or a DNS misconfiguration. Check the Access tab for findings.


Connection management

Once a connection is live, you can manage it from the Crawler dashboard.

Action     | What it does
Sync now   | Manually pull recent visits from the platform
Backfill   | Re-sync a wider time window (clears the dedup ledger for that period)
View logs  | See the last N sync attempts with status, visit counts, and error details
Pause      | Stop syncing without disconnecting (useful during maintenance windows)
Disconnect | Remove the connection entirely. Cleans up Vercel Log Drains automatically

Each connection has a health indicator showing Active, Pending, Error, or Paused. Errors include the underlying message - usually expired credentials or a permission change on the platform side.


Troubleshooting

"No crawler data showing" after install

  1. Click Send Verification to confirm the pipeline works
  2. If verification works but real visits don't appear, check your robots.txt for AI bot blocks
  3. Check your CDN's bot management or WAF for rules that might be blocking the bots before they reach your site
  4. Wait at least 24 hours - some AI bots crawl on a weekly cycle and may not have visited yet

"Pixel installed but verification fails"

  • Confirm the script is in the <head>, not the <body>
  • Check that your CDN isn't stripping or rewriting the script tag
  • Open browser dev tools → Network and confirm pixel.trakkr.ai/t.js loads with status 200
  • Try the synthetic verification from an incognito window to rule out extension interference
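
The first two checks above can be scripted against your deployed HTML. The sketch below is a rough regex-based check, not a real HTML parser, and `findPixelPlacement` is a hypothetical helper name. It only looks for the script URL shown in the install snippet and compares its position with the closing head tag.

```typescript
type PixelPlacement = "head" | "body" | "missing";

// Reports where the Trakkr script tag sits in a page's HTML source.
// "missing" usually means the deploy did not go out or a CDN stripped the tag.
function findPixelPlacement(html: string): PixelPlacement {
  const scriptTag = /<script\b[^>]*\bsrc=["']https:\/\/pixel\.trakkr\.ai\/t\.js["'][^>]*>/i;
  const match = scriptTag.exec(html);
  if (!match) return "missing";
  const headEnd = html.search(/<\/head\s*>/i);
  // Without a closing head tag, treat the script as outside the head.
  return headEnd !== -1 && match.index < headEnd ? "head" : "body";
}
```

Run it on the HTML your CDN actually serves (for example, the output of view-source), not on your local template, so that any stripping or rewriting at the edge shows up.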

"Connection shows Error status"

  • Open the connection's logs in the Connections panel
  • Look for "401 Unauthorized" - usually means the platform credentials expired or were revoked
  • For OAuth connections (Vercel, Netlify), reconnect to refresh the token
  • For Cloudflare, regenerate the API token if it has been deleted on the Cloudflare side

"I see verification visits but no real crawls"

  • Open the Access tab and check for blocking findings
  • Look at your robots.txt for Disallow: / rules under AI bot user agents
  • Check your CDN for bot management rules that may be challenging or blocking AI bots
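
The robots.txt check can be automated with a small parser. This is a simplified sketch under stated assumptions: it only detects a full-site `Disallow: /`, ignores `Allow` overrides and partial paths, and falls back to the `*` group when a bot has no group of its own. `botsBlockedByRobots` is a hypothetical helper, not part of Trakkr.

```typescript
// Returns the subset of `bots` that a robots.txt blocks from the whole site.
function botsBlockedByRobots(robotsTxt: string, bots: string[]): string[] {
  // Lower-cased user-agent token -> true when one of its groups has "Disallow: /".
  const fullBlock = new Map<string, boolean>();
  let agents: string[] = [];
  let inRules = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      // A User-agent line that follows rules starts a new group.
      if (inRules) {
        agents = [];
        inRules = false;
      }
      const agent = value.toLowerCase();
      agents.push(agent);
      if (!fullBlock.has(agent)) fullBlock.set(agent, false);
    } else if (field === "disallow" || field === "allow") {
      inRules = true;
      if (field === "disallow" && value === "/") {
        for (const agent of agents) fullBlock.set(agent, true);
      }
    }
  }

  return bots.filter((bot) => {
    const name = bot.toLowerCase();
    const blocked = fullBlock.has(name) ? fullBlock.get(name) : fullBlock.get("*");
    return blocked === true;
  });
}
```

Feed it the body of `https://your-site/robots.txt` and the bot names you expect to see; any name it returns will not crawl the site no matter how the tracking is installed.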

Next steps

AI Crawlers

Read the dashboard - hero stats, the page funnel, and AI insights.

Cloudflare Setup

Connect a Cloudflare zone in under five minutes.

JavaScript Rendering

Make sure AI crawlers can read your client-rendered pages.
