Crawler Tracking
Connect Cloudflare, Vercel, Netlify, Next.js, CloudFront, WordPress, Node, Nginx, or webhook-based edge stacks to send AI crawler visits into Trakkr - or install the lightweight tracking pixel.
- Pick the right capture path for your site - tracking pixel or server-side connection
- Connect Cloudflare, Vercel, Netlify, Next.js, AWS CloudFront, WordPress, Node / Express, Nginx / OpenResty, or Akamai / Fastly / other webhook sources
- Verify the pipeline works in 30 seconds with the synthetic verification ping
- Backfill historical visits automatically when you connect a server-side platform
This page is the install hub. If you want to understand what crawler tracking is and how to read your data, start with AI Crawlers. If you're here to wire it up, you're in the right place.
There are two paths to get crawler data into Trakkr. You only need one, but you can run both - Trakkr deduplicates events by hash so you won't see double counts.
Path 1: Tracking pixel
The fastest setup. A small JavaScript snippet that runs in your visitors' browsers, detects AI bot user agents, and forwards them to Trakkr. Two minutes from copy to verified.
Best for: marketing sites, Webflow, Framer, Carrd, simple WordPress sites, anywhere you can drop a script tag in the `<head>`.

Install in three steps
1. Open Crawler in the sidebar
2. In the empty state, click Install tracking to reveal your unique snippet
3. Paste the snippet into your site's `<head>` element and deploy
```html
<script async src="https://pixel.trakkr.ai/t.js" data-id="YOUR_TRACKING_ID"></script>
```

The pixel is under 1 KB and loads asynchronously - it won't affect your page load time.
Verify the install
In Trakkr, click Send Verification. This fetches your homepage with a GPTBot user agent and confirms the detection pipeline is healthy. You'll see a "Verified ✓" event in the Feed within about 30 seconds.
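Conceptually, the synthetic ping is just a fetch of your homepage with a GPTBot user agent. A minimal sketch in Node, assuming a representative GPTBot UA string (the exact string Trakkr sends may differ):

```javascript
// Illustrative sketch of the synthetic verification ping. The UA string
// below is representative of GPTBot - the exact string Trakkr sends is
// an assumption.
const GPTBOT_UA =
  "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.0; +https://openai.com/gptbot";

function buildVerificationRequest(homepage) {
  return {
    url: homepage,
    options: {
      headers: { "User-Agent": GPTBOT_UA },
      redirect: "follow", // follow the same redirects a real crawler would
    },
  };
}

// const { url, options } = buildVerificationRequest("https://example.com/");
// await fetch(url, options); // exercises the same path as a real GPTBot visit
```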
If the verification event doesn't appear:
- Make sure you deployed your site after pasting the snippet
- Check that your CDN (Cloudflare, Fastly) isn't stripping the script tag
- Confirm the snippet is in the `<head>`, not the `<body>`
What the pixel can and can't see
The pixel runs in the browser, so it only catches crawlers that execute JavaScript. Most modern AI bots do, but a few don't - and any bot that gets blocked at the WAF before reaching your page will never trigger the pixel.
If you want to catch every visit (including bots that don't run JS, or bots that get rate-limited at the edge), use a server-side connection instead.
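Under the hood, the pixel's detection step amounts to matching `navigator.userAgent` against a list of known AI bot patterns. A simplified sketch - the real snippet's bot list and reporting endpoint are Trakkr's own; the patterns and the `sendBeacon` call below are illustrative only:

```javascript
// Rough sketch of the pixel's detection step. This bot list is a sample,
// not Trakkr's full list, and the reporting call shown in the comment
// is illustrative.
const AI_BOT_PATTERNS = [
  /GPTBot/i, /OAI-SearchBot/i, /ChatGPT-User/i,
  /ClaudeBot/i, /PerplexityBot/i, /Google-Extended/i,
];

function detectAiBot(userAgent) {
  const match = AI_BOT_PATTERNS.find((p) => p.test(userAgent));
  return match ? match.source : null; // e.g. "GPTBot", or null for human visitors
}

// In the browser, the pixel would then do something like:
//   const bot = detectAiBot(navigator.userAgent);
//   if (bot) navigator.sendBeacon(ENDPOINT, JSON.stringify({ bot, path: location.pathname }));
```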
Path 2: Server-side platform connections
Server-side integrations read from your CDN, host, or log drain directly. They see every request that hits your origin - including non-JS bots, blocked bots, and 404s on URLs that no longer exist.
| Platform | Auth | Realtime | Plan requirements |
|---|---|---|---|
| Cloudflare | API token | No | All Cloudflare plans |
| Vercel | OAuth | Yes | Vercel Pro or Enterprise |
| Netlify | OAuth | Yes | All Netlify plans |
| Next.js self-hosted | Webhook | Yes | Any self-hosted Next.js deployment |
| AWS CloudFront | Webhook | Yes | Lambda@Edge on any CloudFront distribution |
| WordPress | Existing adapter | No | Trakkr WordPress plugin |
| Node / Express | Webhook | Yes | Any Node or Express server |
| Nginx / OpenResty | Webhook | Yes | OpenResty or nginx with a log shipper |
| Akamai / Fastly / Other | Webhook | Yes | Any CDN or edge stack that can POST visits |
Dedicated guides exist for Cloudflare, Vercel, Netlify, and WordPress. The other webhook-based runtimes are configured directly in the in-app setup flow with copy-paste templates:
- Cloudflare Setup - Create a scoped read-only API token in Cloudflare and paste it into Trakkr
- Vercel Setup - OAuth into Vercel and let Trakkr install a Log Drain
- Netlify Setup - OAuth into Netlify and let Trakkr deploy an Edge Function
- WordPress Setup - Enable crawler tracking on a connected WordPress site through the Trakkr plugin
- Next.js self-hosted - Copy the Proxy or middleware snippet from the setup flow and redeploy
- AWS CloudFront - Copy the Lambda@Edge template and attach it to Origin Request in CloudFront
- Node / Express - Copy the Express middleware snippet and mount it near the top of your app
- Nginx / OpenResty - Copy the OpenResty log hook or ship JSON access logs into the webhook
- Akamai / Fastly / Other - Use the webhook examples for Akamai DataStream, Fastly log streaming, or your own edge forwarder
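For the Node / Express path, the middleware pattern looks roughly like this. It is a hedged sketch, not the official template: the webhook URL, bearer token, bot list, and payload field names are placeholders for the values the in-app setup flow generates:

```javascript
// Hedged sketch of the Express middleware pattern. The webhook URL, token,
// bot list, and payload field names are placeholders - use the template
// Trakkr generates in the setup flow.
const TRAKKR_WEBHOOK_URL = process.env.TRAKKR_WEBHOOK_URL;
const TRAKKR_TOKEN = process.env.TRAKKR_TOKEN;

const AI_BOTS = /GPTBot|ClaudeBot|PerplexityBot|OAI-SearchBot|Google-Extended/i;

function trakkrCrawlerMiddleware(req, res, next) {
  const ua = req.headers["user-agent"] || "";
  const match = ua.match(AI_BOTS);
  if (match && TRAKKR_WEBHOOK_URL) {
    // Fire-and-forget: never block or fail the real request over tracking.
    fetch(TRAKKR_WEBHOOK_URL, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${TRAKKR_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        bot: match[0],
        path: req.originalUrl,
        userAgent: ua,
        timestamp: new Date().toISOString(),
      }),
    }).catch(() => {});
  }
  next();
}

// Mount it near the top of the app so every request is seen:
// app.use(trakkrCrawlerMiddleware);
```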
Webhook runtimes and edge forwarders
If you're on Next.js, CloudFront, Express, Nginx / OpenResty, Akamai, Fastly, or a custom server, you can still get server-side tracking via Trakkr's webhook ingest path.
1. Open Crawler → Connect platform
2. Choose Next.js self-hosted, AWS CloudFront, Node / Express, Nginx / OpenResty, or Akamai / Fastly / Other
3. Trakkr generates a unique webhook URL and bearer token for your brand
4. Configure your runtime, CDN log forwarder, or middleware to POST AI crawler visits to that URL
5. Use the dry-run validation endpoint or the built-in verification step to test before going live
6. Once events are flowing, the connection switches to "Active"
Trakkr ships starter templates for Next.js Proxy, Express middleware, OpenResty log hooks, Lambda@Edge, Akamai DataStream, Fastly log streaming, and a generic webhook example directly in the dashboard.
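For log-shipper paths like Nginx / OpenResty, the forwarder's job is to map access-log entries to webhook payloads. A sketch assuming a JSON access-log format with nginx-style variable names (`http_user_agent`, `request_uri`, `time_iso8601`); the output field names are assumptions to match against the schema shown in the setup flow:

```javascript
// Sketch of a log-shipper forwarder for JSON access logs. Input field names
// assume nginx-style variables; output payload fields are assumptions -
// match both to your log format and the in-app schema.
const AI_BOTS = /GPTBot|ClaudeBot|PerplexityBot|OAI-SearchBot/i;

function logLineToVisit(line) {
  const entry = JSON.parse(line);
  const ua = entry.http_user_agent || "";
  const match = ua.match(AI_BOTS);
  if (!match) return null; // only forward AI crawler hits
  return {
    bot: match[0],
    path: entry.request_uri,
    userAgent: ua,
    status: entry.status,
    timestamp: entry.time_iso8601,
  };
}

// Each non-null visit would then be POSTed to the generated webhook URL:
// await fetch(WEBHOOK_URL, {
//   method: "POST",
//   headers: { Authorization: `Bearer ${TOKEN}`, "Content-Type": "application/json" },
//   body: JSON.stringify(visit),
// });
```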
Choosing your path
| If you... | Use |
|---|---|
| Want the absolute fastest setup | Tracking pixel |
| Run a server-rendered or SSG site | Server-side connection |
| Are behind Cloudflare with no other constraints | Cloudflare server-side |
| Use Vercel or Netlify hosting | Their respective OAuth flow |
| Self-host on Next.js, Node, or Nginx | The matching first-class webhook runtime |
| Run on Akamai, Fastly, or an unsupported edge stack | Akamai / Fastly / Other |
| Want the most accurate data | Any server-side connection |
| Have a JavaScript-rendered SPA without SSR | Server-side connection (the pixel can still miss bots that don't run JS) |
You can connect more than one. Trakkr deduplicates events at ingest, so running the pixel and a Cloudflare connection on the same site is safe.
Verifying your setup
Whichever path you chose, verify it the same way:
1. Open Crawler in the sidebar
2. Click Send Verification in the header
3. Wait ~30 seconds and refresh the Feed
You should see a "Verified ✓" event appear with GPTBot as the bot name. This confirms the entire pipeline (your site → Trakkr's ingest → BigQuery → the dashboard) is working.
If the synthetic event arrives but real crawler events are still empty after 24 hours, the issue is upstream of Trakkr - usually a robots.txt block, a WAF rule, or a DNS misconfiguration. Check the Access tab for findings.
Connection management
Once a connection is live, you can manage it from the Crawler dashboard.
| Action | What it does |
|---|---|
| Sync now | Manually pull recent visits from the platform |
| Backfill | Re-sync a wider time window (clears the dedup ledger for that period) |
| View logs | See the last N sync attempts with status, visit counts, and error details |
| Pause | Stop syncing without disconnecting (useful during maintenance windows) |
| Disconnect | Remove the connection entirely. Cleans up Vercel Log Drains automatically |
Each connection has a health indicator showing Active, Pending, Error, or Paused. Errors include the underlying message - usually expired credentials or a permission change on the platform side.
Troubleshooting
"No crawler data showing" after install
1. Click Send Verification to confirm the pipeline works
2. If verification works but real visits don't appear, check your `robots.txt` for AI bot blocks
3. Check your CDN's bot management or WAF for rules that might be blocking the bots before they reach your site
4. Wait 24 hours - some AI bots crawl on a weekly cycle and may not have visited yet
"Pixel installed but verification fails"
- Confirm the script is in the `<head>`, not the `<body>`
- Check that your CDN isn't stripping or rewriting the script tag
- Open browser dev tools → Network and confirm `pixel.trakkr.ai/t.js` loads with status 200
- Try the synthetic verification from an incognito window to rule out extension interference
"Connection shows Error status"
- Open the connection's logs in the Connections panel
- Look for "401 Unauthorized" - usually means the platform credentials expired or were revoked
- For OAuth connections (Vercel, Netlify), reconnect to refresh the token
- For Cloudflare, regenerate the API token if it has been deleted on the Cloudflare side
"I see verification visits but no real crawls"
- Open the Access tab and check for blocking findings
- Look at your `robots.txt` for `Disallow: /` rules under AI bot user agents
Next steps
AI Crawlers
Read the dashboard - hero stats, the page funnel, and AI insights.
Cloudflare Setup
Connect a Cloudflare zone in under five minutes.
JavaScript Rendering
Make sure AI crawlers can read your client-rendered pages.