cycleexchange.co.uk — AI Site Grade
Cycle Exchange's Cloudflare JS challenge blocks all AI crawlers, making its sophisticated agent-commerce documentation and entire inventory invisible to LLMs.
Cycle Exchange has a comprehensive llms.txt and robots.txt for AI agents, but every page returns a 403 Cloudflare challenge, blocking all crawlers and leaving LLMs with outdated, inaccurate knowledge of the brand.
- Findings: 12
- Evidence checks: 38
- Completed: 30 May 2026
Analysis
Cycle Exchange is a Shopify-hosted retailer of premium pre-owned bikes based in Kingston upon Thames, UK, but every page, including the homepage, is locked behind a Cloudflare JS challenge that blocks all AI crawlers, all search bots, and even browser UAs from programmatic access. The site has an impressively detailed llms.txt and robots.txt with UCP/MCP agent-commerce instructions, yet none of those endpoints are actually reachable by the agents they're written for. Cold LLM knowledge describes the brand as a "marketplace similar to eBay", a fundamental positioning error that the site's own content contradicts but that no AI can correct while the inventory remains uncrawlable.
Crawler Access
Every AI bot — GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, Bytespider, Applebot-Extended — receives a 403 Cloudflare JS challenge page on the homepage and every sub-page. The robots.txt uses the generic * wildcard with Allow: / and no AI-specific directives, but the Cloudflare "Under Attack" or bot-fight mode overrides it entirely. The sitemap.xml, agents.md, llms.txt, and /.well-known/ucp all return the same Cloudflare challenge wall. The site is effectively invisible to every AI crawler despite having the most sophisticated agent-commerce documentation of any Shopify store examined.
Cold-Knowledge Gap
The LLM describes Cycle Exchange as "a UK-based online marketplace for buying and selling second-hand bicycles... operates similarly to eBay." The actual site is a retailer — it buys, refurbishes, warranties, and sells premium bikes (Pinarello, Specialized, Trek) with a 12-month warranty, Cytech-accredited mechanics, and white-glove hand delivery. The model has no knowledge of the Kingston storefront, the workshop/cafe, the Friday Chat Laps community rides, the Dani Rowe partnership, or the Tour of Britain official partnership. The gap between "eBay-like marketplace" and "certified refurbished premium bike retailer" is the single largest AI-visibility problem.
Schema Posture
The homepage carries Organization, WebSite, and BreadcrumbList schema — but no Product schema, no FAQPage schema despite FAQ content, and no LocalBusiness schema despite having a physical store at 27 Sury Basin, Kingston. The Wayback snapshot from 2022 shows the same sparse schema. Product pages (which are the core inventory) are entirely uncrawlable, so any product schema they may contain is invisible to AI.
External Signals
The site links to Trustpilot on the homepage, but Trustpilot's own page blocks programmatic access. Web search returned zero indexed results for the domain across multiple queries — suggesting Google may have already deindexed or deprioritized the site due to the Cloudflare wall. The Wayback Machine shows the site has existed since at least 2018 with consistent content, but the current Cloudflare configuration has effectively erased it from the AI-accessible web.
Findings
Cloudflare JS challenge blocks all AI crawlers from every page (Severity: High)
Every page on cycleexchange.co.uk, including the homepage, product pages, collections, sitemap.xml, llms.txt, and agents.md, returns a 403 Cloudflare JS challenge page to all AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, etc.). The robots.txt allows all bots, but robots.txt is only advisory: the Cloudflare challenge (apparently "Under Attack" or bot-fight mode) is enforced before any request reaches the site, so the site is completely invisible to AI.
What to change: Disable Cloudflare's bot-fight mode or configure it to allow known AI crawler user agents (GPTBot, ClaudeBot, PerplexityBot, etc.) to access the site. Ensure the Cloudflare WAF does not challenge these bots.
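One way to confirm the fix is to re-run a crawler check after the Cloudflare settings change. The sketch below is illustrative and is not the method used to produce this report. The user-agent strings are shortened stand-ins (an assumption; real crawlers send longer, vendor-documented strings), and the detection heuristics (a 403 or 503 status together with a `cf-mitigated: challenge` header or the "Just a moment..." interstitial title) are assumptions about how Cloudflare challenge pages typically look.

```python
import urllib.error
import urllib.request

# Shortened, illustrative user-agent tokens (assumption: replace with the
# full strings documented by each crawler vendor before relying on results).
AI_USER_AGENTS = {
    "GPTBot": "GPTBot/1.0",
    "ClaudeBot": "ClaudeBot/1.0",
    "PerplexityBot": "PerplexityBot/1.0",
}


def looks_like_cloudflare_challenge(status, headers, body):
    """Heuristically decide whether a response is a Cloudflare challenge page."""
    if status not in (403, 503):
        return False
    lowered = {key.lower(): value.lower() for key, value in headers.items()}
    if lowered.get("cf-mitigated") == "challenge":
        return True
    served_by_cloudflare = "cloudflare" in lowered.get("server", "")
    return served_by_cloudflare and "just a moment" in body.lower()


def probe(url, user_agent, timeout=10):
    """Fetch url with the given user agent and return (status, challenged)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
            headers = dict(response.headers)
            body = response.read(4096).decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions but still carry headers and a body.
        status = error.code
        headers = dict(error.headers or {})
        body = error.read(4096).decode("utf-8", errors="replace")
    return status, looks_like_cloudflare_challenge(status, headers, body)
```

Calling `probe("https://www.cycleexchange.co.uk/llms.txt", AI_USER_AGENTS["GPTBot"])` for each agent before and after the change should show the `challenged` flag flipping from true to false if the fix worked. A spoofed user agent may be treated differently from a verified crawler, so a clean result here is a useful signal rather than proof.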
llms.txt and agents.md are unreachable by AI agents (Severity: High)
The site hosts a detailed llms.txt (4334 bytes) and an agents.md file, but both return 403 Cloudflare challenge pages when accessed by AI crawlers. The agent-commerce documentation is effectively useless because no agent can retrieve it.
What to change: Ensure llms.txt and agents.md are served without Cloudflare challenge to known AI crawler user agents. These files should be publicly accessible.
Cold LLM knowledge misrepresents Cycle Exchange as an eBay-like marketplace (Severity: High)
LLMs describe Cycle Exchange as a 'UK-based online marketplace for buying and selling second-hand bicycles... operates similarly to eBay.' The actual business is a premium pre-owned bike retailer with a physical store, workshop, warranties, and community events. The inaccurate positioning stems from the site being uncrawlable, so LLMs rely on outdated or generic descriptions.
What to change: Make the site crawlable by AI bots so LLMs can index accurate content about the business model, services, and location.
Missing LocalBusiness schema for physical store (Severity: Medium)
The homepage includes Organization, WebSite, and BreadcrumbList schema but lacks LocalBusiness schema despite having a physical store at 27 Sury Basin, Kingston. This prevents AI from associating the site with its brick-and-mortar location.
What to change: Add LocalBusiness schema markup to the homepage with the store address, phone, opening hours, and relevant business details.
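As a rough illustration of the markup this finding asks for, the sketch below builds a schema.org LocalBusiness object in Python and wraps it in a JSON-LD script tag. Only the business name and street address come from this report. The helper name, the country code, and the choice to omit phone and opening hours until real values are supplied are assumptions made for the example.

```python
import json


def local_business_jsonld(name, street, locality, country, url,
                          telephone=None, opening_hours=None):
    """Build a schema.org LocalBusiness object, omitting unknown fields."""
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": locality,
            "addressCountry": country,
        },
    }
    # Phone and hours are not stated in the report, so they are optional here.
    if telephone:
        data["telephone"] = telephone
    if opening_hours:
        data["openingHours"] = opening_hours
    return data


markup = local_business_jsonld(
    name="Cycle Exchange",
    street="27 Sury Basin",
    locality="Kingston upon Thames",
    country="GB",  # assumption: ISO country code for the UK store
    url="https://www.cycleexchange.co.uk/",
)

# The tag a theme template would emit in the page <head>.
script_tag = (
    '<script type="application/ld+json">'
    + json.dumps(markup, indent=2)
    + "</script>"
)
```

Passing the real phone number and opening hours as `telephone` and `opening_hours` adds those properties. The markup only helps once the page carrying it is reachable by crawlers.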
No Product schema on homepage despite inventory focus (Severity: Medium)
The homepage does not include Product schema, even though the site's core offering is pre-owned bikes. Product pages are uncrawlable, so any product schema they may contain is invisible to AI.
What to change: Add Product schema to the homepage highlighting featured or representative inventory, and ensure product pages are crawlable to surface their schema.
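The shape of a Product entry for a featured bike might look like the sketch below. Every value passed in the example call (product name, brand choice, price, and URL path) is a hypothetical placeholder rather than real inventory data, and the helper name is an assumption. On a Shopify store the equivalent markup would normally be generated from the theme's product data.

```python
import json


def product_jsonld(name, brand, price, currency, url,
                   condition="UsedCondition"):
    """Build a schema.org Product with a single Offer."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            "url": url,
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            # schema.org defines UsedCondition and RefurbishedCondition, among others.
            "itemCondition": f"https://schema.org/{condition}",
            "availability": "https://schema.org/InStock",
        },
    }


# Hypothetical placeholder listing; not a real product, price, or URL.
example = product_jsonld(
    name="Example pre-owned road bike",
    brand="Trek",
    price=1500,
    currency="GBP",
    url="https://www.cycleexchange.co.uk/products/example-bike",
    condition="RefurbishedCondition",
)
example_json = json.dumps(example, indent=2)
```

Whether `UsedCondition` or `RefurbishedCondition` is the better fit depends on how each bike is actually prepared for sale.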
Missing FAQPage schema despite FAQ content (Severity: Low)
The homepage contains FAQ content but does not use FAQPage schema markup, reducing the chance of appearing in AI-generated answers or rich snippets.
What to change: Add FAQPage schema to the FAQ section on the homepage.
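A minimal sketch of FAQPage markup, assuming the homepage FAQ can be expressed as question and answer pairs. The helper name is an assumption, and the question and answer text below are placeholders rather than the site's actual FAQ wording, which should be copied verbatim from the live page.

```python
import json


def faq_page_jsonld(pairs):
    """Build a schema.org FAQPage from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Placeholder pair; replace with the exact text shown on the homepage.
faq = faq_page_jsonld([
    ("Do bikes come with a warranty?",
     "Placeholder answer: copy the wording from the live FAQ."),
])
faq_json = json.dumps(faq, indent=2)
```

The answer text in the markup should match what visitors see on the page.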
Sitemap.xml returns 403 to all crawlers (Severity: High)
The sitemap.xml is blocked by Cloudflare, returning a 403 challenge page to both browsers and AI bots. This prevents search engines and AI crawlers from discovering the site's URL structure.
What to change: Ensure sitemap.xml is publicly accessible without Cloudflare challenge. It should be served to all user agents.
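A 200 status alone does not prove the sitemap is being served, since a challenge page can also be returned as HTML. The sketch below checks that a response body really parses as a sitemap and lists the URLs it contains. The function name is an assumption, and the sample bodies are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locations(body):
    """Return the <loc> URLs in a sitemap, or [] if body is not sitemap XML."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return []
    # Accept both a plain URL set and a sitemap index.
    if root.tag not in (SITEMAP_NS + "urlset", SITEMAP_NS + "sitemapindex"):
        return []
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Illustrative bodies: one valid sitemap, one challenge-style HTML page.
valid_body = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.cycleexchange.co.uk/</loc></url>"
    "</urlset>"
)
challenge_body = "<html><head><title>Just a moment...</title></head></html>"
```

An empty list from the live sitemap.xml would suggest the challenge is still in place for that request.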
Zero indexed pages in web search results (Severity: High)
Multiple web searches for the domain and brand returned zero results, indicating that Google and other search engines have likely deindexed or deprioritized the site due to the Cloudflare wall.
What to change: Remove the Cloudflare challenge for search engine bots (Googlebot, Bingbot) to allow reindexation. Submit a new sitemap to Google Search Console.
Trustpilot review page blocked by Cloudflare (Severity: Medium)
The Trustpilot review page for cycleexchange.co.uk returns a 403 Cloudflare challenge, preventing AI crawlers from accessing third-party reviews that could validate the business.
What to change: This is a Trustpilot-side issue; contact Trustpilot to ensure the review page is accessible to AI crawlers, or encourage customers to leave reviews on platforms that allow crawling.
Robots.txt lacks AI-specific directives (Severity: Low)
The robots.txt uses a generic wildcard rule with Allow: / and does not include specific directives for AI crawlers like GPTBot, ClaudeBot, or PerplexityBot. While the Cloudflare block is the primary issue, the robots.txt does not explicitly guide AI agents.
What to change: Add explicit Allow or Disallow rules for known AI crawler user agents in robots.txt to complement the Cloudflare configuration.
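A draft with explicit per-agent groups can be checked offline before it is deployed. The sketch below generates one possible robots.txt and evaluates it with Python's standard `urllib.robotparser`. The agent list and the decision to allow everything are assumptions for illustration, not the site's current file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of the crawlers named in this report.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def draft_robots_txt(agents):
    """Build a robots.txt with one explicit Allow group per agent."""
    groups = [f"User-agent: {agent}\nAllow: /\n" for agent in agents]
    groups.append("User-agent: *\nAllow: /\n")
    return "\n".join(groups)


def access_report(robots_txt, agents, url):
    """Map each agent to whether the given robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


DRAFT = draft_robots_txt(AI_AGENTS)
REPORT = access_report(
    DRAFT, AI_AGENTS, "https://www.cycleexchange.co.uk/llms.txt"
)
```

This only validates what robots.txt declares. The file is advisory, so these rules have no effect until the Cloudflare challenge stops intercepting the same crawlers.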
UCP endpoint (.well-known/ucp) blocked (Severity: Medium)
The Universal Commerce Protocol endpoint at /.well-known/ucp returns a 403 Cloudflare challenge, preventing AI agents from discovering agent-commerce capabilities.
What to change: Ensure the .well-known/ucp endpoint is accessible to AI crawlers without Cloudflare challenge.
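llms.txt not referenced in robots.txt (Severity: Low)
The robots.txt does not mention llms.txt. There is no standardized robots.txt directive for llms.txt discovery: the Sitemap directive is defined for sitemap files, not for llms.txt, and agents that support llms.txt generally look for it at the site root. A reference in robots.txt is therefore an optional hint rather than an established practice.
What to change: Keep llms.txt at the site root and, optionally, add a comment line such as '# llms.txt: https://www.cycleexchange.co.uk/llms.txt' to robots.txt as a pointer for agents and humans reading the file. Avoid listing llms.txt under a Sitemap directive, since it is not a sitemap.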
What's working
- Comprehensive llms.txt with agent-commerce instructions — The site hosts a 4334-byte llms.txt file that provides detailed instructions for AI agents, including sections for summary, preferred links, and agent-commerce capabilities. This is a sophisticated implementation that, if accessible, would greatly enhance AI visibility.
- Agents.md file for AI agent documentation — The site includes an agents.md file intended to document agent-specific endpoints and capabilities, showing proactive effort to support AI agent integration.
- Homepage includes Organization, WebSite, and BreadcrumbList schema — The homepage uses structured data markup for Organization, WebSite, and BreadcrumbList, providing basic entity information to search engines and AI.
- Robots.txt allows all crawlers with Allow: / — The robots.txt file uses a permissive wildcard rule (Allow: /) for all user agents, indicating an intention to allow crawling. The Cloudflare block overrides this, but the robots.txt itself is not restrictive.
- Site has long-standing web presence since at least 2018 — Wayback Machine snapshots show the site has been active since at least 2018 with consistent content, indicating a stable and established online presence.
- Homepage contains rich content about services and community — The homepage includes detailed information about the business, including warranty, Cytech-accredited mechanics, community rides, and partnerships, which would be valuable for AI if crawlable.
- UCP endpoint (.well-known/ucp) is present — The site hosts a Universal Commerce Protocol endpoint at /.well-known/ucp, demonstrating awareness of agent-commerce standards, even though it is currently blocked.
- Homepage links to Trustpilot for reviews — The homepage includes a link to the Trustpilot review page, indicating an effort to build external trust signals, though the Trustpilot page itself is blocked.