AI Site Grade

lucyandyak.com — AI Site Grade

Lucy & Yak's Shopify storefront sits behind a Cloudflare JS challenge that returns a 403 to every AI crawler tested, including GPTBot and ClaudeBot. No live content can be indexed or verified, even though the domain carries an Anthropic domain verification record.

Findings: 11
Evidence checks: 27
Completed: 30 May 2026

Analysis

Lucy & Yak: A Shopify store rendered invisible to every AI crawler

The live site at lucyandyak.com returns a 403 Cloudflare JS challenge to every user-agent tested: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, Applebot-Extended, Bytespider, and even a standard browser user-agent. That makes it one of the most aggressively locked-down ecommerce domains encountered. A client that does not execute the challenge script never gets past this page, so no AI crawler can read a single word of product copy, pricing, or brand story from the live origin.

Crawler Access

Every bot in the compare_bot_access test received a 403 with Cloudflare's "Verifying your connection" JS challenge page. The response body is identical across all agents: about 8.8 KB of HTML containing only the Cloudflare challenge script and multi-language "Your connection needs to be verified" text. Requests for robots.txt and llms.txt also return 403, so neither file can be read and the audit cannot tell whether an llms.txt exists at all; the sitemap is unreachable for the same reason. The DNS records confirm a Shopify storefront (A record 23.227.38.65, Shopify verification TXT records) behind Cloudflare (NS: angelina.ns.cloudflare.com, plato.ns.cloudflare.com). An anthropic-domain-verification TXT record is also present, suggesting the brand has verified ownership of the domain with Anthropic, yet ClaudeBot is still blocked at the edge. A DNS record of this kind proves domain ownership only and has no effect on Cloudflare's bot rules.
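The check described above can be sketched in Python as a small classifier over recorded responses. This is a minimal illustration, not the audit tool's actual logic: the `ProbeResult` type, the helper names, and the marker strings (taken from the challenge text quoted in this report) are assumptions, and the sample responses are hypothetical stand-ins for real HTTP fetches made with each user-agent.

```python
from dataclasses import dataclass

# Assumption: these marker strings come from the challenge text quoted in this
# report. Cloudflare can change its challenge markup at any time.
CHALLENGE_MARKERS = (
    "verifying your connection",
    "your connection needs to be verified",
)


@dataclass
class ProbeResult:
    """One recorded response: which user-agent asked, and what came back."""
    user_agent: str
    status: int
    body: str


def is_challenge(result: ProbeResult) -> bool:
    """Return True when a response looks like a JS challenge page, not site content."""
    if result.status != 403:
        return False
    text = result.body.lower()
    return any(marker in text for marker in CHALLENGE_MARKERS)


def summarize(results: list[ProbeResult]) -> dict[str, bool]:
    """Map each user-agent to whether it was challenged."""
    return {r.user_agent: is_challenge(r) for r in results}


def all_blocked(results: list[ProbeResult]) -> bool:
    """True only if at least one probe ran and every probe hit the challenge."""
    return bool(results) and all(summarize(results).values())


# Hypothetical recorded responses, shaped like the ones described above.
CHALLENGE_BODY = "<html><body>Verifying your connection</body></html>"
recorded = [
    ProbeResult(ua, 403, CHALLENGE_BODY)
    for ua in ("GPTBot", "ClaudeBot", "PerplexityBot")
]

print(summarize(recorded))
print(all_blocked(recorded))
```

Checking both the status code and the body text avoids counting an ordinary 403 (for example, a blocked path) as a challenge page.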

Cold-Knowledge Gap

The LLM's prior knowledge about Lucy & Yak is surprisingly detailed: founded in 2017 by Lucy Greenwood and Chris Yates, UK-based, signature dungarees, GOTS-certified organic cotton, ethical factories in India, a "Repair, Reuse, Recycle" program, and a 2023 sale-email controversy. This knowledge was acquired entirely from off-domain sources — the live site contributed nothing. The gap is not about missing information but about verification: an AI engine retrieving the live site to confirm or enrich this prior would find a blank wall. The brand's own site is the least authoritative source about itself.

Schema Posture

The archived homepage (January 2025) contains only two schema types: Organization and WebSite. The Organization block lists social profiles (Twitter, Facebook, Instagram, TikTok) but leaves several sameAs slots empty. There is no Product, BreadcrumbList, or FAQPage schema, and no ItemList on collection pages. The dungarees collection page (3,840 words of product data) carries only a generic Organization schema, so individual products lack structured data entirely. This pattern is consistent with a Shopify theme default with minimal customization.
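A schema audit of this kind can be approximated with the Python standard library. The sketch below collects JSON-LD blocks from a page, lists the `@type` values, and counts empty `sameAs` entries. The `audit_schema` helper and the sample HTML are hypothetical; the sample only mimics the pattern described above and is not the archived page. The sketch also assumes top-level objects or arrays and does not unpack `@graph` containers.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def audit_schema(html: str) -> dict:
    """Report schema types found, empty sameAs entries, and Product presence."""
    collector = JsonLdCollector()
    collector.feed(html)
    types, empty_same_as = [], 0
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than fail the whole audit
        for node in (data if isinstance(data, list) else [data]):
            if not isinstance(node, dict):
                continue
            node_type = node.get("@type")
            if node_type:
                types.extend(node_type if isinstance(node_type, list) else [node_type])
            same_as = node.get("sameAs", [])
            if isinstance(same_as, str):
                same_as = [same_as]
            # Count entries that are not non-empty strings.
            empty_same_as += sum(
                1 for url in same_as if not (isinstance(url, str) and url.strip())
            )
    return {"types": types, "empty_same_as": empty_same_as, "has_product": "Product" in types}


# Hypothetical page with placeholder URLs, shaped like the pattern described above.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example Brand",
 "sameAs": ["https://www.instagram.com/example", "", ""]}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "WebSite", "url": "https://example.com"}
</script>
</head><body></body></html>
"""

report = audit_schema(SAMPLE_HTML)
print(report)
```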

External Signals

The brand has a strong off-domain footprint that the cold LLM already captures: a festival-origin story, organic cotton positioning, and physical stores in nine UK locations (Brighton, Bristol, Norwich, Nottingham, Cambridge, Manchester, Cardiff, Exeter, Castleford). The blog contains posts on sustainability, collaborations, and events. However, the live site's Cloudflare wall means any external link, whether from press, social media, or influencer content, leads AI crawlers to a dead end, fragmenting the brand's citation trail.

Findings

  1. Cloudflare JS challenge blocks all AI crawlers (High)

    The live site returns a 403 Cloudflare JS challenge wall to every user-agent tested, including GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, Applebot-Extended, and Bytespider. No AI crawler can access any content from the live origin.

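    What to change: Configure Cloudflare to let named AI crawlers (e.g., GPTBot, ClaudeBot) bypass the JS challenge, ideally keyed to Cloudflare's verified-bot classification rather than the user-agent string alone, which is easy to spoof. Alternatively, serve a static HTML version to bots.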

  2. Robots.txt is unreachable (403) (High)

    The robots.txt file returns a 403 error, preventing crawlers from understanding which paths are allowed or disallowed.

    What to change: Ensure robots.txt is publicly accessible and includes directives for AI crawlers.

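  3. llms.txt is unreachable (403) (Medium)

    The request for llms.txt returns a 403, so AI crawlers cannot read a curated set of URLs for LLM consumption, and the audit cannot confirm whether the file exists behind the challenge.

    What to change: If no llms.txt exists, create one listing key pages (home, about, collections), and make sure it is publicly reachable by AI crawlers.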

  4. Sitemap is unreachable (Medium)

    The sitemap.xml is not accessible due to the Cloudflare wall, preventing crawlers from discovering all pages.

    What to change: Ensure sitemap.xml is publicly accessible and submitted to search engines.

  5. No Product schema on collection pages (High)

    The archived dungarees collection page (3,840 words of product data) contains only a generic Organization schema. Individual products lack structured data, reducing visibility in AI-generated answers.

    What to change: Add Product schema markup to each product on collection and product pages, including name, description, price, and availability.

  6. Empty sameAs entries in Organization schema (Low)

    The Organization schema on the homepage lists social profiles but has empty sameAs entries for several slots, which may confuse parsers.

    What to change: Populate all sameAs fields with valid URLs or remove empty entries.

  7. Missing BreadcrumbList and FAQPage schema (Medium)

    The archived pages do not use BreadcrumbList or FAQPage schema, which can help AI engines understand site structure and answer questions.

    What to change: Add BreadcrumbList schema to all pages and FAQPage schema to relevant content pages.

  8. Live site contributes no content to LLM knowledge (High)

    The LLM's prior knowledge about Lucy & Yak (founded 2017, organic cotton, etc.) comes entirely from off-domain sources. The live site's Cloudflare wall means AI engines cannot verify or enrich this information from the brand's own site.

    What to change: Allow AI crawlers through Cloudflare to enable content indexing and verification.

  9. External links lead to dead end for AI crawlers (High)

    Any external link from press, social media, or influencer content leads AI crawlers to a Cloudflare challenge wall, fragmenting the brand's citation trail and preventing AI engines from associating the brand with its own site.

    What to change: Allow AI crawlers to access the site so that external links can be followed and indexed.

  10. Anthropic domain verification present but ClaudeBot still blocked (Medium)

    The DNS includes an anthropic-domain-verification TXT record, indicating the brand has verified ownership of the domain with Anthropic, yet ClaudeBot receives a 403 JS challenge. A DNS verification record does not change edge bot rules, so ClaudeBot access has to be granted in Cloudflare itself.

    What to change: Configure Cloudflare to allow ClaudeBot through the JS challenge, or allowlist Anthropic's published crawler IP ranges, if available.

  11. No known URLs discovered via search or sitemap (Medium)

    Web searches for the domain and brand returned zero results, and the sitemap is unreachable, leaving no public URL inventory for AI crawlers.

    What to change: Ensure the site is indexed by search engines and that a sitemap is publicly accessible.
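Once robots.txt is reachable (Finding 2), its AI crawler directives can be checked offline before deployment. The sketch below uses Python's standard `urllib.robotparser` against a hypothetical robots.txt; the rules, the `/cart` and `/checkout` paths, and the example.com URLs are illustrative assumptions, not the site's actual file. Note that this parser applies the first matching rule in each group, so the Disallow lines come before the catch-all Allow.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths and crawler groups are assumptions for
# illustration, not the contents of the real file (which returns 403).
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /cart
Disallow: /checkout
Allow: /

User-agent: *
Disallow: /cart
Disallow: /checkout
"""


def crawler_access(robots_txt: str, agents: list[str], urls: list[str]) -> dict:
    """Return, per user-agent, whether each URL may be fetched under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        agent: {url: parser.can_fetch(agent, url) for url in urls}
        for agent in agents
    }


AGENTS = ["GPTBot", "ClaudeBot", "SomeOtherBot"]
URLS = [
    "https://example.com/collections/dungarees",
    "https://example.com/cart",
]

access = crawler_access(SAMPLE_ROBOTS_TXT, AGENTS, URLS)
for agent, result in access.items():
    print(agent, result)
```

This only validates the directives themselves. It says nothing about the edge: a crawler that is allowed by robots.txt will still be stopped if Cloudflare returns a 403 challenge first.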

What's working

  • Detailed LLM prior knowledge about the brand — The LLM has rich prior knowledge about Lucy & Yak, including founding story, product details, and ethical practices, sourced from off-domain references. This provides a foundation for AI visibility if the live site becomes accessible.
  • Archived content available via Wayback Machine — The Wayback Machine has snapshots of the homepage, about page, blog, and collection pages from January 2025, preserving some content for historical reference.
  • Organization schema present on homepage — The homepage includes an Organization schema with social profile links, providing basic structured data about the brand.
  • Domain verified with Anthropic — The DNS includes an anthropic-domain-verification TXT record, indicating the brand has verified ownership of the domain with Anthropic. This shows some engagement with an AI provider, though it does not by itself grant crawler access.
  • Shopify storefront with Cloudflare protection — The site uses Shopify as its ecommerce platform and Cloudflare for security, which are industry-standard tools that can be configured to allow AI crawlers while maintaining security.
