gnawchocolate.co.uk — AI Site Grade

Gnawchocolate.co.uk blocks every AI crawler with a Cloudflare JS challenge, so bots can read none of its content, and no external signals exist to correct what LLMs get wrong about the brand.

The site is entirely invisible to AI crawlers due to Cloudflare JS challenge walls, has no structured data, and lacks any external citations, causing LLMs to rely on outdated or incorrect brand information.

Findings: 12
Evidence checks: 35
Completed: 30 May 2026

Analysis

Cloudflare JS Challenge Blocks Every AI Crawler from All Content

Every AI crawler tested — GPTBot, ClaudeBot, PerplexityBot, Google-Extended, ChatGPT-User, OAI-SearchBot, Applebot-Extended, Bytespider, and anthropic-ai — receives a 403 with a Cloudflare JS challenge wall on the homepage, robots.txt, llms.txt, and sitemap.xml. No bot gets a 200 with real content. The site runs on Shopify (hosted at 23.227.38.32 via Cloudflare) and the live server returns only a "Verifying your connection..." shell with ~10 words of visible text. The robots.txt and llms.txt endpoints are equally blocked, meaning there is no machine-readable guidance for crawlers at all.
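
The check described above can be approximated with a short script. The sketch below is illustrative only and is not the tool used for this report. It sends one request per user agent and path and classifies each response. It assumes the bare crawler tokens are sent as the User-Agent header (real crawlers send longer strings) and that the challenge page can be recognised by the "Verifying your connection" text quoted above. The names classify, probe, and run_probe are hypothetical.

```python
import urllib.error
import urllib.request

# Crawler tokens and paths taken from the report text.
AI_USER_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "ChatGPT-User",
    "OAI-SearchBot", "Applebot-Extended", "Bytespider", "anthropic-ai",
]
PATHS = ["/", "/robots.txt", "/llms.txt", "/sitemap.xml"]

# Assumption: the challenge shell always contains this phrase.
CHALLENGE_MARKER = "Verifying your connection"


def classify(status, body):
    """Label a response as 'challenge', 'ok', 'blocked', or 'error'."""
    if CHALLENGE_MARKER in body:
        return "challenge"
    if status == 200:
        return "ok"
    if status in (401, 403):
        return "blocked"
    return "error"


def probe(base_url, user_agent, path, timeout=10.0):
    """Fetch one URL with the given User-Agent and classify the result."""
    request = urllib.request.Request(
        base_url + path, headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
            body = response.read(4096).decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions but still carry a body.
        status = err.code
        body = err.read(4096).decode("utf-8", "replace")
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and similar.
        return "error"
    return classify(status, body)


def run_probe(base_url):
    """Probe every (user agent, path) pair; performs real network requests."""
    return {
        (agent, path): probe(base_url, agent, path)
        for agent in AI_USER_AGENTS
        for path in PATHS
    }
```

Calling run_probe("https://gnawchocolate.co.uk") would return one label per user agent and path. A spoofed user agent sent from an ordinary IP address may be treated differently from the genuine crawler, so the labels are indicative rather than conclusive.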

Cold-Knowledge Gap

The LLM describes GNAW as a bean-to-bar, single-origin chocolate maker founded by a couple and using ethically sourced cacao from Madagascar, Peru, and Tanzania. The actual site tells a different story. GNAW is a flavour-forward chocolate brand based in Norwich, UK, founded in 2010 by Matt and Teri Legon. The cocoa comes from a single family-owned Colombian supplier, not multiple origins. The product range centres on novelty flavours (New York Cheesecake, Banoffee Pie, Espresso Martini, Sticky Toffee Pudding), hot chocolate spoons, and gift sets rather than single-origin tasting bars. The brand positions itself on sustainability (70% solar-powered factory, 99% compostable packaging, no palm oil) and "Farmer to Bar" traceability, not the bean-to-bar craft narrative the model assumes.

Schema and Structured Data Absence

The homepage, about page, sustainability page, blog, and product collection page all contain zero JSON-LD schema of any type. No Organization, Product, BreadcrumbList, FAQPage, or Review schema exists. The site has a 4.87-star average from 925 verified reviews prominently displayed, yet no AggregateRating or Review schema surfaces this to search engines or AI crawlers. The blog has 25 pages of content, but none of it is marked up for AI consumption.
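
As a rough illustration of the missing markup, the sketch below builds an Organization object with a nested AggregateRating and wraps it in a JSON-LD script tag. The rating (4.87), review count (925), founding year, and founder names come from this report. The five-star scale (bestRating), the homepage URL form, and the function names organization_jsonld and to_script_tag are assumptions made for the example. Whether a given search engine or AI system uses the markup depends on its own eligibility rules.

```python
import json


def organization_jsonld(name, url, rating_value, review_count):
    """Build a schema.org Organization dict with a nested AggregateRating."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "foundingDate": "2010",  # from the report
        "founder": [
            {"@type": "Person", "name": "Matt Legon"},
            {"@type": "Person", "name": "Teri Legon"},
        ],
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
            "bestRating": 5,  # assumption: ratings are on a five-star scale
        },
    }


def to_script_tag(data):
    """Serialise the dict as a JSON-LD script tag for a page's HTML."""
    payload = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


markup = organization_jsonld("GNAW", "https://gnawchocolate.co.uk", 4.87, 925)
print(to_script_tag(markup))
```

Generating the markup from one function keeps the rating figures in a single place, so they can be refreshed whenever the review total changes.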

External Signal Void

Web search returns zero indexed results for "gnaw chocolate" across multiple query variations: no press mentions, no Reddit threads, no review sites, no Amazon listings, no awards coverage. A Wayback snapshot shows the site linking to its own Amazon storefront, but that storefront is not discoverable via search. The brand has Facebook, Twitter, Instagram, and LinkedIn profiles, but none appear in search results. With no external citations at all, AI models have nothing to corroborate or enrich what their training data says about GNAW, and that training data is already misaligned with the actual brand positioning.

Findings

  1. Cloudflare JS challenge blocks every AI crawler from all content (Severity: High)

    All tested AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, ChatGPT-User, OAI-SearchBot, Applebot-Extended, Bytespider, anthropic-ai) receive a 403 with a Cloudflare JS challenge wall on the homepage, robots.txt, llms.txt, and sitemap.xml. No bot gets a 200 with real content.

    What to change: Configure Cloudflare to allow AI crawler user agents (e.g., GPTBot, ClaudeBot) through the JS challenge, or serve a static HTML version of the site to these bots.

  2. Robots.txt and llms.txt are blocked by Cloudflare (Severity: High)

    The robots.txt and llms.txt endpoints return 403 with a Cloudflare JS challenge, providing no machine-readable guidance for crawlers.

    What to change: Ensure robots.txt and llms.txt are publicly accessible without JS challenges, and include directives for AI crawlers.

  3. Sitemap.xml is blocked by Cloudflare (Severity: High)

    The sitemap.xml endpoint returns 403, preventing crawlers from discovering the site's URL structure.

    What to change: Make sitemap.xml publicly accessible and submit it to search engines.

  4. Zero JSON-LD schema on any page (Severity: High)

    The homepage, about page, sustainability page, blog, and product collection page contain no JSON-LD schema of any type (Organization, Product, BreadcrumbList, FAQPage, Review, AggregateRating).

    What to change: Add JSON-LD schema for Organization, Product, BreadcrumbList, and AggregateRating (with 4.87 stars from 925 reviews) to all relevant pages.

  5. Aggregate rating of 4.87 stars from 925 reviews not marked up (Severity: High)

    The site prominently displays a 4.87-star average from 925 verified reviews, but no AggregateRating or Review schema is present to surface this to search engines or AI crawlers.

    What to change: Add AggregateRating and Review schema to the homepage and product pages.

  6. LLM knowledge misaligned with actual brand positioning (Severity: High)

    The LLM describes GNAW as a bean-to-bar, single-origin chocolate maker with cacao from multiple origins, but the actual site presents a flavour-forward brand that uses a single Colombian supplier and was founded in 2010 by Matt and Teri Legon. The misalignment is attributed to the lack of content accessible to AI crawlers.

    What to change: Allow AI crawlers to access the site and add structured data to correct the LLM knowledge gap.

  7. Zero external citations in search results (Severity: High)

    Web search returns no indexed results for the brand across multiple query variations, including press mentions, reviews, social media, or awards. This absence of external signals means AI models have no corroborating data to verify or enrich brand knowledge.

    What to change: Build external citations through PR, reviews, social media engagement, and backlinks to improve discoverability.

  8. No pages indexed in Google (Severity: High)

    A site:gnawchocolate.co.uk query returns zero results, indicating that Google has indexed no pages from the site.

    What to change: Ensure the site is crawlable by Googlebot and submit a sitemap via Google Search Console.

  9. Homepage returns only a JS challenge shell (Severity: High)

    The live homepage returns a 403 with a 'Verifying your connection...' message and approximately 10 words of visible text, providing no content to crawlers.

    What to change: Configure Cloudflare to serve a static HTML version of the site to bots that cannot execute JavaScript.

  10. No llms.txt file available (Severity: Medium)

    The llms.txt endpoint returns a 403, so AI crawlers cannot retrieve any machine-readable file telling them which content to use.

    What to change: Create an llms.txt file with a summary of the site and links to key pages.

  11. Blog content not marked up for AI consumption (Severity: Medium)

    The blog has 25 pages of content but no structured data or semantic markup to help AI crawlers understand the content.

    What to change: Add Article or BlogPosting schema to blog posts.

  12. Product pages lack Product schema (Severity: Medium)

    The product collection page has no Product schema, and individual product pages likely lack it too, so the site misses opportunities for rich results.

    What to change: Add Product schema with name, description, price, and availability to all product pages.
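
Findings 2 and 3 ask for a reachable robots.txt that addresses AI crawlers and points to the sitemap. Once such a file is served, its rules can be sanity-checked offline with Python's standard urllib.robotparser module. The sketch below is a minimal example under stated assumptions: the draft rules (including the /cart and /checkout exclusions), the sitemap URL, and the helper names parse_robots and blocked_agents are hypothetical and do not reflect the site's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft; the real file's rules are unknown because it is blocked.
DRAFT_ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /cart
Disallow: /checkout

Sitemap: https://gnawchocolate.co.uk/sitemap.xml
"""


def parse_robots(text):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def blocked_agents(parser, agents, url):
    """Return the user agents that the rules forbid from fetching url."""
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


parser = parse_robots(DRAFT_ROBOTS_TXT)
homepage = "https://gnawchocolate.co.uk/"
print(blocked_agents(parser, ["GPTBot", "ClaudeBot", "PerplexityBot"], homepage))
print(parser.site_maps())
```

An empty list from blocked_agents means the draft rules let every listed crawler fetch the homepage. This checks only the robots.txt rules themselves. It does not show whether Cloudflare still challenges the request.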

What's working

  • Wayback Machine snapshots preserve historical content — Multiple Wayback Machine snapshots of the site are available, showing that content existed in the past and can be referenced.
  • Wayback snapshots reveal rich product and brand content — Snapshots show detailed product descriptions, sustainability information, and brand story, indicating the site has valuable content that could be made accessible.
  • High customer rating of 4.87 stars from 925 reviews — The site displays a strong average rating, which is a positive social proof signal if marked up with schema.
  • Sustainability commitment documented on site — The site details its 70% solar-powered factory, 99% compostable packaging, and no palm oil policy, which are positive brand attributes.
  • Blog with 25 pages of content — The blog contains substantial content that could be leveraged for AI visibility if made crawlable and marked up.
  • Brand has social media profiles on multiple platforms — The brand maintains Facebook, Twitter, Instagram, and LinkedIn profiles, which could be used to build external signals.
  • Amazon storefront exists — The site has an Amazon storefront link, indicating an additional sales channel that could be leveraged for reviews and backlinks.
