AI Site Grade
willysacv.com — AI Site Grade
Willysacv.com returns HTTP 403 with a Cloudflare JS challenge to all 11 tested user agents (10 AI crawler tokens and a standard browser), blocking every bot from accessing any content.
The site is completely invisible to AI crawlers due to a Cloudflare challenge, has zero external signals, and lacks all structured data.
- Findings: 10
- Evidence checks: 37
- Completed: 30 May 2026
Analysis
Cloudflare Challenge Blocks Every AI Crawler from a Live Shopify Store
The live site at willysacv.com returns HTTP 403 with a Cloudflare JS challenge to every tested client: 10 AI crawler tokens (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, ChatGPT-User, OAI-SearchBot, Applebot-Extended, Bytespider, anthropic-ai, Perplexity-User) and a standard browser, 11 user agents in total. No bot receives a single byte of real content. The robots.txt and llms.txt endpoints return the same Cloudflare challenge page, making them unreachable. DNS records point to Shopify (23.227.38.65), and the Wayback Machine reveals the actual site: a Shopify-powered UK brand selling raw, organic apple cider vinegar with live Mother culture, founded by Will Chase on a Herefordshire farm with 300-year-old orchards.
Cold-Knowledge Gap
Queried cold with "Willys ACV", the LLM identifies the name as a historical military vehicle manufacturer: Willys ACV (Air-Cooled Vehicle), described as the WWII Jeep prototype. It shows no awareness of the modern apple cider vinegar business. A second query with the full brand context ("Willy's ACV apple cider vinegar UK") returns accurate product details (founder Will Chase, raw ACV with Mother, Herefordshire farm), but this knowledge is not grounded in the live site; it likely comes from archived or third-party mentions. The gap between what the LLM returns cold (a defunct auto division) and what the site actually sells (a living probiotic vinegar) is total.
External Signal Void
No search results surface for willysacv.com, "Willy's ACV", "Willys ACV", or "Will Chase" apple cider vinegar across general web search, Reddit, or social media. The site references Instagram and Facebook handles (@willysacv) and a testimonial from Jessie Inchauspé (Glucose Goddess), but none of these external signals appear in search engine indexes. The brand exists in a near-complete off-domain vacuum — no reviews, no press coverage, no forum discussions are discoverable.
Schema and Content Posture
The homepage and FAQ page contain zero JSON-LD schema of any type — no Product, Organization, FAQPage, BreadcrumbList, or WebSite markup. The FAQ page uses a heading-based accordion structure with 20+ questions and answers, which is a strong candidate for FAQPage schema but remains unmarked. The site has rich answer-signal content (tables, lists, usage instructions, testimonials, a "60 day challenge" call-to-action), but none of it is structured for AI consumption. The llms.txt file 404s behind the Cloudflare wall.
Findings
Cloudflare JS challenge blocks all 11 tested user agents High
The live site returns HTTP 403 with a Cloudflare JS challenge to every tested client: 10 AI crawler tokens (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, ChatGPT-User, OAI-SearchBot, Applebot-Extended, Bytespider, anthropic-ai, Perplexity-User) and a standard browser. No bot receives any real content.
What to change: Configure Cloudflare to allow AI crawlers by removing the JS challenge for known bot user agents or using a firewall rule to bypass the challenge for verified crawlers.
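One way to confirm the change is to re-request the homepage with each crawler token as the User-Agent and check whether the response is still a challenge. The following is a minimal sketch, not the method used for this report. The `classify` heuristic (a 403 or 503 status combined with a `cf-mitigated: challenge` header or the "Just a moment" challenge title), the `https://` scheme, and the use of bare tokens instead of full crawler user-agent strings are all assumptions. Only a subset of the tested tokens is listed: Google-Extended and Applebot-Extended are documented by their vendors as robots.txt control tokens rather than request user agents, so the sketch leaves them out.

```python
import urllib.error
import urllib.request

# Crawler tokens from the report. Real crawlers send longer user-agent
# strings; sending the bare token is a simplification (assumption).
CRAWLER_TOKENS = [
    "GPTBot", "ClaudeBot", "PerplexityBot",
    "ChatGPT-User", "OAI-SearchBot", "Bytespider",
]


def classify(status, headers, body):
    """Label a response 'ok', 'challenge', or 'blocked' (heuristic only)."""
    lowered = {key.lower(): str(value).lower() for key, value in headers.items()}
    if 200 <= status < 300:
        return "ok"
    looks_challenged = (
        lowered.get("cf-mitigated") == "challenge"
        or "just a moment" in body.lower()
    )
    if status in (403, 503) and looks_challenged:
        return "challenge"
    return "blocked"


def fetch(url, user_agent, timeout=10):
    """Send one GET request and return (status, headers, first 4 KB of body)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read(4096).decode("utf-8", "replace")
            return response.status, dict(response.headers), body
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions but still carry headers and body.
        body = error.read(4096).decode("utf-8", "replace")
        return error.code, dict(error.headers), body


def probe(url, tokens=CRAWLER_TOKENS, fetcher=fetch):
    """Map each user-agent token to its classification for the given URL."""
    return {token: classify(*fetcher(url, token)) for token in tokens}
```

Calling `probe("https://willysacv.com/")` returns a dictionary keyed by token. A spoofed token is not the same as a verified crawler, so a result of "ok" here suggests the user-agent rule works but does not prove that a real crawler coming from its own IP ranges is admitted.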
Robots.txt and llms.txt endpoints return Cloudflare challenge High
The robots.txt and llms.txt endpoints also return the same Cloudflare challenge page, making them unreachable for crawlers and preventing any directives from being read.
What to change: Ensure robots.txt and llms.txt are served without a JS challenge so crawlers can read them.
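Once robots.txt is reachable, its directives can be checked offline before deployment. The sketch below uses Python's standard `urllib.robotparser` to parse a draft file and ask whether a given crawler may fetch a URL. The draft rules are hypothetical: the `/checkout` path and the choice of which crawlers to name are placeholders, not taken from the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft robots.txt. The /checkout rule is a placeholder.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /checkout
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the draft robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, `allowed(ROBOTS_TXT, "GPTBot", "https://willysacv.com/")` evaluates the named-crawler group, while an unlisted bot falls through to the `*` group. This only validates the directives; it says nothing about whether Cloudflare actually serves the file.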
LLM cold knowledge identifies brand as historical military vehicle manufacturer High
When queried with 'Willys ACV', the LLM returns information about Willys-Overland Motors' air-cooled vehicle division (WWII Jeep prototype), not the modern apple cider vinegar brand. This total mismatch means AI assistants cannot correctly answer basic questions about the site's products.
What to change: Build external signals (press, reviews, social media) and structured data to help LLMs associate the brand name with the correct product category.
Zero external signals found across web search, Reddit, and social media High
No search results surface for the domain, brand name, or founder across general web search, Reddit, Instagram, or Facebook. The brand has no discoverable reviews, press coverage, or forum discussions.
What to change: Actively build off-domain presence through PR, influencer partnerships, customer reviews, and social media engagement to create discoverable external signals.
No JSON-LD schema of any type on homepage or FAQ page High
The homepage and FAQ page contain zero JSON-LD schema markup — no Product, Organization, FAQPage, BreadcrumbList, or WebSite schema. The FAQ page has 20+ questions in a heading-based accordion structure that is ideal for FAQPage schema but remains unmarked.
What to change: Add JSON-LD schema for Product, Organization, FAQPage, and WebSite to all relevant pages.
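As an illustration of the markup involved, the sketch below builds FAQPage and Organization JSON-LD with Python's standard `json` module and wraps it in a script tag. The function names are hypothetical, the question and answer text must come from the real FAQ page (none is reproduced here), and the `https://willysacv.com` URL scheme is an assumption. On a Shopify store this markup would more likely be emitted from the theme templates; the Python version only shows the target structure.

```python
import json


def faq_page(pairs):
    """Build schema.org FAQPage data from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def organization(name, url, founder):
    """Build schema.org Organization data with a named founder."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "founder": {"@type": "Person", "name": founder},
    }


def script_tag(data):
    """Serialize data as a JSON-LD script element for the page head."""
    payload = json.dumps(data, ensure_ascii=False)
    # Escape "</" so answer text cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"
```

A call such as `script_tag(organization("Willy's ACV", "https://willysacv.com", "Will Chase"))` yields one tag for the homepage, and `script_tag(faq_page(pairs))` yields one for the FAQ page once `pairs` holds the page's actual questions and answers.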
llms.txt file returns 404 behind Cloudflare wall Medium
The llms.txt endpoint returns a 404 error (behind the Cloudflare challenge), so AI assistants cannot discover a curated summary of the site's content.
What to change: Create and serve a valid llms.txt file with a summary of the site's content and key pages.
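A small generator can keep llms.txt in sync with the site's key pages. The sketch below assumes the Markdown layout described in the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of links). The `render_llms_txt` function is hypothetical, and any page URLs passed to it must be the site's real paths, which this report does not list.

```python
def render_llms_txt(title, summary, sections):
    """Render llms.txt text.

    sections maps a heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- [{name}]({url}): {note}" for name, url, note in links)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"
```

The title and summary can draw on content the site already has, for example the brand name and the description of raw, organic apple cider vinegar with live Mother from a Herefordshire farm. The output then needs to be served at `/llms.txt` without the Cloudflare challenge.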
Domain not indexed in any search engine High
Searches for site:willysacv.com and for brand-related queries return no results, indicating the site is not indexed by search engines, likely because of the Cloudflare challenge.
What to change: Remove the Cloudflare challenge for search engine bots to allow indexing.
Founder Will Chase not discoverable in search results Medium
No search results surface for 'Will Chase' in connection with apple cider vinegar, despite the site mentioning the founder's story.
What to change: Encourage the founder to build a personal brand presence (LinkedIn, interviews, guest posts) to create discoverable signals.
Instagram and Facebook handles not indexed in search Medium
Searches for the brand's Instagram and Facebook handles return no results, suggesting the social media profiles are either not public or not optimized for search.
What to change: Ensure social media profiles are public, use consistent branding, and include relevant keywords in bios and posts.
Glucose Goddess testimonial not discoverable externally Medium
The site references a testimonial from Jessie Inchauspé (Glucose Goddess), but no external search results confirm this association, reducing credibility for AI assistants.
What to change: Encourage the influencer to share the testimonial on their own channels or create a press release to make the association discoverable.
What's working
- FAQ page contains 20+ questions with detailed answers — The FAQ page has a heading-based accordion structure with 20+ questions and answers covering product usage, ingredients, and benefits, which is strong content for AI consumption once schema is added.
- Brand story page provides detailed origin narrative — The 'Willy's Story' page describes the founder's background, the Herefordshire farm, and 300-year-old orchards, offering rich narrative content for AI to reference.
- Homepage includes product details and call-to-action — The homepage describes the organic apple cider vinegar with live mother, includes a '60 day challenge' CTA, and lists usage instructions, providing key product information.
- LLM knows brand details when given full context — When queried with 'Willy's ACV apple cider vinegar UK', the LLM returns accurate product details (founder Will Chase, raw ACV with Mother, Herefordshire farm), indicating some prior knowledge exists in training data.
- Wayback Machine has recent snapshot of the site — A Wayback Machine snapshot from 2026-03-05 captures the full homepage, FAQ, and story pages, providing an accessible archive of the site's content.