haleyauto.com — AI Site Grade
Haleyauto.com is effectively invisible to most AI systems: its Akamai WAF blocks every major AI crawler except anthropic-ai, several critical pages are missing, no structured data was detected, and the brand has no discoverable external footprint.
- Findings: 9
- Evidence checks: 58
- Completed: 30 May 2026
Analysis
Akamai Blocks Every AI Crawler Except Anthropic's — A Selective Backdoor
Haleyauto.com belongs to a multi-franchise dealership group (Buick, GMC, Chrysler, Chevrolet, Ford, Dodge, Jeep, Ram, Toyota, Volvo) in North Chesterfield, VA, but its Akamai WAF configuration creates a two-tier access system: every major AI crawler tested (GPTBot, Google-Extended, ClaudeBot, PerplexityBot, ChatGPT-User, OAI-SearchBot) receives a 403 from Akamai, while anthropic-ai gets a 200 served directly from nginx with full page content. This is not a robots.txt policy. It is a WAF-level access control that admits only one AI crawler and blocks all others, including Anthropic's own ClaudeBot and Claude-User.
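The two-tier behaviour described above can be re-checked by requesting the same URL under different User-Agent strings and comparing the status code and Server header of each response. The following is a minimal Python sketch under stated assumptions: the helper names (http_fetch, probe, blocked) are illustrative, the short user-agent tokens stand in for the longer strings real crawlers send, and a WAF may also key on IP ranges, so a spoofed request only approximates what a genuine crawler sees.

```python
import urllib.error
import urllib.request

# Short user-agent tokens taken from the report; real crawlers send longer
# strings, so treat these as an approximation (assumption).
USER_AGENTS = ["GPTBot", "Google-Extended", "ClaudeBot", "PerplexityBot",
               "ChatGPT-User", "OAI-SearchBot", "anthropic-ai"]


def http_fetch(url, user_agent, timeout=10):
    """Return (status_code, server_header) for one live request."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Server", "")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError but still carry status and headers.
        return err.code, err.headers.get("Server", "")


def probe(url, user_agents=USER_AGENTS, fetch=http_fetch):
    """Map each user-agent to the (status, server) pair it receives."""
    return {ua: fetch(url, ua) for ua in user_agents}


def blocked(results):
    """List the user-agents that received a 403, sorted by name."""
    return sorted(ua for ua, (status, _) in results.items() if status == 403)
```

Passing a different fetch callable lets the comparison logic be exercised without network access. With the default, probe("https://haleyauto.com/") would issue one live request per user-agent (the https scheme is an assumption).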
Crawler Access
The robots.txt file (3,219 bytes, served only to anthropic-ai) contains explicit Disallow rules for GPTBot, OAI-SearchBot, ChatGPT-User, Claude-User, Claude-SearchBot, and PerplexityBot, but only for static assets (/*.js, /*.css, /*.json) and API paths. The homepage, inventory pages, and contact page are not disallowed. The Akamai WAF at errors.edgesuite.net overrides this policy, however: every bot except anthropic-ai hits a 403 wall before reaching the nginx origin server. The anthropic-ai user-agent appears to bypass Akamai entirely (the Server header shows nginx, not AkamaiGHost) and receives full HTML pages. The llms.txt file exists (643 KB, one of the largest observed) and is also served only to anthropic-ai; other bots get a 403 on that URL too.
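The difference between what robots.txt permits and what the WAF enforces can be illustrated with Python's standard urllib.robotparser. This is a sketch under assumptions: the sample rules are not the site's real robots.txt, "/api/" is a hypothetical stand-in for the unnamed API paths, and is_allowed is an illustrative helper name. The standard-library parser matches rules by plain path prefix and does not interpret wildcard patterns such as /*.js, so the sample uses only a prefix rule.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: this is NOT the site's real robots.txt, and
# "/api/" is a hypothetical stand-in for the unnamed API paths.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /api/

User-agent: *
Disallow:
"""


def is_allowed(robots_text, user_agent, path):
    """Report whether the given robots.txt rules permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, path)


print(is_allowed(SAMPLE_ROBOTS, "GPTBot", "/"))               # homepage
print(is_allowed(SAMPLE_ROBOTS, "GPTBot", "/api/inventory"))  # API path
```

Under rules of this shape, a 403 on the homepage cannot come from robots.txt. It has to originate in a layer in front of the origin server, which is consistent with the WAF behaviour described above.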
Content & Schema
The homepage title reads "North Chesterfield's Haley Automotive Group", and the site runs on the DDC (Dealer.com) platform with heavy JavaScript dependencies. Three pages return 404 errors: about-us.htm, service/index.htm, and finance/index.htm. These are critical pages a dealership needs for AI knowledge extraction, and they simply do not exist. The contact.htm page works. The sitemap.xml (422 KB, thousands of URLs) lists last-modified dates of 2026-05-30, a date nearly a year in the future, suggesting a platform-level date injection bug. No JSON-LD schema was detected on any fetched page, so the site relies entirely on Open Graph meta tags and HTML meta descriptions for semantic signals.
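A check for JSON-LD of the kind described above can be sketched with Python's standard html.parser. The class and function names (JsonLdFinder, find_jsonld) are illustrative assumptions, and the sketch inspects only the raw HTML as fetched: on a JavaScript-heavy site, markup injected after page load would not be seen.

```python
import json
from html.parser import HTMLParser


class JsonLdFinder(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped rather than raising


def find_jsonld(html):
    """Return a list of JSON-LD objects found in an HTML string."""
    finder = JsonLdFinder()
    finder.feed(html)
    finder.close()
    return finder.blocks
```

An empty list from find_jsonld for a page that carries only Open Graph meta tags would match the result reported here.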
Cold-Knowledge Gap
The LLM queried about "Haley Automotive Group North Chesterfield Virginia" described a Ford, Lincoln, and Mazda dealership, but the actual site sells Buick, GMC, Chrysler, Chevrolet, Ford, Dodge, Jeep, Ram, Toyota, and Volvo, with no mention of Lincoln or Mazda. The model also placed the group in "central Virginia" and described it as "family-operated for decades", claims that cannot be verified from the site because the about page is a 404. The model recalled "mixed" reviews on Google and DealerRater, but external searches for the brand returned nothing at all: no reviews, no press, no Reddit threads, no dealer listings. The brand has effectively zero discoverable external footprint.
External Signals
Searches for "Haley Automotive Group," "Haley Auto," and "haleyauto.com" across multiple queries returned zero indexed results from any search engine or review platform. The domain's DNS resolves (A record at 64.70.56.99, nameservers at GoDaddy, MX at Proofpoint), but the site has no visible presence in search results and no backlinks, reviews, or social media mentions detectable via web search. The Wayback Machine shows a snapshot from December 2022 but no recent captures, so the site may have been offline or behind the Akamai wall for an extended period.
Findings
Akamai WAF blocks all major AI crawlers except anthropic-ai High
The Akamai WAF at errors.edgesuite.net returns 403 for GPTBot, Google-Extended, ClaudeBot, PerplexityBot, ChatGPT-User, and OAI-SearchBot, while anthropic-ai bypasses Akamai and receives full HTML from nginx. This selective access prevents most AI systems from indexing the site.
What to change: Reconfigure the Akamai WAF to allow all legitimate AI crawlers (GPTBot, Google-Extended, ClaudeBot, etc.) to access the site, or remove the WAF-level bot blocking entirely.
robots.txt is only accessible to anthropic-ai High
The robots.txt file returns 403 for the other bots tested (ClaudeBot, Google-Extended, GPTBot), so only anthropic-ai can read the crawl directives. The Disallow rules written for the blocked bots never reach them.
What to change: Ensure robots.txt is publicly accessible to all user-agents by removing the WAF block on that URL.
llms.txt is blocked for all bots except anthropic-ai High
The llms.txt file (643 KB) is served only to anthropic-ai; other bots receive a 403. This prevents AI systems from using the file for knowledge extraction.
What to change: Make llms.txt publicly accessible to all user-agents by removing the WAF block.
About, service, and finance pages return 404 errors High
The about-us.htm, service/index.htm, and finance/index.htm pages all return 404 errors. These are essential pages for AI knowledge extraction and user trust.
What to change: Restore or create the about, service, and finance pages with accurate, crawlable content.
No JSON-LD structured data detected on any page High
The site relies solely on Open Graph meta tags and HTML meta descriptions; no JSON-LD schema was found. This limits AI understanding of dealership details, inventory, and location.
What to change: Add JSON-LD structured data using schema.org types such as AutoDealer (a subtype of LocalBusiness), Vehicle or Car with Offer for inventory listings, and Service on relevant pages.
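As an illustration of this recommendation, the sketch below builds a small schema.org AutoDealer description in Python and prints it as a JSON-LD script tag. It uses only facts stated in this report (name, city, state, brands). The helper name auto_dealer_jsonld and the https URL scheme are assumptions, and street address, phone, and opening hours are deliberately left out because the report does not state them.

```python
import json


def auto_dealer_jsonld(name, url, locality, region, brands):
    """Build a schema.org AutoDealer description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "AutoDealer",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressRegion": region,
        },
        # One Brand entry per franchise the dealership sells.
        "brand": [{"@type": "Brand", "name": b} for b in brands],
    }
    return json.dumps(data, indent=2)


markup = auto_dealer_jsonld(
    name="Haley Automotive Group",
    url="https://haleyauto.com/",  # scheme is an assumption
    locality="North Chesterfield",
    region="VA",
    brands=["Buick", "GMC", "Chrysler", "Chevrolet", "Ford",
            "Dodge", "Jeep", "Ram", "Toyota", "Volvo"],
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

A real deployment would fill in the remaining contact fields from the dealership's own records and validate the result with a structured-data testing tool before publishing.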
Sitemap lists last-modified dates nearly a year in the future Medium
The sitemap.xml contains last-modified dates of 2026-05-30, which is nearly a year ahead of the current date. This date injection bug may confuse crawlers and reduce indexing efficiency.
What to change: Fix the platform-level date injection bug to output accurate last-modified dates.
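A check for future-dated entries could look like the following Python sketch, which uses the standard xml.etree module and the sitemaps.org namespace. The function name future_lastmods is an illustrative assumption, the reference date is passed in explicitly so the result is reproducible, and sitemap index files (which nest further sitemaps) are not handled.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def future_lastmods(sitemap_xml, today):
    """Return (loc, lastmod) pairs whose lastmod date falls after `today`."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS)
        if not lastmod:
            continue
        try:
            # lastmod may be a plain date or a full datetime; compare the date part.
            modified = date.fromisoformat(lastmod.strip()[:10])
        except ValueError:
            continue  # partial values such as "2026" are skipped
        if modified > today:
            flagged.append((loc, lastmod))
    return flagged
```

Running this against the sitemap with the actual crawl date as `today` would show how many of its URLs carry a last-modified value that lies ahead of that date.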
Zero external discoverability across search engines and review platforms High
Multiple web searches for the brand name, domain, and dealership terms returned zero indexed results. No reviews, press, or backlinks were found, indicating the site has no external footprint.
What to change: Build external signals through local SEO, review generation, social media presence, and backlink acquisition.
LLM knowledge about the dealership is outdated and incorrect Medium
The LLM described the dealership as selling Ford, Lincoln, and Mazda, but the actual site sells Buick, GMC, Chrysler, Chevrolet, Ford, Dodge, Jeep, Ram, Toyota, and Volvo. The about page 404 prevents verification.
What to change: Publish accurate about and inventory pages with structured data to correct AI knowledge.
Homepage title references only North Chesterfield, not full brand scope Low
The homepage title is 'North Chesterfield's Haley Automotive Group', which may underrepresent the multi-franchise nature of the dealership. This could limit local SEO and AI understanding.
What to change: Update the homepage title to include key brands or 'Multi-Franchise Dealership' for better semantic signals.
What's working
- anthropic-ai receives full HTML pages with complete content — The anthropic-ai crawler bypasses Akamai and receives full HTML pages from nginx, including inventory, contact, and privacy pages. This ensures at least one AI system can index the site's content.
- llms.txt file exists and is comprehensive (643 KB) — The llms.txt file is one of the largest observed, containing extensive content for AI knowledge extraction. It is served to anthropic-ai and could be a valuable resource if made public.
- Sitemap exists with thousands of URLs — The sitemap.xml (422 KB) lists thousands of URLs, indicating a large inventory of pages that could be indexed if crawler access is fixed.
- Contact page is accessible and returns 200 — The contact.htm page loads successfully and contains dealership contact information, which is essential for local SEO and AI knowledge.
- Privacy page is accessible and returns 200 — The privacy.htm page loads successfully, providing legal and compliance information.
- New and used inventory pages return 200 with full content — The new-inventory and used-inventory index pages load successfully with full HTML content, including vehicle listings that can be indexed by anthropic-ai.