AI Site Grade
maplehillauto.com — AI Site Grade
Maple Hill Auto's Akamai WAF returns 403 to browser user agents and to every AI crawler tested except anthropic-ai, creating a single-bot monoculture. The dealership also has no detectable external web presence, and AI models queried cold misidentify this multi-franchise new-car dealer as a used-car lot.
- Findings: 10
- Evidence checks: 38
- Completed: 30 May 2026
Analysis
---
Maple Hill Auto: A Site Only Claude Can See
The site's Akamai WAF blocks every browser and every AI crawler except anthropic-ai, which gets a full 200 with ~419KB of rendered HTML — a selective backdoor that creates a single-bot monoculture for AI visibility.
Crawler Access
The robots.txt file is readable only by anthropic-ai; browser user agents and all other bots get a 403 from Akamai. It contains rules for GPTBot, OAI-SearchBot, ChatGPT-User, Claude-User, Claude-SearchBot, and PerplexityBot, but those rules disallow only /api/, /apis/, /pixall/, and static assets. None of these agents is disallowed from the main content. The real barrier is Akamai's WAF, which rejects them before they can reach robots.txt or any page. The compare_bot_access test on https://www.maplehillauto.com returned 403 for every UA except anthropic-ai, which got a 200 from an nginx origin server (apparently bypassing Akamai entirely). The site runs on the DDC (Dealer.com) platform, a common automotive CMS. The non-www domain (maplehillauto.com) is completely unreachable (connection refused).
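The per-agent comparison described above can be reproduced with a short script. This is a minimal sketch using only the Python standard library, not the audit tool's actual implementation: the user-agent strings and the function names (fetch_status, allowed_agents, compare_access) are assumptions chosen for illustration.

```python
import urllib.error
import urllib.request

# Illustrative user-agent strings; the exact strings used by the audit are not
# given in the report, so these are assumptions.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "GPTBot": "GPTBot",
    "PerplexityBot": "PerplexityBot",
    "anthropic-ai": "anthropic-ai",
}


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for one GET sent with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        # Connection-level failures (urllib.error.URLError) are not caught here.
        return error.code


def allowed_agents(statuses: dict[str, int]) -> list[str]:
    """Names of the agents that received a 2xx response, sorted alphabetically."""
    return sorted(name for name, code in statuses.items() if 200 <= code < 300)


def compare_access(url: str) -> dict[str, int]:
    """Fetch the URL once per user agent and collect the status codes."""
    return {name: fetch_status(url, ua) for name, ua in USER_AGENTS.items()}
```

Calling allowed_agents(compare_access("https://www.maplehillauto.com")) would list the agents the WAF lets through. A WAF may also key on IP reputation or TLS fingerprint, so results from a script can differ from what a real crawler or browser sees.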
Cold-Knowledge Gap
A frontier LLM queried cold describes Maple Hill Auto as a "used car dealership specializing in pre-owned vehicles... often focusing on affordable, reliable cars for budget-conscious buyers" with an inventory of Ford, Chevrolet, Honda, and Toyota vehicles under $15,000. The actual site is a multi-franchise new-car dealer for Subaru, Hyundai, Volkswagen, Audi, and Volvo: five brands spanning mainstream and premium segments. The LLM knows nothing about the new-car side, the five OEM franchises, or the Kalamazoo service area. This is a near-total mismatch between what AI models know and what the site actually is.
Schema and Content Posture
The homepage and all subpages (new inventory, used inventory, financing, specials) carry meta descriptions, OG tags, and index, follow directives, but the fetch_url tool (browser UA) could not extract any of this because every page returns a 403 to browser UAs. On the DDC platform the site is a JS-heavy single-page application shell, yet the anthropic-ai bot receives server-rendered HTML. No JSON-LD structured data was detected on any fetched page. The llms.txt file exists and is well-formed, listing 20+ pages with descriptions, which is a rare and positive signal, but it is accessible only to the anthropic-ai bot.
External Signals
Web searches for "Maple Hill Auto" returned zero results across multiple queries — no reviews, no press, no Reddit threads, no dealer listings. The domain has no Wayback Machine history. The DNS shows Microsoft 365 mail (Outlook) and SendGrid for email, with GoDaddy nameservers. The complete absence of external footprint means AI models have almost no third-party signals to triangulate from, making the site's own content and the anthropic-ai backdoor the only reliable sources of truth.
Sitemap Anomaly
The sitemap.xml lists a lastmod date of 2026-05-30, the same day this audit was completed (30 May 2026), so the date is not in the future. If the value is simply stamped when the sitemap is generated, it says little about when pages actually changed and weakens lastmod as a freshness signal. The sitemap contains 100+ URLs including individual vehicle detail pages, but /about/index.htm returns a 404 with a branded "Oops!" error page, so there is no about-us page from which AI engines can learn the dealership's story.
Findings
Akamai WAF blocks all AI crawlers except anthropic-ai High
The site's Akamai Web Application Firewall returns 403 for every user agent except anthropic-ai, which receives a 200 response from an nginx origin server. This creates a single-bot monoculture where only Claude can access the site's content.
What to change: Reconfigure the Akamai WAF to allow other major AI crawlers (GPTBot, OAI-SearchBot, Claude-User, etc.) to access the site, or implement a more permissive access policy that does not rely on user-agent filtering alone.
Cold-knowledge gap misidentifies dealership as used-car seller High
A frontier LLM queried cold describes Maple Hill Auto as a used-car dealership specializing in affordable pre-owned vehicles, but the actual site is a multi-franchise new-car dealer for Subaru, Hyundai, Volkswagen, Audi, and Volvo. The LLM knows nothing about the new-car side or the five OEM franchises.
What to change: Publish structured data (JSON-LD) on the homepage and inventory pages that explicitly states the dealership type, brands sold, and new-car inventory. Also ensure the site is accessible to multiple AI crawlers so models can learn the correct identity.
No JSON-LD structured data detected on any page High
No JSON-LD structured data was found on the homepage, inventory pages, or any other fetched page. This means AI crawlers cannot easily extract key business information such as dealership type, brands, inventory, or contact details.
What to change: Add JSON-LD structured data for AutoDealer, Vehicle, and LocalBusiness schemas to all relevant pages, including the homepage, inventory listings, and vehicle detail pages.
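As a starting point, the homepage markup could look like the output of the sketch below. It assumes the schema.org AutoDealer type with the name, url, and brand properties, and uses only facts stated in this report (dealership name, www URL, five franchises). The function name build_autodealer_jsonld is hypothetical. Address, phone, and opening hours are left out because the report does not provide them.

```python
import json


def build_autodealer_jsonld(name: str, url: str, brands: list[str]) -> str:
    """Serialize a minimal schema.org AutoDealer description as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "AutoDealer",
        "name": name,
        "url": url,
        # One Brand entry per franchise the dealership sells.
        "brand": [{"@type": "Brand", "name": brand} for brand in brands],
    }
    return json.dumps(data, indent=2)


markup = build_autodealer_jsonld(
    name="Maple Hill Auto",
    url="https://www.maplehillauto.com",
    brands=["Subaru", "Hyundai", "Volkswagen", "Audi", "Volvo"],
)
# The result belongs inside a <script type="application/ld+json"> element.
print(markup)
```

Vehicle detail pages would need their own per-listing markup, which depends on inventory fields not covered in this report.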
Zero external web presence across search engines High
Web searches for 'Maple Hill Auto' and related queries returned zero results. No reviews, press articles, dealer listings, or social media mentions were found. The domain has no Wayback Machine history, leaving AI models with no third-party signals to triangulate from.
What to change: Build an external online presence by claiming business listings on Google Business Profile, Yelp, DealerRater, and other automotive directories. Encourage customer reviews and publish press releases or blog content to generate indexed pages.
Browser user agents receive 403 on all pages High
Every page on the site returns a 403 Forbidden error when requested with a standard browser user agent. If real visitors are treated the same way, neither human users nor most AI crawlers can view the site content, severely limiting organic discovery and user access.
What to change: Remove the blanket 403 for browser user agents, or implement a more nuanced access control that allows legitimate human traffic while still protecting against abuse.
Non-www domain is completely unreachable Medium
The non-www version of the domain (maplehillauto.com) returns a connection refused error, meaning it is not configured to serve any content. This can cause confusion for crawlers and users who may attempt to access the site without the www prefix.
What to change: Configure the non-www domain to redirect (301) to the www version, or serve the same content on both domains.
Sitemap lastmod date matches the audit date Medium
The sitemap.xml lists a lastmod date of 2026-05-30, the same day this audit was completed, so it is not in the future. If the value is stamped at generation time rather than tied to real page changes, crawlers get no usable freshness signal.
What to change: Confirm that lastmod values reflect when each page was actually last modified, not when the sitemap was generated.
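One way to audit lastmod values is to compare each entry against a reference date. The sketch below assumes the sitemap follows the standard sitemaps.org 0.9 schema and that every lastmod carries at least a full YYYY-MM-DD date. The function name future_lastmods is illustrative and not part of any tool used in this audit.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org 0.9 protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def future_lastmods(sitemap_xml: str, today: date) -> list[tuple[str, date]]:
    """Return (loc, lastmod) pairs whose lastmod date falls after `today`."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for entry in root.findall("sm:url", SITEMAP_NS):
        loc = entry.findtext("sm:loc", default="", namespaces=SITEMAP_NS)
        lastmod = entry.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS)
        if not lastmod:
            continue  # lastmod is optional in the protocol
        # lastmod may be a full W3C datetime; the first 10 characters are the date.
        modified = date.fromisoformat(lastmod.strip()[:10])
        if modified > today:
            flagged.append((loc.strip(), modified))
    return flagged
```

Passing the audit date as `today` keeps the check reproducible. An entry dated exactly on the reference date is not flagged.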
About page returns 404 Medium
The /about/index.htm page returns a 404 status code with a branded 'Oops!' page, meaning there is no functional about-us page for AI engines to learn the dealership's story or history.
What to change: Create a proper about-us page with content about the dealership's history, brands, and team, and ensure it returns a 200 status code.
robots.txt is inaccessible to most bots due to WAF Medium
The robots.txt file returns 403 to browser user agents and most AI crawlers and is accessible only to anthropic-ai. This prevents other crawlers from reading the site's crawl directives.
What to change: Ensure robots.txt is publicly accessible without authentication or WAF filtering, so all crawlers can read it.
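Once robots.txt is reachable, its rules can be checked offline with Python's standard urllib.robotparser module. In this sketch the ROBOTS_TXT content is a hypothetical reconstruction from the rules described in this report, not a copy of the real file, and the /api/inventory path is an invented example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules reconstructed from the report's description; the real file
# also covers static assets and other agents, and may differ in detail.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /api/
Disallow: /apis/
Disallow: /pixall/

User-agent: PerplexityBot
Disallow: /api/
Disallow: /apis/
Disallow: /pixall/
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Apply robots.txt rules to one agent/URL pair without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Content pages are allowed; only the API paths are disallowed.
print(can_crawl(ROBOTS_TXT, "GPTBot", "https://www.maplehillauto.com/"))
print(can_crawl(ROBOTS_TXT, "GPTBot", "https://www.maplehillauto.com/api/inventory"))
```

A check like this only confirms what the file permits. It says nothing about whether the WAF lets the crawler fetch the file in the first place.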
llms.txt only accessible to anthropic-ai Medium
The llms.txt file exists and is well-formed, listing 20+ pages, but it is only accessible to the anthropic-ai bot due to the WAF. Other AI crawlers cannot benefit from this resource.
What to change: Make llms.txt publicly accessible so all AI crawlers can use it to discover important pages.
What's working
- llms.txt file published with 20+ page listings — The site has a well-formed llms.txt file that lists over 20 pages with descriptions, providing a clear entry point for AI crawlers to discover key content.
- anthropic-ai bot receives full server-rendered HTML — The anthropic-ai bot receives a 200 response with ~419KB of rendered HTML, including meta tags, OG tags, and rich content, enabling Claude to index the site effectively.
- Sitemap accessible to anthropic-ai with 100+ URLs — The sitemap.xml is accessible to the anthropic-ai bot and contains over 100 URLs including vehicle detail pages, providing a comprehensive list of pages for crawling.
- Inventory pages contain rich content with meta tags — New and used inventory pages include meta descriptions, OG tags, and index/follow directives, providing structured metadata for crawlers that can access them.
- robots.txt does not disallow major AI crawlers from content — The robots.txt file only disallows AI crawlers from /api/, /apis/, /pixall/, and static assets, meaning the site's content pages are not blocked by robots.txt rules themselves.