pohanka.com — AI Site Grade
Pohanka's Cloudflare WAF returns HTTP 403 to every AI crawler tested except ChatGPT-User, OAI-SearchBot, and Perplexity-User, leaving the site unreachable for most AI training pipelines and, on the evidence gathered, largely absent from search results.
Pohanka Automotive Group's selective bot access and lack of structured data severely limit AI visibility and search engine indexing.
- Findings: 10
- Evidence checks: 38
- Completed: 30 May 2026
Analysis
Selective Bot Access Creates an Invisible Site
Pohanka Automotive Group's Cloudflare WAF returns HTTP 403 to every major AI crawler tested except ChatGPT-User, OAI-SearchBot, and Perplexity-User, and it also blocks standard browser requests from the IP range used for testing. As tested, the site is out of reach for most AI training pipelines, and the audit found no trace of it in search results.
Crawler Access
The robots.txt at pohanka.com/robots.txt exists and is accessible to allowed bots, but contains no rules for GPTBot, ClaudeBot, Google-Extended, or PerplexityBot. The file disallows common functional paths (/request-more-info/, /eprice/, /lead/) and blocks MJ12bot, BLEXBot, and other scrapers entirely. The llms.txt file exists and returns a basic directory of pages, but is only reachable by the three whitelisted bots. Requests identifying as Google-Extended, GPTBot, or ClaudeBot all receive HTTP 403 from Cloudflare's WAF, as does any standard browser request. Only ChatGPT-User, OAI-SearchBot, and Perplexity-User pass through to real 200 responses with full HTML content (pages of 700KB or more). The DNS records confirm Cloudflare hosting (104.17.37.150, among other addresses) and include an anthropic-domain-verification TXT record, yet ClaudeBot is still blocked.
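The access pattern described above can be reproduced with a small probe that requests one URL under several user-agent strings and records the status codes. The following is a minimal sketch, not the tool used for this report. The helper names (fetch_status, probe, blocked_agents) are hypothetical, and the short tokens stand in for the longer strings real crawlers send. A WAF may also verify crawlers by source IP, so a spoofed user agent can be treated differently from the genuine bot.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List

# User-agent tokens named in the audit. Real crawlers send longer strings;
# these short tokens are a simplification for illustration.
USER_AGENTS = ["GPTBot", "ClaudeBot", "ChatGPT-User", "OAI-SearchBot", "Perplexity-User"]


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for one GET request sent with `user_agent`."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses raise HTTPError; the code is still the result we want.
        return error.code


def probe(
    url: str,
    agents: List[str] = USER_AGENTS,
    fetch: Callable[[str, str], int] = fetch_status,
) -> Dict[str, int]:
    """Map each user-agent token to the status code it receives for `url`."""
    return {agent: fetch(url, agent) for agent in agents}


def blocked_agents(results: Dict[str, int]) -> List[str]:
    """Return, in sorted order, the agents that received HTTP 403."""
    return sorted(agent for agent, status in results.items() if status == 403)
```

Calling probe("https://pohanka.com/robots.txt") would issue live requests; network errors other than HTTP error responses are not handled here. The fetch parameter exists so the logic can be exercised with a stub instead of real traffic.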
Cold-Knowledge Gap
LLM knowledge describes Pohanka as a "family-owned chain founded in 1966 by John Pohanka, one of the largest family-owned dealership groups in the Mid-Atlantic, selling Honda, Acura, Nissan, Subaru, and Chrysler/Dodge/Jeep/Ram." The actual site tells a different story: the homepage meta keywords list Acura, Chevrolet, Honda, Hyundai, Lexus, Mercedes-Benz, Nissan, Toyota, and Volkswagen -- Subaru and Chrysler/Dodge/Jeep/Ram are absent from that list. The llms.txt also mentions Texas locations, which the cold knowledge does not reference. The /about-us/ URL redirects to pohankas-story/, which carries a self-referencing canonical, but the page is served only to whitelisted bots, so its full narrative is invisible to models that lack whitelist access.
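The brand mismatch above is a plain set comparison, sketched below using the two lists exactly as quoted in this section. Splitting "Chrysler/Dodge/Jeep/Ram" into four separate names is an assumption made for the comparison, and brand_gap is a hypothetical helper name. A brand missing from the meta keywords is not proof that the group no longer sells it, only that the site does not declare it there.

```python
# Brand lists as quoted in the audit text. Treating "Chrysler/Dodge/Jeep/Ram"
# as four separate brands is an assumption made for this comparison.
LLM_BRANDS = {"Honda", "Acura", "Nissan", "Subaru", "Chrysler", "Dodge", "Jeep", "Ram"}
SITE_BRANDS = {
    "Acura", "Chevrolet", "Honda", "Hyundai", "Lexus",
    "Mercedes-Benz", "Nissan", "Toyota", "Volkswagen",
}


def brand_gap(model_brands: set, site_brands: set) -> dict:
    """Split two brand sets into model-only, site-only, and shared names."""
    return {
        "only_in_model": sorted(model_brands - site_brands),
        "only_on_site": sorted(site_brands - model_brands),
        "in_both": sorted(model_brands & site_brands),
    }


if __name__ == "__main__":
    for label, names in brand_gap(LLM_BRANDS, SITE_BRANDS).items():
        print(f"{label}: {', '.join(names)}")
```

Only three brands (Acura, Honda, Nissan) appear in both lists, which is the overlap a model would get right from cold knowledge alone.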
Schema Posture
The homepage and all tested subpages contain zero JSON-LD schema markup. No AutoDealer, LocalBusiness, Product, or Vehicle structured data appears in any fetched page. The pages are large (700KB-3MB) and JS-heavy, running on the dealereprocess.org platform, which suggests inventory data is loaded client-side. The sitemap index lists separate sitemaps for menus, blog posts, and inventory search by location (Alexandria, Bethesda, Vienna, Tysons Corner); the blog sitemap contains zero URLs.
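A check for JSON-LD of the kind described above can be sketched with the Python standard library: collect every script block typed application/ld+json, parse it, and list the declared @type values. This is an illustrative sketch rather than the checker behind this report. JsonLdExtractor and schema_types are hypothetical names, and it does not look for Microdata or RDFa, which are other ways to embed schema.org data.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped rather than reported


def schema_types(html: str) -> list:
    """Return every @type declared in the page's JSON-LD, in document order."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    found = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if not isinstance(item, dict):
                continue
            # Include nodes nested one level down under "@graph", if present.
            nodes = [item] + [n for n in item.get("@graph", []) if isinstance(n, dict)]
            for node in nodes:
                declared = node.get("@type", [])
                found.extend(declared if isinstance(declared, list) else [declared])
    return found
```

An empty list from schema_types corresponds to the "zero JSON-LD" result reported here, provided the HTML passed in is the raw server response and not a JavaScript-rendered DOM.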
External Signals
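No external search results were returned for any Pohanka-related query, which suggests the site has minimal indexed presence in search engines. The robots.txt allows Google-InspectionTool, but the WAF returned 403 to a request identifying as Google-Extended. Google-Extended is a robots.txt control token rather than a separate crawling user agent, so this result alone does not show that Googlebot is blocked; if the WAF treats Googlebot the same way, however, Google cannot index the site at all. The DNS includes a google-site-verification token, suggesting past Search Console setup, but that setup is of little use while the WAF turns Google's crawlers away.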
Findings
Cloudflare WAF blocks all major AI crawlers except three (High)
The Cloudflare WAF returns HTTP 403 for GPTBot, ClaudeBot, Google-Extended, and standard browsers, while only ChatGPT-User, OAI-SearchBot, and Perplexity-User receive 200 responses. This blocks most AI training and search indexing.
What to change: Update Cloudflare WAF rules to allow GPTBot, ClaudeBot, Google-Extended, and other standard AI crawlers, and remove the blanket block on non-whitelisted user-agents.
robots.txt lacks rules for major AI bots (High)
The robots.txt file contains no directives for GPTBot, ClaudeBot, Google-Extended, or PerplexityBot, leaving their access undefined. It only disallows functional paths and blocks scrapers like MJ12bot.
What to change: Add explicit allow/disallow rules for GPTBot, ClaudeBot, Google-Extended, and PerplexityBot in robots.txt.
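One way to sanity-check explicit rules before deploying them is to parse a draft with Python's urllib.robotparser and assert the intended outcomes. The draft below is a hypothetical example, not the site's actual file: it assumes the AI crawlers should be allowed everywhere except the three functional paths the current robots.txt already disallows. Because a crawler that matches a named group ignores the wildcard group, the disallow lines are repeated in both.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft. The disallowed paths and blocked scrapers come from the
# audit; the decision to allow the four AI tokens is an assumption.
PROPOSED_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /request-more-info/
Disallow: /eprice/
Disallow: /lead/
Allow: /

User-agent: MJ12bot
User-agent: BLEXBot
Disallow: /

User-agent: *
Disallow: /request-more-info/
Disallow: /eprice/
Disallow: /lead/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Report whether `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    rules = build_parser(PROPOSED_ROBOTS_TXT)
    for agent in ("GPTBot", "ClaudeBot", "MJ12bot", "SomeOtherBot"):
        for path in ("/", "/lead/"):
            print(agent, path, is_allowed(rules, agent, path))
```

Python's parser applies the first matching rule within a group, so the Disallow lines are placed before Allow: /. Crawlers that follow longest-match precedence should reach the same result for this draft, but robots.txt only states intent: the WAF must also let these crawlers through for the rules to matter.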
Zero JSON-LD schema markup on any page (High)
No structured data (AutoDealer, LocalBusiness, Product, Vehicle) is present on the homepage or any tested subpage. Without it, AI models and search engines have to infer the site's locations, brands, and inventory from unstructured page text.
What to change: Add JSON-LD structured data for AutoDealer, LocalBusiness, and Vehicle on relevant pages.
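As an illustration of the recommended markup, the sketch below builds a minimal schema.org AutoDealer block and wraps it in the script tag a page template would emit. Every value passed in the usage example is a placeholder, not Pohanka's real data, and auto_dealer_jsonld is a hypothetical helper. A production version would likely add opening hours, geo coordinates, and per-vehicle markup on inventory pages.

```python
import json


def auto_dealer_jsonld(name, url, telephone, street, city, region, postal_code, brands):
    """Return a <script> tag containing minimal AutoDealer JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "AutoDealer",
        "name": name,
        "url": url,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
        "brand": [{"@type": "Brand", "name": brand} for brand in brands],
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Placeholder values only; replace with each rooftop's real details.
    print(auto_dealer_jsonld(
        name="Example Dealership",
        url="https://dealer.example/",
        telephone="+1-555-0100",
        street="1 Example Road",
        city="Example City",
        region="VA",
        postal_code="00000",
        brands=["Honda", "Acura"],
    ))
```

AutoDealer is a subtype of LocalBusiness in schema.org, so one block per location covers both types named in the recommendation. The markup needs to be present in the server-rendered HTML to help crawlers that do not execute JavaScript.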
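Google indexing likely blocked by WAF despite robots.txt allowance (High)
Robots.txt allows Google-InspectionTool, but the WAF returns HTTP 403 to requests identifying as Google-Extended. Google-Extended is a robots.txt control token, not a separate crawling user agent, so this test does not directly show how Googlebot is treated; combined with the absence of external search results for any Pohanka-related query, it suggests Google's crawlers are being turned away as well.
What to change: Confirm that Googlebot and Google-InspectionTool pass through the WAF, stop returning 403 to requests that identify as Google-Extended, and ensure robots.txt does not conflict.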
llms.txt only accessible to whitelisted bots (Medium)
The llms.txt file exists and contains a directory of pages, but is only reachable by ChatGPT-User, OAI-SearchBot, and Perplexity-User. Other AI crawlers receive 403, defeating its purpose.
What to change: Make llms.txt publicly accessible by removing the WAF block for that path.
Inventory data loaded client-side via JavaScript (Medium)
Pages are large (700KB-3MB) and JS-heavy, running on the dealereprocess.org platform. Inventory data is likely loaded client-side, making it invisible to crawlers that do not execute JavaScript.
What to change: Implement server-side rendering or pre-rendered inventory pages to ensure content is accessible to all crawlers.
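Whether a given piece of inventory content reaches non-JavaScript crawlers can be tested by looking for it in the visible text of the raw HTML response, ignoring anything that only appears inside script blocks. The sketch below is a rough heuristic under that assumption. VisibleText and markers_in_raw_html are hypothetical names, and the marker strings would be things like a vehicle title copied from the rendered page.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping content a non-JS crawler would not read as text."""

    SKIPPED_TAGS = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the page's visible text as one space-joined string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def markers_in_raw_html(html: str, markers: list) -> dict:
    """Map each marker to True if it appears in the raw HTML's visible text."""
    text = visible_text(html).lower()
    return {marker: marker.lower() in text for marker in markers}
```

A marker that is visible in a browser but reported as False here is being injected client-side, which is the situation server-side rendering or pre-rendering is meant to fix.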
Blog sitemap contains zero URLs (Low)
The blog sitemap at /resrc/xmlsitemap/sitemap-blog-posts/ returns only 171 bytes and contains no URLs, indicating that no blog content is published or submitted for indexing.
What to change: Populate the blog with relevant content and ensure the sitemap includes all blog post URLs.
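An empty sitemap is easy to detect programmatically by counting its loc entries. The following sketch assumes the file uses the standard sitemaps.org namespace; sitemap_locations is a hypothetical helper, and the sample XML is illustrative rather than the actual 171-byte response.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, in ElementTree's {uri}tag form.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locations(xml_text: str) -> list:
    """Return every non-empty <loc> value in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return [
        element.text.strip()
        for element in root.iter(SITEMAP_NS + "loc")
        if element.text and element.text.strip()
    ]


if __name__ == "__main__":
    empty = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>'
    print(len(sitemap_locations(empty)))  # an empty urlset has no <loc> entries
```

The same function works on the sitemap index, where each loc points at a child sitemap, so it can be used to walk the index and flag any child that comes back empty.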
LLM cold knowledge contradicts site content (Medium)
LLM knowledge describes Pohanka as selling Subaru and Chrysler/Dodge/Jeep/Ram, but the site's meta keywords list Acura, Chevrolet, Honda, Hyundai, Lexus, Mercedes-Benz, Nissan, Toyota, and Volkswagen. The llms.txt also mentions Texas locations not in cold knowledge.
What to change: Update the site's meta keywords and llms.txt to accurately reflect current brands and locations, and make sure the same facts appear wherever models are likely to learn about the group, such as the about page and structured data.
Sitemap index returns 403 for standard browsers (Medium)
The sitemap.xml is blocked by Cloudflare for standard browsers (403), though accessible to whitelisted bots. Any search engine crawler that is blocked the same way cannot use it to discover the site's structure.
What to change: Make sitemap.xml publicly accessible by removing the WAF block.
No external search results found for any Pohanka query (High)
Multiple web searches for 'Pohanka' and related terms returned zero results, indicating the site has minimal to no indexed presence in search engines.
What to change: Resolve WAF blocks and implement proper SEO to allow search engine indexing.
What's working
- llms.txt file exists and provides a directory — The site has an llms.txt file that lists key pages, helping whitelisted AI crawlers discover content.
- robots.txt is accessible to allowed bots — The robots.txt file is reachable by whitelisted bots and contains basic disallow rules for functional paths.
- Sitemap index accessible to whitelisted bots — The sitemap.xml is served to allowed bots, providing a structured list of sitemaps for menus, inventory, and blog posts.
- Inventory sitemaps organized by location — The sitemap index includes separate inventory sitemaps for Alexandria, Bethesda, Vienna, and Tysons Corner, aiding crawlers in discovering location-specific inventory.
- Google site verification token in DNS — A google-site-verification TXT record exists, indicating past Search Console setup, which can be leveraged once indexing is restored.
- Anthropic domain verification TXT record exists — An anthropic-domain-verification TXT record is present, showing intent to allow Claude access, though currently blocked by WAF.