hvcu.org — AI Site Grade
HVCU's Incapsula WAF gates browser users but lets GPTBot and other AI crawlers through, yet the site's NOINDEX tag and missing schema prevent AI visibility.
Hudson Valley Credit Union's site allows AI crawlers past its WAF but cripples AI visibility with a global NOINDEX tag, zero JSON-LD schema, and a cold-knowledge gap on its founding story and asset scale.
- Findings: 8
- Evidence checks: 38
- Completed: 30 May 2026
Analysis
---
Incapsula WAF selectively gates AI crawlers — but not the ones that matter most
The site runs behind Imperva Incapsula (detected via the _Incapsula_Resource JS challenge served to browser UAs). A standard browser GET returns a 956-byte JS challenge iframe marked NOINDEX, NOFOLLOW. GPTBot, PerplexityBot, Google-Extended, and ClaudeBot, by contrast, all receive the full 183KB HTML page with meta tags, titles, and descriptions, so the WAF lets them through. ChatGPT-User, OAI-SearchBot, and Applebot-Extended each receive only a thin 212-byte shell, suggesting Imperva's bot classification treats them differently from GPTBot.
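The per-crawler comparison above can be reproduced roughly as follows. This is a minimal sketch, not the tooling behind this report: the short user-agent strings, the 2048-byte "thin" threshold, and the function names are assumptions chosen for illustration. A WAF may also verify crawlers by IP address, so a spoofed user agent will not necessarily see what the real crawler sees.

```python
import urllib.error
import urllib.request

# Simplified user-agent strings (assumption: real crawlers send longer ones).
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "GPTBot": "GPTBot",
    "ClaudeBot": "ClaudeBot",
    "PerplexityBot": "PerplexityBot",
    "ChatGPT-User": "ChatGPT-User",
    "OAI-SearchBot": "OAI-SearchBot",
}

# Marker named in the analysis above for the Incapsula JS challenge.
CHALLENGE_MARKER = b"_Incapsula_Resource"


def classify(body: bytes, thin_limit: int = 2048) -> str:
    """Label a response body as 'challenge', 'thin', or 'full'.

    thin_limit is an arbitrary illustrative cut-off, not a measured value.
    """
    if CHALLENGE_MARKER in body:
        return "challenge"
    if len(body) < thin_limit:
        return "thin"
    return "full"


def probe(url: str, user_agent: str, timeout: float = 10.0):
    """Fetch url with the given user agent; return (body size, label)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read()
    except urllib.error.HTTPError as err:
        # Error responses still carry a body worth measuring.
        body = err.read()
    return len(body), classify(body)
```

Calling `probe("https://www.hvcu.org/", ua)` for each value in `USER_AGENTS` gives one size and label per crawler, which is enough to spot the split between full-page and thin-shell responses.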
Crawler Access
The robots.txt at https://www.hvcu.org/robots.txt contains no AI-bot-specific directives — no mention of GPTBot, ClaudeBot, PerplexityBot, Google-Extended, or any other AI crawler. The wildcard rule (User-agent: *) disallows only search query parameters and media folders. The llms.txt returns a 404. The sitemap at https://www.hvcu.org/sitemap.xml exists and lists 364 URLs, but every page fetched with a browser UA returns the same Incapsula JS challenge shell with zero extractable content.
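A check for AI-bot-specific groups in a robots.txt file can be sketched as below. The sample file is hypothetical: it mirrors the shape described above (a single wildcard group disallowing search parameters and media folders), but the exact paths are placeholders, not the contents of the real file.

```python
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def declared_agents(robots_txt: str) -> set:
    """Return the lower-cased user-agent tokens that have a group in the file."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.lower().startswith("user-agent:"):
            agents.add(line.split(":", 1)[1].strip().lower())
    return agents


def bots_without_rules(robots_txt: str, bots=AI_BOTS) -> list:
    """List the bots that have no dedicated user-agent group."""
    declared = declared_agents(robots_txt)
    return [bot for bot in bots if bot.lower() not in declared]


# Hypothetical sample; the paths are placeholders.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search?
Disallow: /media/
"""

print(bots_without_rules(SAMPLE_ROBOTS))
```

A bot with no dedicated group falls back to the wildcard rules, which is why the absence of AI-specific directives leaves access policy implicit rather than blocked.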
Cold-Knowledge Gap
The LLM prior on HVCU is surprisingly rich: it knows the credit union was founded in 1960 as the IBM Poughkeepsie Employees Federal Credit Union, has over $4 billion in assets, serves the Hudson Valley and Capital Region, and offers competitive auto loans and a Skip-a-Payment feature. The site itself never surfaces the IBM founding story or the $4B asset figure in any meta description or visible content — these facts are absent from the HTML that AI crawlers ingest. The homepage meta description says "serving New York's Hudson Valley for over 60 years and newly serving the Capital Region," but the origin story and scale are missing from crawlable content.
Schema Posture
Zero JSON-LD schema was detected on any page tested: no Organization, FinancialService, FAQPage, Product, or LocalBusiness markup. The FAQ page at /online-services/support/faqs/ has no FAQPage schema despite containing question-and-answer content. The comparison page at /personal/borrow/home-equity/comparison/ has no Product schema or other markup describing the compared items (schema.org defines no dedicated Comparison type). The credit card comparison page lacks schema entirely. This means AI engines extracting structured data from the site get nothing.
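One way to test a page for JSON-LD is to pull every `application/ld+json` script block out of the HTML and list the `@type` values it declares. The sketch below uses only the Python standard library; the class and function names are illustrative, and it deliberately ignores nested `@graph` structures to stay short.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed JSON-LD blocks from <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped, not fatal


def schema_types(html: str) -> list:
    """Return the top-level @type values declared in a page's JSON-LD."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    types = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                declared = item["@type"]
                types.extend(declared if isinstance(declared, list) else [declared])
    return types
```

An empty list from `schema_types` corresponds to the "zero JSON-LD" result reported for each tested page.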
Content & Architecture
All pages carry a <meta name="robots" content="NOINDEX, NOFOLLOW"> tag in the server-rendered HTML, including the homepage. This is likely a CMS default that was never corrected, and it directly instructs search engine crawlers (including Google) not to index any page. The site runs on Kentico CMS (detected via /kentico.resource/ paths and Kentico.Content.Web.Rcl references) with Hawksearch for site search. The Twitter card references KXtwitteraccount, a placeholder that was never replaced. The site has no Wayback Machine snapshots, suggesting it is either relatively new or blocks archiving.
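The robots meta tag quoted above is straightforward to detect programmatically. The following is a small sketch using the standard-library HTML parser; the names are illustrative, and it treats the `none` directive as equivalent to `noindex, nofollow`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def blocks_indexing(html: str) -> bool:
    """True if the page's robots meta tag asks crawlers not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return bool(parser.directives & {"noindex", "none"})
```

Running `blocks_indexing` over the HTML of each sitemap URL would confirm whether the tag is truly site-wide or limited to particular templates.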
Findings
NOINDEX, NOFOLLOW meta tag on every page including homepage High
All server-rendered pages carry a <meta name="robots" content="NOINDEX, NOFOLLOW"> tag, instructing search engines not to index any content. This is likely a CMS default that was never corrected.
What to change: Remove the NOINDEX, NOFOLLOW meta tag from all pages, or change it to INDEX, FOLLOW for pages intended to be indexed.
No JSON-LD schema on any tested page High
No Organization, FinancialService, FAQPage, Product, or LocalBusiness schema was detected. The FAQ page lacks FAQPage schema, and comparison pages lack Product schema or any other markup for the compared items.
What to change: Add JSON-LD structured data: Organization schema on the homepage, FinancialService on product pages, FAQPage on the FAQ page, and Product schema on comparison pages (schema.org has no dedicated Comparison type).
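As a rough illustration of the recommended markup, the sketch below builds Organization and FAQPage JSON-LD and wraps it in a script tag. The helper names are hypothetical, the question-and-answer text is a placeholder rather than content from the site, and a production Organization block would normally carry more properties than the name and URL shown here.

```python
import json


def organization(name: str, url: str) -> dict:
    """Minimal schema.org Organization object."""
    return {"@context": "https://schema.org", "@type": "Organization",
            "name": name, "url": url}


def faq_page(pairs) -> dict:
    """schema.org FAQPage built from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def to_script_tag(data: dict) -> str:
    """Serialize data as a JSON-LD script tag safe to embed in HTML."""
    # Escape "</" so answer text cannot close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


# Placeholder content for illustration only.
print(to_script_tag(organization("Hudson Valley Credit Union", "https://www.hvcu.org/")))
print(to_script_tag(faq_page([("Placeholder question?", "Placeholder answer.")])))
```

Generating the markup from the same data that renders the visible FAQ keeps the structured data and the on-page content from drifting apart.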
Founding story and asset scale missing from crawlable content Medium
The LLM prior knows HVCU was founded in 1960 as IBM Poughkeepsie Employees Federal Credit Union and has over $4 billion in assets, but these facts are absent from the site's meta descriptions and visible HTML.
What to change: Include the founding year, origin story, and asset size in the homepage meta description and visible content.
Robots.txt lacks AI-bot-specific directives Medium
The robots.txt file has no rules for GPTBot, ClaudeBot, PerplexityBot, Google-Extended, or other AI crawlers. The wildcard rule only disallows search parameters and media folders.
What to change: Add explicit allow/disallow rules for AI crawlers (e.g., GPTBot, ClaudeBot) to control AI access.
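Explicit per-bot groups can be generated and then verified with Python's `urllib.robotparser`, as in this sketch. The allow/disallow choices shown are arbitrary examples, not a recommended policy; which crawlers to admit is a decision for the site owner.

```python
from urllib.robotparser import RobotFileParser


def ai_bot_rules(policy: dict) -> str:
    """Build one robots.txt group per bot from a {bot: allowed} mapping."""
    groups = []
    for bot, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {bot}\n{rule}")
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt: str, bot: str, url: str) -> bool:
    """Check whether bot may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot, url)


# Example policy (an assumption for illustration, not a recommendation).
rules = ai_bot_rules({"GPTBot": True, "ClaudeBot": False})
print(rules)
```

These groups would be appended to the existing file so the wildcard rules for other crawlers stay in place.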
llms.txt file returns 404 Low
The proposed llms.txt file is not present, so the site misses an opportunity to give AI crawlers a curated list of important pages.
What to change: Create an llms.txt file listing key pages for AI crawlers.
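An llms.txt file is plain Markdown: a title, a short blockquote summary, and sections of annotated links. The sketch below assembles one from data, following the layout of the llms.txt proposal as commonly described. The function name, section headings, and link descriptions are assumptions; the two URLs are pages named elsewhere in this report.

```python
def build_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render llms.txt text from {heading: [(name, url, note), ...]}."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Section names and descriptions are illustrative placeholders.
llms_txt = build_llms_txt(
    "Hudson Valley Credit Union",
    "Credit union serving New York's Hudson Valley and Capital Region.",
    {
        "Support": [
            ("FAQs", "https://www.hvcu.org/online-services/support/faqs/",
             "Answers to common member questions"),
        ],
        "Borrow": [
            ("Home equity comparison",
             "https://www.hvcu.org/personal/borrow/home-equity/comparison/",
             "Side-by-side home equity options"),
        ],
    },
)
print(llms_txt)
```

The output would be served at /llms.txt, the path that currently returns a 404.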
Twitter card references placeholder account Low
The Twitter card meta tag references 'KXtwitteraccount', a placeholder that was never replaced with the actual account.
What to change: Replace 'KXtwitteraccount' with the actual Twitter handle.
No Wayback Machine snapshots available Low
The site has no archived snapshots in the Wayback Machine, suggesting it is either relatively new or blocks archiving.
Browser user agents receive Incapsula JS challenge with no content Medium
When fetched with a standard browser UA, the site returns a 956-byte JS challenge iframe with NOINDEX, NOFOLLOW and zero extractable content, so any crawler that presents a browser-like user agent gets nothing.
What to change: Verify that legitimate crawlers (e.g., Googlebot) pass the WAF without a JS challenge, and allowlist any that do not.
What's working
- GPTBot, PerplexityBot, Google-Extended, and ClaudeBot receive full HTML content — Despite the WAF blocking browser UAs, major AI crawlers are allowed through and receive the full 183KB HTML page with meta tags, titles, and descriptions.
- Sitemap exists with 364 URLs — The sitemap at /sitemap.xml is accessible and lists 364 URLs, providing a roadmap for crawlers.
- LLM prior contains detailed knowledge about HVCU — The LLM knows HVCU's founding history, asset size, and product features, indicating some external signals exist despite site deficiencies.