AI Site Grade
siouxvalleyenergy.com — AI Site Grade
Sioux Valley Energy blocks OAI-SearchBot at the WAF while allowing GPTBot, has zero JSON-LD schema, and lacks an about page — creating a fragmented AI visibility posture.
Sioux Valley Energy's AI crawler access is inconsistent (OAI-SearchBot blocked, others allowed), the site has no structured data, and key pages are missing or thin, limiting AI discoverability.
- Findings: 10
- Evidence checks: 25
- Completed: 30 May 2026
Analysis
OAI-SearchBot is blocked while GPTBot roams free — and the site has no structured data for AI engines to consume
The most consequential finding is a split-personality AI-access policy: OAI-SearchBot gets a 403 from the Sucuri/Cloudproxy WAF while GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, and Google-Extended all return full 200s with identical content to a browser. The robots.txt has zero AI-bot-specific rules — only a generic User-agent: * blocklist for Drupal admin paths. The llms.txt returns 404. The site is running Drupal behind Sucuri/Cloudproxy on Pantheon infrastructure, with a strict CSP and HSTS preload.
Crawler Access
Every major AI crawler except OAI-SearchBot receives the same 177 KB HTML payload as a browser. OAI-SearchBot is blocked at the WAF level (403), which means OpenAI's search crawler, the one that feeds ChatGPT search results, cannot read the site, even though GPTBot (the training crawler) and ChatGPT-User (user-initiated fetches) can. The robots.txt is a stock Drupal template with no AI-specific groups: no Disallow: / or Allow: / for any AI agent, and no crawl-delay directive. The site does not serve X-Robots-Tag headers or meta robots tags on any page inspected.
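The access check described above can be reproduced with a small probe. This is a minimal sketch, not the tool used for the audit: the User-Agent strings are simplified stand-ins (real crawlers send longer strings, and a WAF may also match on source IP), and `blocked_agents` is a hypothetical helper name. Google-Extended is left out on the assumption that it acts as a robots.txt control token rather than a distinct request User-Agent.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Simplified User-Agent strings for illustration only (assumption).
AGENTS: Dict[str, str] = {
    "browser": "Mozilla/5.0",
    "GPTBot": "GPTBot",
    "OAI-SearchBot": "OAI-SearchBot",
    "ChatGPT-User": "ChatGPT-User",
    "ClaudeBot": "ClaudeBot",
    "PerplexityBot": "PerplexityBot",
}


def http_status(url: str, user_agent: str) -> int:
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; the status code is what we record.
        return error.code


def blocked_agents(
    url: str, fetch: Callable[[str, str], int] = http_status
) -> Dict[str, int]:
    """Map each agent that did not get a 200 to the status it received."""
    statuses = {name: fetch(url, ua) for name, ua in AGENTS.items()}
    return {name: code for name, code in statuses.items() if code != 200}
```

Calling `blocked_agents("https://siouxvalleyenergy.com/")` performs live requests; passing a stub as `fetch` keeps the comparison logic testable offline.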
Cold-Knowledge Gap
The LLM prior knows Sioux Valley Energy as a ~27,000-member cooperative serving southeastern South Dakota and southwestern Minnesota, part of the Touchstone Energy network, with community solar and energy efficiency programs. The actual site reports 30,287 meters, 115 employees, 6,259 miles of line, and a member density of 4.8 per mile; none of these operational stats appear in the cold knowledge. The model also has no awareness of the 9.1% rate increase effective January 2026, the 2026 annual meeting scheduled for June 9, or the detailed FAQ addressing data-center-driven power cost concerns. The gap is significant: the site's most newsworthy content (rate increases, power-supply cost drivers, REC program details) is absent from the model's prior.
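The published figures are internally consistent, which is worth confirming before encoding them anywhere. The short check below assumes that "member density" means meters divided by miles of line; the site may define it differently.

```python
# Figures quoted from the site in the paragraph above.
meters = 30_287
miles_of_line = 6_259

# Assumption: density = meters / miles of line.
density = meters / miles_of_line
rounded_density = round(density, 1)

print(f"{density:.3f} meters per mile, rounded to {rounded_density}")
```

Under that assumption the ratio rounds to the 4.8 figure the site publishes.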
Schema Posture
Zero JSON-LD schema of any type was found on any page inspected — homepage, rate-increase FAQ, programs page, outage center, contact page. No Organization, ElectricUtility, FAQPage, LocalBusiness, or WebSite schema. The rate-increase page has a rich FAQ structure (14 Q&A pairs with H3 headings) that is a textbook candidate for FAQPage markup but is delivered as plain HTML headings. The rate schedules page contains structured pricing data (rate classes, per-kWh charges, basic service fees) with no Product or PriceSpecification schema.
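To make the FAQPage opportunity concrete, the sketch below turns question and answer pairs into schema.org FAQPage JSON-LD. The function name `faq_jsonld` and the sample pair are hypothetical placeholders; the real wording would come from the 14 Q&A pairs on the rate-increase page.

```python
import json
from typing import List, Tuple


def faq_jsonld(pairs: List[Tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs as schema.org FAQPage JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


# Placeholder pair (hypothetical wording, not copied from the site).
example = faq_jsonld([("Why are rates changing?", "Placeholder answer text.")])
```

The resulting string would be embedded in the page inside a `<script type="application/ld+json">` element.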
External Signals
The site has active social presence (Facebook, Instagram, YouTube, LinkedIn, Vimeo) and uses SmartHub for member billing, Nova Power Portal for outage maps, and ApplicantPro for job postings. Web search returned zero indexed external mentions for the brand name — no news articles, no Reddit threads, no review sites surfaced. The cooperative's external citation footprint is effectively nil, which means AI engines have almost no third-party signals to corroborate or enrich the brand's self-published content.
Surprising Findings
The /about-us and /about-sve URLs both return 404, meaning the site has no dedicated about page — a gap for any AI crawler trying to establish entity identity. The outage center redirects from /outage-center to /outage-center-2025 (a dated URL), and the canonical on that page is the -2025 variant. The sitemap contains 291 URLs but is dominated by monthly newsletter/magazine issues going back to 2022, with many duplicate-looking paths (/issues/september-2024 vs /issues/september-2024-0). The energy-solutions-catalog page is a thin shell (~50 words) pointing to a PDF download — the actual catalog content is not machine-readable.
Findings
OAI-SearchBot blocked at WAF while other AI bots allowed High
OAI-SearchBot receives a 403 from the Sucuri/Cloudproxy WAF, preventing OpenAI's search/retrieval products from accessing the site. GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, and Google-Extended all return 200 with full content.
What to change: Allow OAI-SearchBot through the WAF by whitelisting its user-agent or IP range, and ensure consistent access for all major AI crawlers.
Zero JSON-LD schema on any inspected page High
No structured data markup (JSON-LD) was found on the homepage, rate-increase FAQ, programs page, outage center, or contact page. The site lacks Organization, ElectricUtility, FAQPage, or any other schema types that help AI engines understand content.
What to change: Add JSON-LD schema for Organization, ElectricUtility, FAQPage (on the rate-increase page), and WebSite across the site.
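As a starting point for the Organization markup, here is a minimal sketch that emits a JSON-LD script tag. It uses only the Organization type and facts stated in this audit; the bare-domain URL is an assumption about the canonical host, and `sameAs` is left empty because the audit does not list the social profile URLs.

```python
import json

organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Sioux Valley Energy",
    # Assumption: the bare domain is the canonical host.
    "url": "https://siouxvalleyenergy.com/",
    # Employee count as reported on the site per this audit.
    "numberOfEmployees": {"@type": "QuantitativeValue", "value": 115},
    # Fill with the cooperative's real profile URLs; none are given in the
    # audit, so the list is left empty rather than guessed.
    "sameAs": [],
}

script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(organization, indent=2)
    + "\n</script>"
)
```

Validating the output with a structured-data testing tool before deployment would catch any property the chosen type does not accept.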
About page returns 404 High
Both /about-us and /about-sve URLs return 404 errors, meaning the site has no dedicated about page. This prevents AI crawlers from establishing entity identity and context.
What to change: Create an about page with cooperative history, service area, leadership, and operational statistics, and ensure it is accessible from the main navigation.
llms.txt file returns 404 Medium
The site does not serve an llms.txt file, a proposed convention (not a formal standard) for giving AI crawlers a curated list of important pages and context.
What to change: Create an llms.txt file listing key pages (rate increase, programs, outage center, contact) and a brief description of the cooperative.
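A file of this kind is short enough to generate from a list of pages. The sketch below follows the commonly described llms.txt layout (a title, a blockquote summary, then a section of Markdown links). The helper name `build_llms_txt`, the summary wording, and the page notes are illustrative; only the two paths shown appear in this audit, and the rate-increase, programs, and contact paths would need to be added once confirmed.

```python
from typing import List, Tuple

BASE = "https://siouxvalleyenergy.com"  # assumption: canonical host


def build_llms_txt(
    name: str, summary: str, pages: List[Tuple[str, str, str]]
) -> str:
    """Render a title, a blockquote summary, and one section of links."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, path, note in pages:
        lines.append(f"- [{title}]({BASE}{path}): {note}")
    return "\n".join(lines) + "\n"


llms_txt = build_llms_txt(
    "Sioux Valley Energy",
    "Electric cooperative serving southeastern South Dakota "
    "and southwestern Minnesota.",
    [
        ("Outage center", "/outage-center-2025", "Outage map and reporting steps"),
        ("Energy solutions catalog", "/energy-solutions-catalog", "Links to the PDF catalog"),
    ],
)
```

The generated text would be served as plain text at /llms.txt.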
Key operational details missing from LLM prior knowledge Medium
The LLM prior lacks specific statistics (30,287 meters, 115 employees, 6,259 miles of line, a member density of 4.8 per mile) and recent news (9.1% rate increase, 2026 annual meeting, REC program details). This content is on the site but not surfaced to AI models.
What to change: Add structured data (schema.org) and ensure key pages are included in the sitemap and in llms.txt to improve AI discoverability.
Energy Solutions Catalog page is a thin shell with PDF link Medium
The /energy-solutions-catalog page contains only ~50 words and points to a PDF download. The actual catalog content is not machine-readable, limiting AI understanding.
What to change: Expand the catalog page with HTML content summarizing the catalog's offerings, and keep the PDF as a supplement.
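Thin pages like this one can be flagged automatically by counting the words in a page's text nodes. The following is a rough sketch using only the standard library; the 100-word threshold is an arbitrary assumption, and the count will differ somewhat from whatever method produced the ~50 word figure above.

```python
from html.parser import HTMLParser


class TextCounter(HTMLParser):
    """Count words in text nodes, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self.words = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())


def word_count(html: str) -> int:
    counter = TextCounter()
    counter.feed(html)
    counter.close()
    return counter.words


def is_thin(html: str, threshold: int = 100) -> bool:
    """Flag a page whose text falls below the (assumed) word threshold."""
    return word_count(html) < threshold
```

Running `is_thin` over every sitemap URL would surface other shell pages that defer their content to a PDF.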
Outage center redirects to dated URL with canonical issue Low
The /outage-center URL redirects to /outage-center-2025, and the canonical tag points to the -2025 variant. This creates a confusing URL structure for crawlers.
What to change: Serve the page at an undated path (e.g., /outage-center), point the canonical tag at that path, and permanently redirect /outage-center-2025 to it so the redirect and the canonical agree.
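Once the change is made, the canonical can be verified by parsing the page's HTML. This is a minimal sketch with hypothetical helper names; it reads the first `<link rel="canonical">` tag and compares its path against the preferred one.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Capture the href of the first <link rel="canonical"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attributes.get("href")


def canonical_path(html: str) -> Optional[str]:
    """Return the path component of the canonical URL, if one is declared."""
    finder = CanonicalFinder()
    finder.feed(html)
    return urlparse(finder.canonical).path if finder.canonical else None


def canonical_matches(html: str, preferred_path: str) -> bool:
    return canonical_path(html) == preferred_path
```

For the page as audited, `canonical_matches(html, "/outage-center")` would be expected to return False, since the canonical currently points at the -2025 variant.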
Zero indexed external mentions for the brand Medium
Web searches for the brand name returned no news articles, reviews, or third-party references. The cooperative has no external citation footprint to corroborate its content.
What to change: Encourage local news coverage, partner mentions, and member testimonials to build external signals.
Robots.txt has no AI-specific directives Low
The robots.txt file contains only generic Drupal admin path blocks and no rules for AI crawlers (no Disallow for GPTBot, no Allow for OAI-SearchBot, no crawl-delay).
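What to change: Add explicit groups for the AI crawlers the cooperative wants to admit (GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended) and state the intended policy for OAI-SearchBot either way. A robots.txt Allow rule does not lift the WAF 403, so the two must be changed together.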
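One way to draft such rules and check them before deployment is to parse the proposed file with Python's standard `urllib.robotparser`. This is a sketch, assuming the cooperative chooses to allow OAI-SearchBot; `/admin/` stands in for the existing Drupal admin blocklist, whose exact paths are not listed in this audit. The admin rule is repeated in the AI group because a crawler that matches a named group ignores the `User-agent: *` group.

```python
from urllib.robotparser import RobotFileParser

# Proposed rules (illustrative). "/admin/" is a placeholder for the
# existing Drupal admin blocklist.
PROPOSED_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Google-Extended
Disallow: /admin/
Allow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(PROPOSED_ROBOTS_TXT.splitlines())

BASE = "https://siouxvalleyenergy.com"  # assumption: canonical host
home_allowed = parser.can_fetch("OAI-SearchBot", BASE + "/")
admin_allowed = parser.can_fetch("GPTBot", BASE + "/admin/config")
```

The Disallow line is placed before Allow because this parser applies the first matching rule; crawlers that use longest-match precedence reach the same result for these two rules.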
Sitemap dominated by newsletter issues with duplicates Low
The sitemap contains 291 URLs, mostly monthly newsletter/magazine issues with duplicate-looking paths (e.g., /issues/september-2024 and /issues/september-2024-0). This dilutes the importance of core pages.
What to change: Clean up duplicate newsletter URLs and prioritize core pages (rate increase, programs, outage center) in the sitemap.
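The duplicate-looking paths can be grouped programmatically before cleanup. The sketch below assumes that a trailing `-<digits>` after the four-digit year on an /issues/ path is an auto-generated duplicate suffix, as in /issues/september-2024-0; that interpretation should be confirmed against the CMS before removing anything. `duplicate_groups` is a hypothetical helper name.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches /issues/<slug>-<year>-<n>; the year itself is not treated as a
# duplicate suffix (assumption about how the paths are generated).
SUFFIX = re.compile(r"^(?P<base>/issues/.+-\d{4})-\d+$")


def duplicate_groups(paths: List[str]) -> Dict[str, List[str]]:
    """Group /issues/ paths that differ only by a numeric suffix."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        match = SUFFIX.match(path)
        base = match.group("base") if match else path
        groups[base].append(path)
    return {base: found for base, found in groups.items() if len(found) > 1}
```

Feeding it the 291 sitemap paths would list each base issue alongside its suffixed variants, giving a review list rather than an automatic deletion list.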
What's working
- Most major AI crawlers receive full content access: GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, and Google-Extended all return 200 with the same HTML as a browser, so both training crawlers and user-initiated fetchers can read the site's content.
- Rate increase page has detailed FAQ with 14 Q&A pairs: the 2026 rate increase page contains a well-structured FAQ with 14 questions and answers under H3 headings, covering power cost drivers and rate details. This is prime content for FAQPage schema.
- Active social media presence across multiple platforms: the cooperative maintains Facebook, Instagram, YouTube, LinkedIn, and Vimeo accounts, providing external signals and engagement opportunities.
- Outage center provides detailed outage information: the outage center page contains 952 words of useful content, including outage map integration, reporting instructions, and safety tips.
- Rate schedules page with structured pricing data: the rate schedules page lists rate classes, per-kWh charges, and basic service fees in a clear tabular format, making it easy to extract pricing information.
- Contact page with multiple contact methods: the contact page provides phone numbers, email, and physical address, enabling AI crawlers to extract entity contact information.
- Programs and rebates page with extensive content: the programs and rebates page contains 1,931 words detailing various energy efficiency programs, rebates, and incentives.