AI Site Grade
nesn.com — AI Site Grade
NESN's robots.txt allows AI bots but Cloudflare blocks them with a 403, leaving the site invisible to the AI engines most likely to cite it in answers about Boston sports.
NESN's AI visibility is crippled by a Cloudflare WAF that blocks all major AI crawlers despite permissive robots.txt rules, while the site's digital publishing transformation remains unknown to frontier LLMs.
- Findings: 7
- Evidence checks: 25
- Completed: 30 May 2026
Analysis
NESN's robots.txt tells OpenAI and Perplexity they are welcome, but Cloudflare blocks them at the edge with a 403 — a contradiction that leaves the site invisible to the AI engines most likely to cite it in answers about Boston sports.
Crawler Access
The robots.txt at nesn.com explicitly allows OAI-SearchBot, PerplexityBot, and ChatGPT-User while disallowing GPTBot, ClaudeBot, anthropic-ai, Google-Extended, Bytespider, and Applebot-Extended. Live compare_bot_access testing shows that Cloudflare's WAF does not follow these rules in either direction. OAI-SearchBot, PerplexityBot, Perplexity-User, ChatGPT-User, ClaudeBot, GPTBot, and anthropic-ai all receive HTTP 403 "Your request was blocked" with a 25-byte response, including the three bots that robots.txt allows. Only Google-Extended and Applebot-Extended get through with full 200-status content (447 KB), even though robots.txt disallows both. The robots.txt says one thing; Cloudflare enforces another. No llms.txt exists (404).
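The mismatch described above can be checked mechanically. The following is a minimal sketch, not the audit tool's own method: it parses a robots.txt with Python's standard `urllib.robotparser` and compares each bot's permission against an observed HTTP status. The function name `find_mismatches` and the sample robots.txt excerpt are illustrative assumptions; the status codes mirror the ones reported in this audit.

```python
from urllib.robotparser import RobotFileParser


def find_mismatches(robots_txt, observed_status, url="https://nesn.com/"):
    """Return (bot, robots_allows, status) for bots where robots.txt
    and the observed edge response disagree."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    mismatches = []
    for bot, status in observed_status.items():
        allowed = parser.can_fetch(bot, url)  # what robots.txt says
        served = status == 200                # what the edge actually did
        if allowed != served:
            mismatches.append((bot, allowed, status))
    return mismatches


# Illustrative excerpt only; NOT a copy of the real nesn.com robots.txt.
SAMPLE_ROBOTS = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""

# Status codes as reported in this audit for these three user agents.
SAMPLE_STATUS = {"OAI-SearchBot": 403, "GPTBot": 403, "Google-Extended": 200}

mismatches = find_mismatches(SAMPLE_ROBOTS, SAMPLE_STATUS)
for bot, allowed, status in mismatches:
    print(f"{bot}: robots.txt allows={allowed}, edge returned {status}")
```

GPTBot does not appear in the output because robots.txt and the edge agree on it: both refuse it. The two flagged bots are the ones where the stated policy and the enforced policy diverge.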
Cold-Knowledge Gap
A frontier LLM queried cold about NESN knows it as a regional cable sports network that broadcasts Red Sox and Bruins games, mentions the 2023 data breach and carriage disputes with Comcast/YouTube TV, and correctly identifies ownership by Fenway Sports Group. What the model does not know: NESN has transformed into a substantial digital publishing operation. The site publishes dozens of articles daily across Red Sox, Bruins, Patriots, Celtics, and other sports — all with rich NewsArticle schema, author bios, and timestamps. The cold knowledge is stuck on the cable-TV identity and misses the modern content-site reality entirely.
Schema Posture
The homepage carries Organization and WebSite schema with social links but no SearchAction or potentialAction for site search. Article pages use proper NewsArticle schema with datePublished, dateModified, author (typed as Person with worksFor linking to Sporting News), headline, and keywords. The NewsArticle.backstory field is used as a description container — a non-standard but functional pattern. The nesn360 streaming landing page has zero schema — no Product, no SoftwareApplication, no WebPage — despite being a paid subscription service with clear pricing tiers ($5/mo intro, $29.99/mo standard, $239.99/year).
External Signals
The cold LLM knowledge references a 2023 data breach and carriage disputes — negative signals that shape AI-generated descriptions of the brand. The site itself does not address these on any fetched page. No external search results surfaced for recent NESN Reddit discussions or press coverage, suggesting the brand has limited off-domain conversation volume relative to its market presence. The site carries affiliate links to Fanatics and StubHub and is hosted on Cloudflare with Microsoft 365 mail, OneTrust consent management, and Adobe IDP verification.
Findings
Cloudflare WAF blocks all major AI crawlers despite permissive robots.txt (severity: High)
The robots.txt allows OAI-SearchBot, PerplexityBot, and ChatGPT-User, but Cloudflare's WAF returns HTTP 403 for all of them, as well as for GPTBot, ClaudeBot, and anthropic-ai. Only Google-Extended and Applebot-Extended receive full content.
What to change: Update Cloudflare WAF rules to allow OAI-SearchBot, PerplexityBot, ChatGPT-User, and other permitted AI crawlers through, matching the robots.txt permissions.
No llms.txt file published (severity: Medium)
The site returns a 404 for /llms.txt, missing an opportunity to provide AI crawlers with a curated set of URLs and context.
What to change: Create an llms.txt file listing key pages (e.g., team sections, streaming page) and a brief site description.
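As a rough illustration of what such a file could contain, the sketch below renders an llms.txt in the commonly proposed layout: an H1 title, a blockquote summary, then H2 sections of annotated links. The helper name `render_llms_txt` is hypothetical, and the team-section URL is a placeholder that would need to be replaced with the site's real path. Only the `/nesn360` path comes from this audit.

```python
def render_llms_txt(title, summary, sections):
    """Render llms.txt text: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    title="NESN",
    summary=(
        "Regional sports network and digital publisher covering the "
        "Red Sox, Bruins, Patriots, and Celtics."
    ),
    sections={
        "Teams": [
            # Placeholder URL: replace with the real section path.
            ("Red Sox", "https://nesn.com/REPLACE-WITH-RED-SOX-SECTION",
             "Daily Red Sox news and analysis"),
        ],
        "Streaming": [
            ("NESN 360", "https://nesn.com/nesn360",
             "Subscription streaming service and pricing"),
        ],
    },
)
print(llms_txt)
```

The rendered text would be served as a static file at `/llms.txt`.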
Frontier LLMs unaware of NESN's digital publishing operation (severity: High)
Cold knowledge of NESN is limited to its cable TV identity, data breach, and carriage disputes. The model does not know that NESN publishes dozens of daily articles with NewsArticle schema across multiple sports.
What to change: Restore AI crawler access first, since AI engines cannot learn about content they are blocked from fetching. Then use structured data to highlight the breadth of digital content, and consider making content available to AI training datasets.
Homepage schema lacks SearchAction (severity: Medium)
The Organization and WebSite schemas on the homepage do not include a potentialAction for site search, reducing the chance of AI assistants offering direct search functionality.
What to change: Add a SearchAction potentialAction to the WebSite schema with target query parameter.
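A minimal sketch of the resulting markup, built in Python and serialized as JSON-LD, might look like this. The search URL pattern `https://nesn.com/?s={search_term_string}` is an assumption (a WordPress-style query parameter); the site's actual search endpoint should be substituted. The function name `website_with_search` is illustrative.

```python
import json


def website_with_search(site_url, name, search_url_template):
    """Build WebSite JSON-LD that includes a SearchAction potentialAction."""
    return {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "url": site_url,
        "name": name,
        "potentialAction": {
            "@type": "SearchAction",
            "target": {
                "@type": "EntryPoint",
                "urlTemplate": search_url_template,
            },
            # Ties the {search_term_string} placeholder to the user's query.
            "query-input": "required name=search_term_string",
        },
    }


search_markup = website_with_search(
    "https://nesn.com/",
    "NESN",
    # Assumed search URL pattern; replace with the site's real one.
    "https://nesn.com/?s={search_term_string}",
)
search_json_ld = json.dumps(search_markup, indent=2)
print(search_json_ld)
```

The output would be embedded in the homepage inside a `<script type="application/ld+json">` element, merged with the existing WebSite block rather than added as a second one.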
NESN 360 streaming landing page has zero schema markup (severity: High)
The /nesn360 page, which describes a paid subscription service with pricing tiers, lacks any structured data (Product, SoftwareApplication, or WebPage schema).
What to change: Add Product or SoftwareApplication schema with name, description, offers (price, currency), and applicationCategory.
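One possible shape for that markup is sketched below, using the Product option with one Offer per pricing tier. The prices are the ones this audit observed on the page; the `USD` currency code, the offer labels, the description text, and the helper name `subscription_product` are assumptions made for illustration.

```python
import json


def subscription_product(name, description, url, offers, currency="USD"):
    """Build Product JSON-LD with one Offer per pricing tier."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
        "offers": [
            {
                "@type": "Offer",
                "name": label,
                "price": price,
                "priceCurrency": currency,  # assumed: prices are in US dollars
            }
            for label, price in offers
        ],
    }


product_markup = subscription_product(
    name="NESN 360",
    description="Paid subscription streaming service from NESN.",
    url="https://nesn.com/nesn360",
    offers=[
        # Tier prices as observed in this audit; labels are illustrative.
        ("Introductory monthly", "5.00"),
        ("Standard monthly", "29.99"),
        ("Annual", "239.99"),
    ],
)
product_json_ld = json.dumps(product_markup, indent=2)
print(product_json_ld)
```

If SoftwareApplication is chosen instead, the same offers list can be reused and an `applicationCategory` property added.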
Site does not address known negative signals (data breach, carriage disputes) (severity: Medium)
Cold LLM knowledge includes a 2023 data breach and carriage disputes, but no fetched page mentions these issues or provides official statements, leaving negative narratives unchallenged.
What to change: Publish a dedicated page or FAQ addressing the data breach and current carriage status, and link to it from relevant sections.
Low external discussion volume on Reddit and search (severity: Low)
Searches for NESN on Reddit and the general web returned zero results, suggesting limited off-domain conversation that could influence AI training data.
What's working
- Article pages use proper NewsArticle schema with author and timestamps: Each article includes NewsArticle schema with datePublished, dateModified, author (Person with worksFor), headline, and keywords, which helps AI crawlers understand content.
- Google-Extended and Applebot-Extended requests are served full content: Requests with these two user agents receive 200 responses with full HTML, so Cloudflare does not block them at the edge. Because robots.txt disallows both, this only benefits Google's and Apple's AI products if those Disallow rules are lifted.
- Robots.txt explicitly allows OAI-SearchBot, PerplexityBot, and ChatGPT-User: The robots.txt file is configured to permit these important AI crawlers, indicating intent to be AI-visible, even though Cloudflare blocks them.
- Homepage includes Organization and WebSite schema with social links: The homepage has basic structured data providing organization name, logo, and social media URLs, which helps establish brand identity.