AI Site Grade
cancer.ie — AI Site Grade
The Irish Cancer Society's site has perfect AI-crawler access but is structurally invisible to AI answer engines due to zero semantic schema beyond breadcrumbs and no AI-friendly content map.
- Findings: 10
- Evidence checks: 23
- Completed: 30 May 2026
Analysis
The Irish Cancer Society (cancer.ie) — AI-Visibility Audit
The site has perfect technical AI-crawler access — every major AI bot (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User, Bytespider, Applebot-Extended, anthropic-ai) receives a full 200 response with identical 139KB payload — yet the site is structurally invisible to AI answer engines because it ships zero semantic schema beyond breadcrumbs and has no AI-friendly content map.
Crawler Access
All eleven tested AI user-agents return the same 200 status and byte size as the browser baseline. The site runs on nginx behind Cloudflare DNS (MX/NS records confirm Cloudflare), with Drupal 10 serving pages. There is no UA-based blocking or WAF challenge, and primary page content has no JS-rendering dependency: the homepage and all sampled subpages deliver their main HTML content to every crawler. Two dynamically loaded components are the exception; see Content Architecture Surprises. The robots.txt contains a single User-agent: * rule with no AI-bot-specific directives, so no bot is explicitly welcomed or restricted beyond the generic Drupal disallows (/admin/, /search/, /user/). A request for llms.txt returns 404, so the site has no AI content map at all.
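A parity check of this kind can be sketched in a few lines of Python. This is a minimal sketch, not the audit's actual tooling: the `Probe` record, the `fetch` and `mismatches` helpers, and the use of bare bot names as User-Agent strings are all assumptions made for illustration (real crawlers send longer UA strings).

```python
from dataclasses import dataclass
from urllib.request import Request, urlopen

# Bot names taken from the audit; used here as simplified User-Agent values (assumption).
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot"]


@dataclass
class Probe:
    user_agent: str
    status: int
    size: int


def fetch(url: str, user_agent: str, timeout: float = 10.0) -> Probe:
    """Fetch url with the given User-Agent and record status and body size.

    urlopen raises HTTPError for 4xx/5xx responses, so a blocked bot
    surfaces as an exception rather than a Probe.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=timeout) as response:
        body = response.read()
        return Probe(user_agent, response.status, len(body))


def mismatches(baseline: Probe, probes: list[Probe], tolerance: int = 0) -> list[Probe]:
    """Return probes whose status or byte size differs from the browser baseline."""
    return [
        p for p in probes
        if p.status != baseline.status or abs(p.size - baseline.size) > tolerance
    ]
```

An empty list from `mismatches` corresponds to the result reported here: every tested user-agent matched the baseline. A small `tolerance` may be useful where pages embed per-request tokens that change the byte count slightly.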
Schema Posture
Every page examined (homepage, cancer-information, breast-cancer, Daffodil Day, cancer-prevention, signs-and-symptoms) contains only BreadcrumbList JSON-LD — no MedicalWebPage, HealthTopicContent, FAQPage, Article, Organization, WebSite, or HowTo schema. For a health-information charity that publishes clinically reviewed content on dozens of cancer types, this is a critical gap. The breast cancer page, for example, has 898 words of structured medical content (signs, symptoms, treatments, risk factors, screening programmes) but no semantic markup that would let an AI engine surface it as a trusted health answer. The Daffodil Day page contains FAQ-style content (cost breakdowns, "Did you know?" items) but no FAQPage schema.
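The schema inventory described above can be reproduced with a short standard-library script. The sketch below is illustrative only: `JsonLdCollector` and `schema_types` are hypothetical names, and the walk reports nested types too (a BreadcrumbList will also report its ListItem children).

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def schema_types(html: str) -> set[str]:
    """Return every @type value found anywhere in the page's JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    found: set[str] = set()

    def walk(node):
        if isinstance(node, dict):
            declared = node.get("@type")
            if isinstance(declared, str):
                found.add(declared)
            elif isinstance(declared, list):
                found.update(t for t in declared if isinstance(t, str))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for block in collector.blocks:
        try:
            walk(json.loads(block))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than abort the audit
    return found
```

Running `schema_types` over each sampled page and checking for names such as MedicalWebPage or FAQPage gives a quick pass/fail view of the gap.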
Cold-Knowledge Gap
The LLM prior knows the Irish Cancer Society as a trusted national charity — Daffodil Day, Night Nursing, Support Line, Daffodil Centres, transport service, research funding. This matches the site's actual content well. However, the prior recalls a "Cancer Nurseline" (the site calls it "Support Line" or "Freephone Support Line") and mentions a controversial 2023 "Cancer Doesn't Take a Holiday" campaign — neither of which appears anywhere on the current site. The site's own messaging is forward-looking (2025 impact figures, Daffodil Day 2026), creating a temporal disconnect between what AI models know (2023 controversy) and what the site projects (current services and impact).
External Signals
The site links to donors.cancer.ie, fundraise.cancer.ie, and relayforlife.ie as separate donation and fundraising platforms (the first two are cancer.ie subdomains; relayforlife.ie is its own domain). Social presence includes Facebook, Instagram, YouTube, Pinterest, Reddit, and LinkedIn, but the linked subreddit (/r/IrishCancerSociety/) appears to be private or unpopulated (no results found in search). Web search returned zero results for recent press coverage, charity reviews, or Reddit discussions about the society, which suggests the brand has a limited off-domain citation footprint that AI engines could use for corroboration.
Content Architecture Surprises
The sitemap contains 1,521 URLs including many node/XXXX paths (Drupal legacy content) alongside clean URLs. The cancer-information section uses a JavaScript-powered "Loading articles..." pattern for sub-page listings — crawlers see the loading placeholder text, not the actual article links, creating a discoverability gap for deep content. The homepage's "Your Impact in 2025" section displays zero values ("0") for all metrics in the raw HTML, with the real numbers loaded dynamically — AI crawlers see empty stats. The site has no Organization or WebSite schema on the homepage, meaning knowledge panels and rich search results are not being explicitly signalled.
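To quantify how many sitemap entries are raw Drupal node paths rather than clean URLs, something like the following sketch could be used. The function name and the `/node/<digits>` pattern are assumptions; the sitemap namespace is the standard sitemaps.org one.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Assumed shape of a legacy Drupal path: /node/ followed by a numeric id.
NODE_PATH = re.compile(r"/node/\d+/?$")


def split_sitemap_urls(sitemap_xml: str) -> tuple[list[str], list[str]]:
    """Split sitemap <loc> entries into (clean URLs, raw /node/<id> URLs)."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text]
    legacy = [u for u in locs if NODE_PATH.search(u)]
    clean = [u for u in locs if not NODE_PATH.search(u)]
    return clean, legacy
```

The `legacy` list is a candidate set for aliasing or redirecting to descriptive paths, since a bare node id tells a crawler nothing about the page topic.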
Findings
No medical or health schema on cancer information pages (severity: High)
Pages with clinically reviewed content on cancer types, signs, symptoms, and treatments lack MedicalWebPage, HealthTopicContent, or Article schema. AI engines cannot semantically understand the content as trusted health information.
What to change: Add MedicalWebPage or HealthTopicContent JSON-LD schema to all cancer information pages, including fields for medical specialty, relevant conditions, and evidence level.
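As a rough illustration of this change, the sketch below builds a MedicalWebPage JSON-LD block in Python. It assumes the schema.org properties `about`, `specialty`, `lastReviewed`, and `reviewedBy`; the function name and all argument values are hypothetical, and an evidence-level field is left out because it is not clear which MedicalWebPage property it would map to.

```python
import json
from datetime import date


def medical_web_page_jsonld(url: str, title: str, condition: str,
                            last_reviewed: date, reviewer: str) -> str:
    """Return a <script> tag carrying MedicalWebPage JSON-LD for one page."""
    data = {
        "@context": "https://schema.org",
        "@type": "MedicalWebPage",
        "url": url,
        "name": title,
        "about": {"@type": "MedicalCondition", "name": condition},
        # schema.org MedicalSpecialty enumeration member for oncology.
        "specialty": "https://schema.org/Oncologic",
        "lastReviewed": last_reviewed.isoformat(),
        "reviewedBy": {"@type": "Organization", "name": reviewer},
    }
    # Escape "</" so the JSON cannot terminate the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'
```

In a Drupal site this markup would more likely be produced by a template or a schema module than by a standalone script; the sketch only shows the shape of the output.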
No FAQPage schema on FAQ-style content (severity: Medium)
The Daffodil Day page contains FAQ-style content (cost breakdowns, 'Did you know?' items) but no FAQPage schema, preventing AI from surfacing these answers directly.
What to change: Add FAQPage JSON-LD schema to pages with question-and-answer content, such as Daffodil Day.
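A minimal sketch of FAQPage markup, assuming the question-and-answer pairs have already been extracted from the page. The helper name is hypothetical and the sample pair is placeholder text, not content from the Daffodil Day page.

```python
import json


def faq_page_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


if __name__ == "__main__":
    # Placeholder text only; real pairs must be copied from the visible page content.
    sample = [("<question as shown on the page>", "<answer as shown on the page>")]
    print(json.dumps(faq_page_jsonld(sample), indent=2))
```

The answers in the markup should match the text visible on the page, so the pairs are best generated from the same source as the rendered content.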
No Organization or WebSite schema on homepage (severity: High)
The homepage lacks Organization and WebSite schema, which are critical for knowledge panels and rich search results. AI engines cannot confidently associate the site with the Irish Cancer Society brand.
What to change: Add Organization and WebSite JSON-LD schema to the homepage with the charity's name, logo, URL, and description.
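One possible shape for the homepage markup is a single JSON-LD `@graph` holding both entities, with the WebSite pointing back at the Organization as its publisher. The sketch below assumes that layout; the function name, the `#organization` and `#website` identifiers, and every argument value are illustrative.

```python
import json


def homepage_jsonld(name: str, url: str, logo_url: str, description: str,
                    same_as: list[str]) -> dict:
    """Build Organization and WebSite entities linked through @id references."""
    base = url.rstrip("/")
    org_id = base + "/#organization"
    return {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "Organization",
                "@id": org_id,
                "name": name,
                "url": url,
                "logo": logo_url,
                "description": description,
                "sameAs": same_as,  # official social profile URLs
            },
            {
                "@type": "WebSite",
                "@id": base + "/#website",
                "name": name,
                "url": url,
                "publisher": {"@id": org_id},
            },
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    demo = homepage_jsonld("Example Charity", "https://example.org/",
                           "https://example.org/logo.png",
                           "Placeholder description.", [])
    print(json.dumps(demo, indent=2))
```

Listing the society's verified social profiles in `sameAs` would also give AI engines explicit links between the site and its off-domain presence.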
No llms.txt file for AI content map (severity: Medium)
The site returns a 404 for llms.txt, meaning AI crawlers have no machine-readable guide to the site's content structure. This limits the site's discoverability in AI answer engines.
What to change: Create an llms.txt file that lists key content sections (cancer information, services, prevention) with brief descriptions and URLs.
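The file itself is plain Markdown. The sketch below follows the commonly proposed llms.txt layout (an H1 title, a blockquote summary, then H2 sections of annotated links); the generator function and the example URLs used to exercise it are assumptions, not the site's real paths.

```python
def build_llms_txt(title: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt body from {section: [(link text, url, note), ...]}."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for text, url, note in links:
            lines.append(f"- [{text}]({url}): {note}")
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"
```

Generating the file from the CMS's own content inventory, rather than maintaining it by hand, keeps the map in step with the pages it describes.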
JavaScript-powered article listings hide content from crawlers (severity: High)
The cancer-information section uses a JavaScript 'Loading articles...' pattern for sub-page listings. AI crawlers see the placeholder text, not the actual article links, reducing discoverability of deep content.
What to change: Server-render the article listing links in HTML so crawlers can discover them without JavaScript execution.
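Whether the fix has landed can be verified against the raw HTML, with no JavaScript execution. The sketch below is a hedged example of such a check: `LinkCollector`, `listing_is_crawlable`, and the `/cancer-information/` path prefix are hypothetical, while the placeholder string is the one quoted in this finding.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from anchor tags in raw (unrendered) HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def listing_is_crawlable(raw_html: str, path_prefix: str,
                         placeholder: str = "Loading articles...") -> bool:
    """True if the raw HTML links to sub-pages and shows no loading placeholder."""
    collector = LinkCollector()
    collector.feed(raw_html)
    section_root = path_prefix.rstrip("/")
    deep_links = [
        h for h in collector.hrefs
        if h.startswith(path_prefix) and h.rstrip("/") != section_root
    ]
    return bool(deep_links) and placeholder not in raw_html
```

The check only sees relative links that begin with the given prefix; absolute URLs would need normalising first.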
Homepage impact stats display zero values in raw HTML (severity: Medium)
The 'Your Impact in 2025' section shows '0' for all metrics in the raw HTML, with real numbers loaded dynamically. AI crawlers see empty statistics, undermining trust signals.
What to change: Include the actual impact numbers in the initial HTML payload, or use progressive enhancement to ensure crawlers see meaningful values.
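Progressive enhancement here means the server writes the real figure into the markup and any count-up animation reads it from there. A minimal sketch of that server-side step follows; the class names, the `data-target` attribute, and the helper name are hypothetical, and the numbers used to exercise it are placeholders rather than the society's figures.

```python
from html import escape


def render_stat(label: str, value: int) -> str:
    """Render an impact stat with its real value present in the initial HTML.

    A client-side script may still animate the number by reading data-target,
    but a crawler that never runs JavaScript sees the final figure.
    """
    return (
        '<div class="impact-stat">'
        f'<span class="impact-stat__value" data-target="{value}">{value:,}</span> '
        f'<span class="impact-stat__label">{escape(label)}</span>'
        '</div>'
    )
```

If the animation must start from zero visually, the script can reset the displayed text on load; the HTML delivered to crawlers stays correct either way.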
LLM prior contains outdated or incorrect information (severity: Medium)
The LLM prior recalls a 'Cancer Nurseline' (site uses 'Support Line') and a controversial 2023 campaign not present on the current site. This mismatch can cause AI to fabricate or misattribute information.
What to change: Publish a clear, up-to-date 'About us' page with current service names and campaign history, and consider adding a 'Fact check' section to correct common misconceptions.
Low off-domain citation footprint for AI corroboration (severity: Medium)
Web searches returned zero results for recent press coverage, charity reviews, or Reddit discussions about the society. AI engines have few external signals to corroborate the site's authority.
What to change: Encourage media coverage, charity review sites, and community discussions to build a stronger external citation profile.
Linked Reddit subreddit is private or unpopulated (severity: Low)
The site links to /r/IrishCancerSociety/ on Reddit, but no results were found in search, suggesting the subreddit is private or empty. This provides no social proof for AI.
What to change: Remove the Reddit link if the subreddit is inactive, or actively populate it with community content.
Robots.txt lacks explicit AI-bot directives (severity: Low)
The robots.txt has only a generic User-agent: * rule with no specific welcome or restrictions for AI bots. While not blocking, it misses an opportunity to signal AI-friendly content.
What to change: Add explicit User-agent groups for AI bots (e.g., GPTBot, ClaudeBot) to welcome them. Because robots.txt has no standard directive for llms.txt, reference that file in a comment once it is created.
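A candidate robots.txt can be tested before deployment with Python's standard `urllib.robotparser`. The file content below is a sketch, not the site's actual robots.txt: it reuses the three disallowed paths reported in this audit and adds a hypothetical group for two AI bots. Disallow lines are placed before `Allow: /` because Python's parser applies the first matching rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the llms.txt reference is a comment only,
# since robots.txt defines no directive for it.
ROBOTS_TXT = """\
# AI content map: /llms.txt
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /admin/
Disallow: /search/
Disallow: /user/
Allow: /

User-agent: *
Disallow: /admin/
Disallow: /search/
Disallow: /user/
"""


def can_fetch(robots_txt: str, user_agent: str, path: str) -> bool:
    """Parse robots_txt and report whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

An explicit group changes nothing about what these bots may already fetch under the wildcard rule; its value is as a clear statement of intent.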
What's working
- All major AI bots receive full HTML content: Eleven tested AI user-agents return the same 200 status and byte size as a browser, with no UA-based blocking or WAF challenges.
- Core content does not require JavaScript rendering: The homepage and sampled subpages such as cancer-information, breast-cancer, and Daffodil Day deliver their primary textual content in the initial HTML, so it is accessible to AI crawlers that do not execute JavaScript. The exceptions are the dynamically loaded article listings and homepage impact figures covered in the findings.
- LLM prior recognizes the Irish Cancer Society as a trusted national charity: The LLM prior correctly identifies the society's key services (Daffodil Day, Night Nursing, Support Line, Daffodil Centres, transport service, research funding), providing a strong foundation for AI visibility.
- BreadcrumbList schema present on all pages: Every page examined includes BreadcrumbList JSON-LD, helping AI understand site hierarchy and navigation paths.
- Sitemap contains 1,521 URLs covering deep content: The sitemap includes a large number of URLs, indicating extensive content coverage that can be discovered by crawlers.