AI Site Grade

ncfe.org.uk — AI Site Grade

NCFE's site carries no structured data, returns a 404 at /about/, serves its qualification search as a JS-rendered shell, and marks its FAQ hub noindex. AI crawlers have full access but find no authoritative markup to ground their understanding.

Findings: 9
Evidence checks: 23
Completed: 30 May 2026

Analysis

NCFE's site has zero structured data across every page examined, despite being a 175-year-old regulated awarding body that certificates hundreds of thousands of learners annually — a complete schema vacuum that leaves AI engines to reconstruct the organisation's identity from third-party noise rather than authoritative markup.

Crawler Access

All major AI crawlers — GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, Bytespider, Applebot-Extended — receive 200 OK with full content (134,892 bytes, identical to browser baseline) on the homepage. No UA-based blocking, no Cloudflare challenge walls, no JS-gating. The robots.txt is minimal (84 bytes) — only disallows /umbraco/ (the CMS backend) and points to a sitemap. No AI-bot-specific directives exist. The llms.txt returns 404. The sitemap at /sitemapxml/ contains 985 URLs and is well-formed.
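
The robots.txt behaviour described above can be reproduced offline with Python's standard `urllib.robotparser`. This is a minimal sketch: the `ROBOTS_TXT` text is an assumed reconstruction from the description (one Disallow for /umbraco/ plus a sitemap pointer), the `www.ncfe.org.uk` host is an assumption, and the check covers robots.txt rules only, not the HTTP responses the crawlers actually received.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the audited robots.txt; the live file may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /umbraco/
Sitemap: https://www.ncfe.org.uk/sitemapxml/
"""

# Crawler user agents named in the audit.
AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "Bytespider", "Applebot-Extended",
]


def crawler_access(robots_txt, paths, base="https://www.ncfe.org.uk"):
    """Return {agent: {path: allowed}} according to the robots.txt rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        agent: {path: parser.can_fetch(agent, base + path) for path in paths}
        for agent in AI_CRAWLERS
    }


report = crawler_access(ROBOTS_TXT, ["/", "/umbraco/"])
for agent, access in report.items():
    print(agent, access)
```

Because no crawler has its own `User-agent` group, every agent falls through to the wildcard rules, which is why all of them are treated identically.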

Schema Posture

Every page examined (homepage, about page, qualifications page, sector specialisms, FAQ pages, technical education) contains no JSON-LD or structured data of any other kind. No Organization, EducationalOrganization, FAQPage, Course, WebSite, or BreadcrumbList schema is present. The FAQ page at /customer-and-learner-support/faqs/ is marked noindex,nofollow and contains no FAQPage schema despite having a clear Q&A structure. The qualification search page renders only 6 words of visible text: it is a JS-rendered shell, invisible to crawlers that do not execute JavaScript.
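
The three signals in this section (JSON-LD presence, the robots meta directive, and server-rendered word count) can be extracted from raw HTML with the standard library alone. The sketch below is illustrative, not the tool that produced this report. `PageSignals`, `audit`, and the `SAMPLE` markup are hypothetical names and data invented for the example.

```python
import json
from html.parser import HTMLParser


class PageSignals(HTMLParser):
    """Collects JSON-LD types, the robots meta directive, and visible words."""

    # Elements whose text is not counted as visible page copy.
    SKIP = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self.jsonld_types = []
        self.robots = None
        self.word_count = 0
        self._skip_stack = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = (attrs.get("content") or "").lower()
        if tag in self.SKIP:
            self._skip_stack.append(tag)
            script_type = (attrs.get("type") or "").lower()
            if tag == "script" and script_type == "application/ld+json":
                self._in_jsonld = True
                self._buffer = []

    def handle_endtag(self, tag):
        if self._skip_stack and self._skip_stack[-1] == tag:
            self._skip_stack.pop()
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self._record("".join(self._buffer))

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)
        elif not self._skip_stack:
            self.word_count += len(data.split())

    def _record(self, raw):
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            return  # malformed JSON-LD is ignored in this sketch
        for item in doc if isinstance(doc, list) else [doc]:
            if isinstance(item, dict) and "@type" in item:
                self.jsonld_types.append(item["@type"])


def audit(html):
    parser = PageSignals()
    parser.feed(html)
    parser.close()
    return {
        "jsonld_types": parser.jsonld_types,
        "robots": parser.robots,
        "visible_words": parser.word_count,
    }


# Hypothetical markup resembling a noindexed JS shell; not the real page.
SAMPLE = """<html><head><title>FAQs</title>
<meta name="robots" content="noindex,nofollow">
<script>window.app = {};</script></head>
<body><div id="app">Loading qualification search, please wait.</div></body></html>"""

print(audit(SAMPLE))
```

A page that reports an empty `jsonld_types` list, a `noindex` directive, or a single-digit word count would match the corresponding issue described above.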

Cold-Knowledge Gap

The LLM prior knows NCFE as "Northern Council for Further Education," founded 1848, a UK awarding organisation, and mentions the CACHE brand prominently. The actual site tells a different story: NCFE now describes itself simply as "NCFE" (letters no longer an acronym), acquired CACHE in 2015, Active IQ in 2022, and Skills Forward in 2021. The site positions itself as "the third biggest technical and vocational awarding organisation in the UK" and an "educational charity" — claims absent from the LLM's cold knowledge. The LLM also references a "2023 criticism over quality assurance of remote assessments" and a "Skills for Life campaign" — neither of which appears anywhere on the live site.
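
One way to give AI engines an authoritative statement of identity is an EducationalOrganization JSON-LD block on the homepage. The sketch below builds one in Python under stated assumptions: the helper names are hypothetical, the URL assumes a `www` host, and the description wording is illustrative, paraphrasing claims the site makes about itself. The founding date is deliberately left unset because this audit does not confirm an exact value.

```python
import json


def organization_jsonld(name, url, description, founding_date=None, same_as=()):
    """Build a schema.org EducationalOrganization document as a dict."""
    doc = {
        "@context": "https://schema.org",
        "@type": "EducationalOrganization",
        "name": name,
        "url": url,
        "description": description,
    }
    if founding_date:
        doc["foundingDate"] = founding_date
    if same_as:
        doc["sameAs"] = list(same_as)
    return doc


def as_script_tag(doc):
    """Serialise the document for embedding in an HTML page."""
    payload = json.dumps(doc, indent=2, ensure_ascii=False)
    # Escape "</" so the payload cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


org = organization_jsonld(
    name="NCFE",
    url="https://www.ncfe.org.uk/",  # assumed canonical host
    description=(
        "Educational charity and technical and vocational "
        "awarding organisation in the UK."  # illustrative wording
    ),
    # founding_date and same_as are omitted: confirm the values first.
)
print(as_script_tag(org))
```

Optional fields are omitted rather than guessed, since incorrect markup would reinforce the very gap this section describes.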

External Signals

DNS records show Cloudflare hosting, Microsoft 365 mail (Outlook), and integrations with SendGrid, Smartsheet, Miro, Zapier, and Eventsforce. The site links to a portal.ncfe.org.uk subdomain for customer logins. External social links (LinkedIn, Twitter/X, YouTube, Facebook, Instagram) are present in the footer. The /about/ path (without hyphen) returns a 404 — a broken internal link pattern that may confuse crawlers.

Content Contradictions

The homepage describes NCFE as "the third biggest technical and vocational awarding organisation in the UK" — a specific market-positioning claim that no structured data reinforces. The about page states a goal of "one million learners by 2030." The history page details acquisitions of CACHE, Active IQ, and Skills Forward, yet the LLM prior only knows CACHE. The FAQ hub page is noindex,nofollow, meaning AI crawlers are explicitly told not to index the site's richest Q&A content. The qualification search is a JS shell with negligible server-rendered text, making the core product catalogue opaque to non-JS crawlers.
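
If the noindex directive is lifted from the FAQ hub, the Q&A content could also be exposed as FAQPage markup. The following is a minimal sketch that turns question and answer pairs into a schema.org FAQPage document. `faq_page_jsonld` is a hypothetical helper, and the pair shown is placeholder text, not content taken from the NCFE site.

```python
import json


def faq_page_jsonld(qa_pairs):
    """Build a schema.org FAQPage document from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question.strip(),
                "acceptedAnswer": {"@type": "Answer", "text": answer.strip()},
            }
            for question, answer in qa_pairs
        ],
    }


# Placeholder pair for illustration only.
faq = faq_page_jsonld([
    ("Example question?", "Example answer text."),
])
print(json.dumps(faq, indent=2))
```

The answer text in the markup should match what is visibly rendered on the page, so generating it from the same source as the page copy is the safer design.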

Findings

Zero JSON-LD or structured data on any page (High)
Every page examined (homepage, about, qualifications, sector specialisms, FAQ, technical education) has no JSON-LD or other structured data. No Organization, EducationalOrganization, FAQPage, Course, WebSite, or BreadcrumbList schema exists.
What to change: Add Organization schema to the homepage, FAQPage schema to the FAQ page, Course schema to qualification pages, and BreadcrumbList to all pages.

Qualification search page renders as a JS shell (High)
The qualification search page at /qualification-search/ contains only 6 words of visible text, indicating a JavaScript-rendered interface that is invisible to crawlers that do not execute JavaScript.
What to change: Server-render the qualification search results or provide a static HTML fallback with links to individual qualification pages.

FAQ hub page is noindex, keeping Q&A content out of AI indexes (High)
The FAQ hub page at /customer-and-learner-support/faqs/ has a noindex,nofollow meta tag, which tells compliant crawlers not to index the site's richest Q&A content or follow its links.
What to change: Remove the noindex directive from the FAQ hub page and add FAQPage schema.

About page at /about/ returns 404 (Medium)
The path /about/ (without hyphen) returns a 404 error, which may confuse crawlers and users who expect a standard about URL.
What to change: Redirect /about/ to /about-ncfe/ or serve a proper page at that path.

llms.txt file returns 404 (Medium)
The llms.txt file, which provides a curated list of important URLs for LLMs, is missing (404).
What to change: Create an llms.txt file listing key pages such as about, qualifications, and FAQ.

LLM cold knowledge contradicts site content (Medium)
The LLM prior knows NCFE as 'Northern Council for Further Education' and mentions the CACHE brand, but the site now uses just 'NCFE' and has also acquired Active IQ and Skills Forward. The LLM also references a 2023 criticism and a Skills for Life campaign, neither of which is found on the site.
What to change: Add Organization schema with legal name, founding date, and description to align LLM knowledge with site content.

No AI-bot-specific directives in robots.txt (Low)
The robots.txt does not contain any directives for AI crawlers such as GPTBot, ClaudeBot, or Google-Extended, leaving them unrestricted but also unguided.
What to change: Consider adding explicit allow/disallow rules for AI crawlers so the access policy is stated deliberately. Because robots.txt controls access only, directing crawlers to important content is a job for the sitemap and llms.txt.

MFA FAQ page lacks FAQPage schema (Medium)
The multi-factor authentication FAQ page has a clear Q&A structure but no FAQPage schema, missing an opportunity for rich results.
What to change: Add FAQPage schema to all FAQ pages.

Web searches for NCFE return zero results (Medium)
Multiple web searches for NCFE-related queries returned zero results, which may indicate poor external visibility or search engine indexing issues.
What to change: Investigate search engine indexing and improve SEO fundamentals.
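
For the missing llms.txt, a small generator can keep the file in step with the site's key pages. The sketch below follows the layout of the llms.txt proposal as commonly described: a title line, a short blockquote summary, then sections of Markdown links. `build_llms_txt` is a hypothetical helper, the `www.ncfe.org.uk` host is an assumption, and the summary and notes are illustrative wording. The three paths are the ones named in this report.

```python
def build_llms_txt(title, summary, sections):
    """Render llms.txt text from {heading: [(label, url, note), ...]}."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for label, url, note in links:
            entry = f"- [{label}]({url})"
            lines.append(f"{entry}: {note}" if note else entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


BASE = "https://www.ncfe.org.uk"  # assumed canonical host

llms_txt = build_llms_txt(
    title="NCFE",
    summary="Educational charity and UK awarding organisation.",  # illustrative
    sections={
        "Key pages": [
            ("About NCFE", f"{BASE}/about-ncfe/", "History and acquisitions"),
            ("Qualification search", f"{BASE}/qualification-search/", ""),
            ("FAQs", f"{BASE}/customer-and-learner-support/faqs/", ""),
        ],
    },
)
print(llms_txt)
```

The output would be served as plain text at /llms.txt. Linking the FAQ hub there is only useful once its noindex directive has been reviewed.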

What's working

All major AI crawlers receive full content with 200 OK: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and the other crawlers tested receive the full homepage HTML, with no blocking or JS-gating.
Robots.txt is minimal and allows all AI crawlers: the file only disallows /umbraco/ and points to a sitemap, with no AI-bot-specific blocks.
Sitemap contains 985 URLs and is well-formed: the sitemap at /sitemapxml/ lists 985 URLs, providing good coverage for crawlers.
Site uses Cloudflare for performance and security: Cloudflare hosting provides CDN, DDoS protection, and performance benefits.
Footer includes links to major social platforms: LinkedIn, Twitter/X, YouTube, Facebook, and Instagram links are present in the footer, aiding external signals.
About page provides detailed history and acquisitions: the about page at /about-ncfe/ contains 1,231 words detailing the organisation's history, acquisitions, and charitable status.
