AI Site Grade
mills-reeve.com — AI Site Grade
AI crawlers get full access to Mills & Reeve's site but find no structured guidance: there is no llms.txt, no LegalService schema, and no sitemap index over a flat 8,900-URL sitemap. As a result, AI models are likely to miss the firm's full sector breadth and key differentiators.
- Findings: 10
- Evidence checks: 23
- Completed: 30 May 2026
Analysis
Mills & Reeve — AI-Visibility Audit
The site's most consequential gap is a structuring and discoverability problem, not a blocking problem. Every tested AI crawler gets a 200 response with real content, yet the site has no llms.txt, no FAQ schema, and no WebSite or LegalService schema, and its 8,900+ URLs sit in a single flat sitemap with no sub-sitemap index. At that scale, AI crawlers are likely to sample sparsely and may miss the firm's deepest expertise.
Crawler Access
All 11 tested user agents (ten AI bots: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User, anthropic-ai, Bytespider, Applebot-Extended, and Perplexity-User; plus a browser baseline) return an identical 200 status, an identical 341KB payload, and the same Cloudflare server header. There is no UA-based blocking, no JS shell, and no CAPTCHA. The robots.txt is a bare 57 bytes containing only Disallow rules for /umbraco/ and /cdn-cgi/, with zero AI-bot-specific rules. This is permissive, but it also means no crawler is guided toward high-value content or away from thin pages. The llms.txt returns 404, so there is no AI-friendly content map at all.
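The parity check described above can be approximated with a short script. This is a minimal sketch, not the audit's actual tooling: `fetch_probe`, `find_divergent_agents`, and the 5% size tolerance are assumptions made for illustration, and the caller supplies the real User-Agent strings.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Tuple

Probe = Tuple[int, int]  # (HTTP status, body size in bytes)


def fetch_probe(url: str, user_agent: str) -> Probe:
    """Fetch url with the given User-Agent and return (status, body size)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions; record them as results.
        return error.code, len(error.read())


def find_divergent_agents(
    url: str,
    user_agents: Dict[str, str],
    baseline: str,
    fetch: Callable[[str, str], Probe] = fetch_probe,
    size_tolerance: float = 0.05,  # assumed threshold, not from the audit
) -> Dict[str, Probe]:
    """Return agents whose status or body size differs from the baseline."""
    base_status, base_size = fetch(url, user_agents[baseline])
    divergent: Dict[str, Probe] = {}
    for name, agent_string in user_agents.items():
        if name == baseline:
            continue
        status, size = fetch(url, agent_string)
        size_gap = abs(size - base_size) / max(base_size, 1)
        if status != base_status or size_gap > size_tolerance:
            divergent[name] = (status, size)
    return divergent
```

An empty result corresponds to the finding reported here: every agent sees the same status and roughly the same payload as the browser baseline. The `fetch` parameter is injectable so the comparison logic can be exercised without network access.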
Content & Schema Posture
The homepage carries Organization schema with name, address, phone, email, and social profiles, which covers the basics. Every subpage carries only BreadcrumbList schema. No LegalService, Attorney, FAQPage, WebSite, or Article schema exists anywhere on the site. The homepage has no FAQ section, no comparison tables, and no structured Q&A. The 8,900+ URL sitemap is a single flat XML file (2.2MB) with no sitemap index. That is within the sitemap protocol's per-file limits (50,000 URLs, 50MB), but one undifferentiated file gives crawlers no per-section grouping to work from. Content itself is rich: the publications hub lists 2,960 items with deep sector/service filtering, and the news section shows 493 items with recent deal announcements (e.g., advising on a £150M sale, University of Cambridge quantum collaboration). But none of this content is surfaced to AI crawlers via structured markup or an llms.txt.
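The schema inventory above can be reproduced by listing the JSON-LD `@type` values a page declares. The following is a minimal standard-library sketch; the class and function names are illustrative, and it ignores Microdata and RDFa, which the audit may or may not have checked.

```python
import json
from html.parser import HTMLParser
from typing import List, Set


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json">."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: List[str] = []
        self.blocks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def schema_types(html: str) -> Set[str]:
    """Return every @type found in the page's JSON-LD, including nested ones."""
    collector = JsonLdCollector()
    collector.feed(html)
    found: Set[str] = set()

    def walk(node) -> None:
        if isinstance(node, dict):
            declared = node.get("@type")
            if isinstance(declared, str):
                found.add(declared)
            elif isinstance(declared, list):
                found.update(t for t in declared if isinstance(t, str))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for block in collector.blocks:
        try:
            walk(json.loads(block))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than abort the inventory
    return found
```

Subtracting the result from a wanted set such as `{"LegalService", "FAQPage", "WebSite", "Article"}` gives the list of missing types for a page.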
Cold-Knowledge Gap
Without browsing, the model describes Mills & Reeve as a UK firm with offices in Cambridge, London, Norwich, Birmingham, Manchester, and Leeds, strong in education and healthcare, with a University of Cambridge partnership and Legal 500 / Chambers rankings. The actual site reveals a broader and more differentiated firm than the model knows: it serves 14+ sectors including food and agribusiness, energy and infrastructure, life sciences, sport/entertainment/media, and government, none of which appear in the cold knowledge. The site also prominently features Platinum Investors in People status (achieved by only 6-8% of assessed organisations), Financial Times Innovative Lawyers ranking (Top 50 in Europe), and a 90% client recommendation rate from 800 surveyed clients. None of these differentiators are in the model's prior.
External Signals
Searches for reviews, Reddit threads, and press rankings returned near-zero results, which suggests a thin third-party citation footprint on the open web. The DNS TXT records point to a broad SaaS stack (Kentico CMS on Azure, Mimecast, Freshservice, DocuSign, 1Password, Zoom, Canva, Nitro), although the /umbraco/ rule in robots.txt suggests the public site itself may run on Umbraco. DNS records cannot confirm or rule out PR or review-generation programs, and no sign of either surfaced in search. The careers page shows strong employer branding (Stonewall Diversity Champion, Disability Confident Leader, Menopause Friendly Accredited), but this employer-side strength is invisible to AI models that know only the firm's legal practice areas.
Findings
No llms.txt file for AI crawler guidance (High)
The site returns a 404 for llms.txt, providing no AI-friendly content map. This forces crawlers to discover content via sitemap and internal links, reducing the chance that deep expertise pages are indexed.
What to change: Create an llms.txt file listing key pages (sectors, services, publications) to guide AI crawlers to high-value content.
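One way to produce such a file is to generate it from a small content map. The sketch below assumes the Markdown layout used by the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of annotated links); `build_llms_txt` is an illustrative name, and the URLs in the example are placeholders rather than verified paths on the site.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str, str]  # (title, url, one-line note)


def build_llms_txt(name: str, summary: str, sections: Dict[str, List[Link]]) -> str:
    """Render an llms.txt document: H1, blockquote summary, H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder URLs: replace with the site's real section pages.
    print(
        build_llms_txt(
            "Mills & Reeve",
            "UK law firm serving 14+ sectors.",
            {
                "Sectors": [
                    ("Life sciences", "https://www.example.com/sectors/life-sciences/", "Sector overview"),
                ],
                "Publications": [
                    ("Publications hub", "https://www.example.com/publications/", "2,960 items"),
                ],
            },
        )
    )
```

Generating the file from the same source that drives site navigation keeps it from drifting out of date as sector and service pages change.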
No LegalService or Attorney schema on any page (High)
The site uses only Organization schema on the homepage and BreadcrumbList on subpages. No LegalService, Attorney, FAQPage, or Article schema exists, which limits the structured data available to AI models.
What to change: Add LegalService schema to practice area pages and FAQPage schema to any Q&A content. For lawyer profiles, Person schema (with jobTitle and worksFor) is likely the better fit, since schema.org defines Attorney as a subtype of LegalService, i.e. a business rather than an individual.
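As an illustration of the LegalService markup recommended above, the sketch below builds a JSON-LD script tag for a practice area page. The function name, the choice of `knowsAbout` to carry sector names, and the example values are assumptions; the firm's CMS may expose a different templating mechanism.

```python
import json
from typing import List


def legal_service_jsonld(
    firm_name: str, page_url: str, description: str, sectors: List[str]
) -> str:
    """Return a <script> tag holding LegalService JSON-LD for one page."""
    data = {
        "@context": "https://schema.org",
        "@type": "LegalService",
        "name": firm_name,
        "url": page_url,
        "description": description,
        # knowsAbout is an Organization property; used here for sector tags.
        "knowsAbout": sectors,
    }
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a stray "</script>" in the data cannot close the tag.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

Called with sectors such as "Food and agribusiness" and "Life sciences", this yields one block per page that names both the service and the sectors it covers.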
Single flat sitemap with 8,900+ URLs and no sitemap index (High)
The sitemap lists over 8,900 URLs in one flat file (2.2MB) with no sitemap index. At this scale, AI crawlers are likely to sample sparsely and may miss deep content.
What to change: Split the sitemap into logical sub-sitemaps (e.g., sectors, services, publications, news) and create a sitemap index file.
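The split can be sketched as grouping URLs by their first path segment and emitting one urlset per group plus an index. This is an illustrative approach under stated assumptions: the `sitemap-<section>.xml` naming and the function names are hypothetical, and a real split would follow the site's actual URL structure.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_by_section(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Group URLs by first path segment; the homepage goes under 'root'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        segments = [part for part in urlparse(url).path.split("/") if part]
        groups[segments[0] if segments else "root"].append(url)
    return dict(groups)


def build_urlset(urls: Iterable[str]) -> str:
    """Serialize one sub-sitemap."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(base_url: str, sections: Iterable[str]) -> str:
    """Serialize the sitemap index that points at each sub-sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for section in sorted(sections):
        entry = ET.SubElement(root, "sitemap")
        # Hypothetical file naming scheme: sitemap-<section>.xml
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{section}.xml"
    return ET.tostring(root, encoding="unicode")
```

Grouping by section also makes it easy to see which areas dominate the URL count, for example whether publications and news account for most of the 8,900+ entries.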
AI models unaware of 14+ sectors served (Medium)
LLM prior knowledge includes only the education and healthcare sectors, but the site serves food/agribusiness, energy/infrastructure, life sciences, sport/entertainment/media, government, and more. These are not surfaced via schema or llms.txt.
What to change: Add structured data (e.g., LegalService schema with sector tags) and include sector pages in llms.txt to improve AI knowledge.
Key differentiators missing from AI knowledge (Medium)
The site highlights Platinum Investors in People status, Financial Times Innovative Lawyers Top 50 in Europe, and a 90% client recommendation rate, but none appear in LLM prior knowledge.
What to change: Include these differentiators in llms.txt and add structured data (e.g., the award property on Organization schema) to relevant pages.
Near-zero third-party citation footprint (Medium)
Web searches for reviews, Reddit mentions, and press rankings returned almost no results. This suggests the firm has few external signals that AI models can use to validate authority.
What to change: Encourage client reviews on platforms like Google, Trustpilot, and legal directories. Publish press releases and thought leadership to generate citations.
No FAQ or Q&A content on the site (Medium)
The homepage and subpages lack FAQ sections or structured Q&A. This misses an opportunity to capture featured snippets and AI-generated answers.
What to change: Add FAQPage schema to pages with common client questions, or create a dedicated FAQ section.
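A FAQPage block has a fixed shape: a list of Question items, each with one accepted Answer. The sketch below shows that shape; the function name is illustrative, and the questions and answers must come from real page content, since the markup should mirror text that is visible on the page.

```python
import json
from typing import Iterable, Tuple


def faq_jsonld(qa_pairs: Iterable[Tuple[str, str]]) -> str:
    """Return FAQPage JSON-LD for (question, answer) pairs shown on the page."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)
```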
Robots.txt lacks AI-bot-specific directives (Low)
The robots.txt disallows only /umbraco/ and /cdn-cgi/ and has no rules for AI crawlers. This is permissive, but nothing steers crawlers away from thin pages.
What to change: Add explicit groups for the main AI bots that keep administrative and thin paths disallowed (e.g., Disallow: /umbraco/). Note that robots.txt controls access rather than priority: a rule such as Allow: /sectors-and-services/ only matters where it overrides a broader Disallow, so use llms.txt and the sitemap to highlight key sections.
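One detail matters when adding bot-specific groups: a crawler that matches its own `User-agent` group ignores the `*` group, so the existing Disallow rules must be repeated in each new group. The sketch below generates such a file and checks it with Python's standard `urllib.robotparser`; the bot list and function names are illustrative choices, not the audit's recommendation verbatim.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def build_robots_txt(ai_bots: Iterable[str], disallowed_paths: List[str]) -> str:
    """Emit a '*' group plus one group per AI bot, each repeating the
    disallowed paths so bot-specific groups do not drop them."""
    lines: List[str] = []
    for agent in ["*", *ai_bots]:
        lines.append(f"User-agent: {agent}")
        lines.extend(f"Disallow: {path}" for path in disallowed_paths)
        lines.append("")  # blank line ends the group
    return "\n".join(lines)


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Check a URL against robots.txt text for a given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

For example, building the file for GPTBot and ClaudeBot with `/umbraco/` and `/cdn-cgi/` disallowed leaves section pages fetchable for both bots while keeping the admin paths off limits.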
News and publications lack Article schema (Medium)
The news and publications pages contain rich content but no Article or NewsArticle schema, reducing their visibility in AI-generated summaries.
What to change: Add Article or NewsArticle schema to individual news and publication pages.
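For news items, the markup can stay small: a headline, a canonical URL, a publication date, and the publisher. The sketch below is a minimal example under those assumptions; the function name is illustrative, and fields such as author or image would be added where the page actually shows them.

```python
import json
from datetime import date


def news_article_jsonld(
    headline: str, url: str, published: date, publisher_name: str
) -> str:
    """Return NewsArticle JSON-LD for a single news page."""
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "url": url,
        "datePublished": published.isoformat(),  # ISO 8601 date, e.g. 2026-05-30
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    return json.dumps(data, indent=2, ensure_ascii=False)
```

Publication pages that are not news can use the same shape with `"@type": "Article"`.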
Strong employer branding not surfaced to AI (Low)
The careers page showcases Stonewall Diversity Champion, Disability Confident Leader, and Menopause Friendly Accredited status, but these are not in LLM prior knowledge and lack structured data.
What to change: Add structured data (e.g., Organization schema with awards) to the careers page and include key differentiators in llms.txt.
What's working
- All AI crawlers receive full 200 access with real content — All 11 tested user agents (ten AI bots plus a browser baseline) return an identical 200 status and full HTML payload. No UA-based blocking, JS shells, or CAPTCHAs. Nothing at the access layer stops AI models from reaching public content.
- Organization schema present on homepage with key details — The homepage includes Organization schema with name, address, phone, email, and social profiles, providing basic structured data for AI models.
- Deep publications hub with 2,960 items and sector filtering — The publications section offers extensive content with filtering by sector and service, demonstrating deep expertise across multiple practice areas.
- Active news section with recent deal announcements — The news section contains 493 items including high-value deal announcements (e.g., £150M sale, University of Cambridge quantum collaboration), providing fresh content for AI models.
- BreadcrumbList schema on all subpages — Every subpage includes BreadcrumbList schema, aiding navigation understanding for crawlers.
- Permissive robots.txt with no AI-bot blocking — The robots.txt only disallows admin paths, allowing all AI crawlers to access the entire public site.
- Strong employer branding content on careers page — The careers page highlights multiple accreditations (Stonewall, Disability Confident, Menopause Friendly) that differentiate the firm as an employer.
- Cloudflare CDN with consistent responses — The site is served through Cloudflare, and every tested user agent received the same response from it. CDN delivery should also help crawlers fetch pages quickly, though speed was not measured here.