AI Site Grade
lawhive.co.uk — AI Site Grade
Lawhive's /llms.txt returns a 579KB HTML page instead of a plaintext LLM content map, and its /inform/ pages deliver zero visible content to any crawler.
Lawhive has strong crawler access and schema on key pages, but its /llms.txt is broken, /inform/ pages are empty shells, and cold LLM knowledge lacks its SRA-regulated law firm status.
- Findings: 9
- Evidence checks: 26
- Completed: 30 May 2026
Analysis
Lawhive's /llms.txt returns a 579KB HTML page — the full Next.js app shell — instead of a plaintext LLM content map, making it the single largest AI-visibility gap on the site.
Crawler Access
All major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, Applebot-Extended, Bytespider) receive a 200 response with an identical 1MB payload from the Vercel edge. No UA-based blocking exists. The robots.txt is minimal: a single User-Agent: * rule disallowing the /intake, /api/, and /onboarding/ paths, with no AI-bot-specific directives at all. The DNS TXT records include an anthropic-domain-verification token, indicating that Lawhive has proactively verified its domain with Anthropic.
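The robots.txt posture described above can be sanity-checked offline. The sketch below uses Python's standard urllib.robotparser; the ROBOTS_TXT string is an assumed reconstruction from the description in this report, not a copy of the live file, and crawler_access is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the robots.txt described in this report;
# the live file may differ in wording or ordering.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /intake
Disallow: /api/
Disallow: /onboarding/
"""

# Crawler user agents named in the Crawler Access section.
AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "Applebot-Extended", "Bytespider",
]


def crawler_access(robots_txt: str, path: str) -> dict[str, bool]:
    """Return, per crawler, whether robots.txt permits fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, path) for ua in AI_CRAWLERS}


if __name__ == "__main__":
    for ua, allowed in crawler_access(ROBOTS_TXT, "/").items():
        print(f"{ua}: {'allowed' if allowed else 'blocked'}")
```

Because the file has no bot-specific groups, every crawler falls through to the wildcard rule, so all seven get the same answer for any given path.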
The Broken /llms.txt and Hollow Inform Pages
The /llms.txt endpoint returns a 579KB HTML document (the full Next.js app shell with CSS-in-JS and font preloads) rather than a plaintext LLM content map. This is worse than a 404: an AI crawler fetching this path gets a blob of framework boilerplate with no usable content map in it. Separately, the /inform/ pages (contentious-probate, mirror-will, etc.) and /knowledge-archive all return 200 with zero visible text. They are empty shells with noindex meta tags and no canonical URLs. These pages are listed in the sitemap but deliver no content to any visitor, human or bot.
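A check for the /llms.txt failure mode can be as small as the heuristic below. It is a sketch only: looks_like_html is a hypothetical helper, and the caller is assumed to have already fetched the response body and Content-Type header by whatever HTTP client is in use.

```python
def looks_like_html(body: str, content_type: str = "") -> bool:
    """Heuristic: True if a response expected to be plaintext is really HTML."""
    # An explicit HTML content type settles it.
    if "text/html" in content_type.lower():
        return True
    # Otherwise sniff the start of the body for an HTML prologue.
    head = body.lstrip()[:256].lower()
    return head.startswith(("<!doctype html", "<html"))


if __name__ == "__main__":
    app_shell = "<!DOCTYPE html><html><head><style>.a{}</style></head></html>"
    print(looks_like_html(app_shell, "text/html; charset=utf-8"))  # HTML app shell
    print(looks_like_html("# Lawhive\n\n> Summary", "text/plain"))  # plaintext map
```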
Cold-Knowledge Gap
The LLM knows Lawhive as a legaltech startup founded in 2021, with an AI-powered platform connecting users to vetted solicitors, backed by Google Ventures and TQ Ventures, with ~$100M raised. The site itself positions Lawhive as "Your modern law firm": an SRA-regulated law firm (not a marketplace), emphasising fixed fees, 300+ lawyers, and 30,000+ clients. The gap: the LLM describes Lawhive as a "platform" and "marketplace," while the site asserts it is a law firm that acquired Woodstock Legal Services. Both the Woodstock acquisition and the SRA-regulated status are absent from cold LLM knowledge.
Schema and Content Posture
The homepage carries Organization and WebSite schema with a London address. The legal-area pages (conveyancing, divorce) add Service and FAQPage schema with well-structured Q&A. The FAQs page itself has a rich FAQPage schema with 20+ questions but is marked noindex, nofollow, so search engines and AI crawlers that honour the directive will not surface it in results. The BreadcrumbList schema appears on sub-pages but is missing from the homepage. No Product, LocalBusiness, or Review schema is used despite prominent testimonial carousels.
External Signals
The site links to Trustpilot and claims "Recommended by 30,000+ satisfied clients." The about page cites press coverage from Fortune, TechCrunch, and Sky News regarding the $60M raise and Woodstock acquisition. DNS records show verification tokens for Google, Apple, Facebook, Stripe, and Anthropic, which indicates broad platform integration. The site itself, however, carries no structured external citation markup.
Findings
/llms.txt returns 579KB HTML app shell instead of plaintext LLM content map (High)
The /llms.txt endpoint delivers a full Next.js HTML document with CSS-in-JS and font preloads, making it unparseable by AI crawlers. This is the single largest AI-visibility gap on the site.
What to change: Replace the /llms.txt endpoint with a plaintext file listing key URLs and content summaries for LLM consumption, following the llms.txt standard.
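As an illustration of the target format, the sketch below assembles a small llms.txt in the layout the llms.txt proposal describes: an H1 title, a blockquote summary, then H2 sections of annotated links. The build_llms_txt helper is hypothetical, the two URLs are the legal-area pages named in this report, and the link notes are placeholders, not Lawhive's wording.

```python
def build_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a plaintext (Markdown) llms.txt body.

    `sections` maps a section heading to (title, url, note) link entries.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
        lines.append("")
    # Normalise to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Illustrative content only; notes are placeholders to be replaced.
LLMS_TXT = build_llms_txt(
    name="Lawhive",
    summary="Your modern law firm: an SRA-regulated law firm with fixed fees.",
    sections={
        "Legal areas": [
            ("Conveyancing", "https://lawhive.co.uk/legal-area/conveyancing",
             "placeholder summary of the conveyancing service"),
            ("Divorce", "https://lawhive.co.uk/legal-area/divorce",
             "placeholder summary of the divorce service"),
        ],
    },
)

if __name__ == "__main__":
    print(LLMS_TXT, end="")
```

Whatever generates the file, the route would presumably need to bypass the Next.js page renderer and return the body with a plaintext or Markdown content type rather than the app shell.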
/inform/ pages and /knowledge-archive return zero visible text (High)
Pages like /inform/contentious-probate, /inform/mirror-will, and /knowledge-archive return 200 with no visible content and noindex meta tags. They are listed in the sitemap but deliver nothing to crawlers.
What to change: Either populate these pages with substantive content, drop the noindex tag, and add canonical URLs, or remove them from the sitemap.
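The three symptoms named in this finding (no visible text, a noindex tag, no canonical URL) can be detected from raw HTML with the standard library alone. The following is a rough sketch: PageAudit and audit are hypothetical names, and "visible text" is approximated as any non-whitespace text outside script, style, noscript, template, and title elements.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect robots directives, the canonical URL, and a visible-text count."""

    SKIP = {"script", "style", "noscript", "template", "title"}

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False
        self.canonical = None
        self.visible_chars = 0
        self._skip_depth = 0  # >0 while inside a non-visible element

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.noindex = self.noindex or "noindex" in (a.get("content") or "").lower()
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.visible_chars += len(data.strip())


def audit(html: str) -> dict:
    """Summarise one page's indexability signals."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    return {
        "noindex": parser.noindex,
        "canonical": parser.canonical,
        "visible_chars": parser.visible_chars,
        "empty_shell": parser.visible_chars == 0,
    }
```

Running audit over every URL in the sitemap would flag any page that is both listed for indexing and marked noindex, which is the contradiction this finding describes.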
Cold LLM knowledge lacks Lawhive's SRA-regulated law firm status and Woodstock acquisition (Medium)
LLMs describe Lawhive as a 'platform' or 'marketplace', but the site asserts it is an SRA-regulated law firm that acquired Woodstock Legal Services. This gap could cause AI-generated summaries to misrepresent the business.
What to change: Add structured data (e.g., LegalService schema) and publish a press release or authoritative page detailing the SRA regulation and acquisition to improve LLM knowledge.
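One possible shape for the LegalService markup is sketched below, generated as JSON-LD from Python. The legal_service_jsonld helper is hypothetical, the description string is illustrative wording based on this report's findings, and regulator identifiers are deliberately left out because none are given here.

```python
import json


def legal_service_jsonld(name: str, url: str, description: str, locality: str) -> str:
    """Build a schema.org LegalService JSON-LD document as a string."""
    data = {
        "@context": "https://schema.org",
        "@type": "LegalService",
        "name": name,
        "url": url,
        "description": description,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressCountry": "GB",
        },
    }
    return json.dumps(data, indent=2)


# Illustrative values; the description is an assumption, not site copy.
JSONLD = legal_service_jsonld(
    name="Lawhive",
    url="https://lawhive.co.uk",
    description=(
        "SRA-regulated law firm offering fixed-fee legal services; "
        "acquired Woodstock Legal Services."
    ),
    locality="London",
)

if __name__ == "__main__":
    print(JSONLD)
```

The resulting string would normally be embedded in a script element of type application/ld+json alongside the existing Organization markup.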
FAQs page is noindex, nofollow despite rich FAQPage schema (Medium)
The /faqs page contains a well-structured FAQPage schema with 20+ questions, but its noindex, nofollow meta tags tell search engines and compliant AI crawlers not to index the page or follow its links.
What to change: Remove the noindex, nofollow directives from the FAQs page to allow search engines and AI crawlers to index and surface the content.
BreadcrumbList schema missing from homepage (Low)
BreadcrumbList schema appears on sub-pages but is absent from the homepage, reducing structured navigation context for AI crawlers.
What to change: Add BreadcrumbList schema to the homepage to provide clear navigation hierarchy.
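For reference, a BreadcrumbList is an ordered list of ListItem entries, each with a 1-based position. The sketch below builds one from a trail of (name, URL) pairs; breadcrumb_jsonld is a hypothetical helper and the single-entry "Home" trail is an assumed example for the homepage.

```python
import json


def breadcrumb_jsonld(trail: list[tuple[str, str]]) -> str:
    """Build schema.org BreadcrumbList JSON-LD from (name, url) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Assumed homepage trail: a single root entry.
    print(breadcrumb_jsonld([("Home", "https://lawhive.co.uk/")]))
```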
No Review or AggregateRating schema despite testimonial carousels (Medium)
The site prominently displays testimonials and claims 'Recommended by 30,000+ satisfied clients' but does not use Review or AggregateRating structured data.
What to change: Add AggregateRating and Review schema to the homepage and relevant pages to enable rich snippets and AI-friendly review data.
No LocalBusiness schema despite physical London address (Low)
The homepage includes Organization schema with a London address but does not use LocalBusiness schema, which could improve local AI visibility.
What to change: Add LocalBusiness schema to the homepage to enhance local search and AI understanding.
No Product schema for legal services (Low)
Legal-area pages describe services like conveyancing and divorce but lack Product schema, which could help AI crawlers understand service offerings.
What to change: Add Product schema to legal-area pages to describe services with pricing and availability.
No structured citation markup for press coverage (Low)
The about page mentions press coverage from Fortune, TechCrunch, and Sky News but carries no structured markup, such as the schema.org citation property or Article references, linking to those sources.
What to change: Mark up press mentions with the schema.org citation property or Article references so that AI crawlers get verifiable links to the original coverage.
What's working
- All major AI crawlers allowed with no UA-based blocking — GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others receive 200 responses with full content. No AI-specific disallow rules exist in robots.txt.
- Anthropic domain verification token present in DNS — DNS TXT records include an anthropic-domain-verification token, indicating proactive registration for Claude access.
- Legal-area pages have Service and FAQPage schema with well-structured Q&A — Pages like /legal-area/conveyancing and /legal-area/divorce include Service and FAQPage schema with detailed Q&A, aiding AI understanding.
- Homepage has Organization and WebSite schema with London address — The homepage includes Organization and WebSite structured data, providing basic business info and site identity to AI crawlers.
- DNS verification tokens for Google, Apple, Facebook, Stripe, and Anthropic — TXT records show verification for multiple platforms, indicating broad integration and trust signals.
- Sitemap includes /inform/ pages and /knowledge-archive — Despite empty content, these pages are listed in the sitemap, showing intent to have them indexed.