AI Site Grade

milkandtweed.com — AI Site Grade

Milk & Tweed is invisible to AI: zero cold knowledge, no external signals, and a 403 block on ClaudeBot.

The site claims award-winning status and 70+ clients, but AI models have zero prior knowledge, external search returns no results, and ClaudeBot is blocked.

Findings: 10
Evidence checks: 31
Completed: 30 May 2026

Analysis

Milk & Tweed: The AI crawler sees a site that the open web barely acknowledges

The most striking finding is a total cold-knowledge vacuum. A frontier LLM queried about "Milk & Tweed agency" returns no verifiable information, which suggests the brand is absent from, or barely represented in, the model's training data. This is despite the site claiming "award winning" status, a 20-person team, an acquisition, and a portfolio of 70+ clients.

Crawler Access

All major AI crawlers tested reach the homepage with a 200 status and an identical 438 KB payload, except ClaudeBot, which receives a 403 and is the only bot singled out. The robots.txt contains no AI-specific directives; it is a bare Yoast-generated file disallowing only a single plugin JSON path. Because robots.txt cannot produce an HTTP 403, the block must be applied at the server, caching, or firewall level. No llms.txt exists (404). The site runs on nginx behind a caching layer (DigitalOcean droplet at 178.62.15.231, UK-based host UKHost4U). The 438 KB page size is large but consistent across every user-agent that receives a 200, indicating no JS-shell gating for crawlers. However, the Huxley acquisition blog post returns only 3 words of visible text from a plain GET, suggesting Elementor-rendered content that may be thin for AI parsers despite the full HTML payload.
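
The access and visible-text checks described above can be approximated with a short script. This is a minimal sketch, not the tooling behind this report: the user-agent strings are simplified placeholders (real crawlers send longer strings), and `fetch`, `visible_word_count`, and `find_outliers` are hypothetical helper names introduced for illustration.

```python
import urllib.error
import urllib.request
from collections import Counter
from html.parser import HTMLParser

# Simplified placeholder user-agent strings (assumption): substitute the
# strings each vendor documents before relying on the results.
USER_AGENTS = {
    "GPTBot": "GPTBot/1.0",
    "ClaudeBot": "ClaudeBot/1.0",
    "PerplexityBot": "PerplexityBot/1.0",
}


class _TextExtractor(HTMLParser):
    """Collects words that appear outside script, style, and noscript elements."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words.extend(data.split())


def visible_word_count(html):
    """Count whitespace-separated words in the text a plain GET exposes."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return len(parser.words)


def fetch(url, user_agent, timeout=10):
    """Return (status, body_bytes) for a plain GET sent with the given user-agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses still carry a status and a body worth recording.
        return err.code, err.read()


def find_outliers(results):
    """Given {bot: (status, size)}, return the bots that differ from the most common result."""
    common, _ = Counter(results.values()).most_common(1)[0]
    return sorted(bot for bot, result in results.items() if result != common)
```

Calling `fetch` once per entry in `USER_AGENTS`, storing `(status, len(body))` for each, and passing the mapping to `find_outliers` would surface any bot that is served differently. Running `visible_word_count` on a decoded body gives a rough measure of how much text a non-rendering parser can see.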

Cold-Knowledge Gap

The LLM knows nothing about Milk & Tweed: not its location (Chippenham, Wiltshire), its services (graphic design, web development, digital marketing), its notable clients (CoppaFeel!, Goughs Solicitors, Fonix, Carl Todd), its 2023 acquisition of Huxley Digital, or its tree-planting initiative. The site describes itself as formed from a merger of Boson Web (est. 2003) and Milk & Tweed, yet this origin story is entirely absent from the model's prior knowledge. The gap between the brand's self-presentation (award-winning, full-service, 20+ staff) and what AI engines can recall unprompted is total.

Schema Posture

The site uses Yoast-generated JSON-LD with WebSite, WebPage, BreadcrumbList, and ImageObject types, which is standard but minimal. The marketing page includes FAQPage schema with structured Q&A about digital marketing services. The design and web service pages also contain FAQ sections, but these are rendered as visible HTML only, without corresponding FAQPage markup. No Organization schema with logo, social profiles, or founding date is present, and no LocalBusiness or Service schema exists despite the agency having a physical studio address in Chippenham. The about page has a dateModified of "2026-05-08", which the audit flags as a future date and a possible data-entry error. That value falls before this report's completion date of 30 May 2026, so it should be verified against the page's actual edit history.
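
The missing markup could take roughly the following shape. This is a sketch under stated assumptions rather than the site's actual data: the name, locality, and region come from this report, while the URL form, the "GB" country code, and the placeholder question and answer are assumptions, and `local_business_jsonld`, `faq_page_jsonld`, and `as_script_tag` are hypothetical helper names. Logo, social profiles, and founding date are left out because the report does not state them.

```python
import json


def local_business_jsonld(name, url, locality, region, country, logo=None, same_as=None):
    """Build a schema.org LocalBusiness object (LocalBusiness is a subtype of Organization)."""
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressRegion": region,
            "addressCountry": country,
        },
    }
    # Optional properties are added only when real values are supplied.
    if logo:
        data["logo"] = logo
    if same_as:
        data["sameAs"] = list(same_as)
    return data


def faq_page_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Serialize a JSON-LD object into the script element that pages embed."""
    return '<script type="application/ld+json">' + json.dumps(data, indent=2) + "</script>"


# URL form and country code are assumptions; replace with verified values.
business = local_business_jsonld(
    name="Milk & Tweed",
    url="https://milkandtweed.com/",
    locality="Chippenham",
    region="Wiltshire",
    country="GB",
)

# Placeholder text: the real pairs should mirror the FAQ copy visible on each page.
faq = faq_page_jsonld([("Placeholder question?", "Placeholder answer.")])
```

The FAQ pairs passed to `faq_page_jsonld` should match the questions and answers that are visible on the page, since markup that diverges from visible content is generally treated as unreliable.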

External Signals

External search returns zero indexed results for the brand across multiple queries: no Trustpilot page (despite the site prominently displaying "Our clients rate us Excellent on Trustpilot" with five embedded testimonials), no press coverage, no directory listings, and no LinkedIn company page. The newsroom lists six press releases from early-to-mid 2022; the most recent, from June 2022, is nearly four years older than this report. The blog is actively maintained (recent posts on TikTok SEO, zero-click searches, LinkedIn marketing), but none of this content surfaces in external search. The site's sitemap uses http:// URLs while the canonical homepage uses https://, a mixed-protocol inconsistency that may split canonical signals between the two protocols.
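
The sitemap protocol mismatch is straightforward to detect programmatically. The sketch below assumes a standard sitemaps.org XML file; `insecure_locs` is a hypothetical helper name, and the sample document uses example.com rather than the site's real URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def insecure_locs(sitemap_xml):
    """Return every <loc> value in a sitemap or sitemap index that does not start with https://."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.iterfind(".//sm:loc", SITEMAP_NS) if el.text]
    return [loc for loc in locs if not loc.startswith("https://")]


# Hypothetical sample for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>https://example.com/about/</loc></url>
</urlset>"""

flagged = insecure_locs(SAMPLE)
```

An empty result from `insecure_locs` on the live sitemap would indicate that the protocol has been made consistent.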

Findings

  1. Zero AI cold knowledge of Milk & Tweed (High)

    A frontier LLM queried about Milk & Tweed returns no verifiable information. The brand appears to be absent from, or barely represented in, training data despite claims of awards, a 20-person team, an acquisition, and 70+ clients.

    What to change: Build a strong external backlink profile, publish authoritative content, and claim listings on trusted directories to increase the brand's footprint in training corpora.

  2. ClaudeBot receives 403 Forbidden (High)

    ClaudeBot is the only major AI crawler blocked from the homepage, receiving a 403 response while all other bots get 200. This prevents ClaudeBot from fetching the site's content.

    What to change: Find and remove the rule that returns 403 to ClaudeBot. Since robots.txt contains no such directive, check the nginx configuration, the caching layer, and any firewall rules.

  3. Zero external search results for the brand (High)

    Multiple web searches for the brand name, domain, and related terms return zero indexed results. No Trustpilot page, press coverage, directory listings, or LinkedIn company page appear in search results.

    What to change: Improve SEO fundamentals, build backlinks, claim business listings, and ensure the site is indexed by Google.

  4. No Organization or LocalBusiness schema (Medium)

    The site lacks Organization, LocalBusiness, and Service schema markup despite having a physical studio address and offering multiple services. This limits AI understanding of the business.

    What to change: Add Organization schema with logo, social profiles, founding date, and address. Add LocalBusiness schema for the Chippenham studio.

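  5. About page dateModified flagged as a future date (Medium)

    The about page's JSON-LD includes a dateModified of 2026-05-08. The audit flagged this as a future date and a data-entry error, but it falls about three weeks before this report's completion date (30 May 2026), so it may simply reflect a recent edit.

    What to change: Verify the dateModified against the page's actual last modification date and correct it if the two differ.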

  6. Acquisition blog post returns only 3 words of visible text (Medium)

    The blog post about acquiring Huxley Digital returns only 3 words of visible text from a plain GET, suggesting Elementor-rendered content that may be thin for AI parsers despite a large HTML payload.

    What to change: Ensure that key content is rendered as visible text, not hidden behind JavaScript or lazy-loaded elements.

  7. Service pages lack FAQPage schema for visible FAQs (Medium)

    The design and web service pages contain FAQ sections rendered as HTML but without corresponding FAQPage markup, so their Q&A content is not exposed in machine-readable form.

    What to change: Add FAQPage schema to all pages with visible FAQ content.

  8. Newsroom content is nearly four years old (Low)

    The newsroom lists six press releases from early-to-mid 2022, with the most recent being June 2022. The absence of newer releases signals inactivity.

    What to change: Publish recent news or remove the newsroom section to avoid appearing outdated.

  9. Sitemap uses http:// URLs while site uses https:// (Low)

    The sitemap lists URLs with http:// protocol, but the canonical homepage uses https://. This mixed-protocol inconsistency may dilute crawl equity.

    What to change: Update the sitemap to use https:// URLs consistently.

  10. No llms.txt file published (Low)

    The site does not provide an llms.txt file, missing an opportunity to guide AI crawlers to key content.

    What to change: Create an llms.txt file listing important pages for AI crawlers.
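
For the llms.txt recommendation in finding 10, a file could be generated along these lines. This is a sketch that assumes the layout described in the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of Markdown links). `render_llms_txt` is a hypothetical helper name, and the sample uses placeholder titles and example.com URLs rather than the site's real pages.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt body.

    sections maps a heading to a list of (link_title, url, note) tuples;
    an empty note omits the trailing description.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link_title, url, note in links:
            entry = f"- [{link_title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
sample = render_llms_txt(
    "Example Agency",
    "Placeholder summary.",
    {
        "Services": [
            ("Web development", "https://example.com/web/", "Placeholder note"),
            ("Design", "https://example.com/design/", ""),
        ]
    },
)
```

The rendered text would be served from the site root as /llms.txt, with the sections limited to the pages the agency most wants AI systems to read.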

What's working

  • All major AI crawlers except ClaudeBot are allowed: The site returns a 200 status to GPTBot, Google-Extended, and other major AI crawlers, so they can fetch the homepage.
  • FAQPage schema present on marketing page: The marketing page includes structured FAQPage schema with Q&A about digital marketing services, making that content machine-readable.
  • Consistent HTML payload across all bots: All allowed bots receive the same 438 KB HTML payload, indicating no JS-shell gating or cloaking for crawlers.
  • Blog is actively maintained with recent posts: The blog contains recent posts on topics like TikTok SEO and zero-click searches, showing ongoing content creation.
  • Tree-planting initiative page exists: The site has a dedicated page for a tree-planting initiative, which can serve as a positive brand signal.

