htdhealth.com — AI Site Grade
HTD Health has OpenAI domain verification but zero AI-specific robots.txt rules, and cold LLM knowledge about the brand is substantially fabricated.
HTD Health's AI visibility is undermined by a bare robots.txt, a missing llms.txt, hallucinated cold LLM knowledge, duplicate about pages, and no structured data for its consultancy services.
- Findings: 10
- Evidence checks: 21
- Completed: 30 May 2026
Analysis
---
HTD Health has an OpenAI domain verification TXT record but zero AI-specific robots.txt rules — and the cold LLM knowledge about the brand is substantially fabricated
The site's DNS includes openai-domain-verification, anthropic-domain-verification, and apple-domain-verification TXT records, signaling deliberate AI-ecosystem enrollment. Yet the robots.txt is a bare Yoast-generated file with a single User-agent: * Disallow: rule and no mention of GPTBot, ClaudeBot, Google-Extended, or any other AI crawler. The llms.txt returns a 404. This is a site that registered with AI providers but never configured how they should crawl it.
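As an illustration of what the missing llms.txt could contain, the sketch below renders a small file in the commonly proposed llms.txt layout (an H1 title, a blockquote summary, then H2 sections of links). The `build_llms_txt` helper, the section name, and the link notes are hypothetical; the page paths and the summary wording are taken from this report, and the `https://` scheme is an assumption.

```python
def build_llms_txt(name, summary, sections):
    """Render llms.txt text: H1 title, blockquote summary, then H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        # Each link is a (title, url, note) tuple rendered as a Markdown list item.
        lines.extend(f"- [{title}]({url}): {note}" for title, url, note in links)
        lines.append("")
    return "\n".join(lines)


# Illustrative content only; the notes are placeholders, not site copy.
llms_txt = build_llms_txt(
    name="HTD Health",
    summary="Strategy-and-technology consultancy for healthcare, founded in 2016.",
    sections={
        "Key pages": [
            ("About", "https://htdhealth.com/about-us/", "company history and offices"),
            ("Case studies", "https://htdhealth.com/resources/case-studies/", "project examples"),
            ("Glossary", "https://htdhealth.com/glossary/", "healthcare-tech definitions"),
        ]
    },
)
```

The resulting string would be served as plain text at `/llms.txt`; whether any given AI provider actually reads that file is not something this report verified.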
Crawler Access
Every major AI bot tested (GPTBot, ClaudeBot, OAI-SearchBot, Google-Extended, PerplexityBot, ChatGPT-User, anthropic-ai, Applebot-Extended) receives a 200 with the full 318KB HTML payload, identical to a browser baseline. The sole exception is Bytespider (ByteDance's crawler), which gets a 403 from Cloudflare. The site sits behind Cloudflare and is hosted on WP Engine, with strong HSTS and security headers. There is no JS-rendering risk: the homepage delivers 1,200+ words of visible text on a plain GET.
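A check of this kind can be approximated with a short script. The sketch below is not the tool that produced this report: it sends plain GET requests with different User-Agent values and compares each response against a browser baseline. The short agent strings, the 5% size tolerance, and the function names are assumptions made for illustration.

```python
import urllib.error
import urllib.request

# Short tokens for illustration; real crawlers send longer User-Agent strings.
AI_AGENTS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "PerplexityBot", "Bytespider"]
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"


def fetch(url, user_agent, timeout=10.0):
    """Return (status_code, body_length) for a plain GET with the given User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, len(resp.read())
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (for example a WAF 403) are raised as exceptions.
        return err.code, len(err.read())


def classify(status, size, baseline_size, tolerance=0.05):
    """Label a bot response relative to the browser baseline size."""
    if status != 200:
        return "blocked"
    if baseline_size and abs(size - baseline_size) / baseline_size > tolerance:
        return "different-payload"
    return "same-as-browser"


def probe(url):
    """Fetch the URL as a browser, then as each AI agent, and classify each result."""
    _, baseline = fetch(url, BROWSER_UA)
    return {agent: classify(*fetch(url, agent), baseline) for agent in AI_AGENTS}
```

Calling `probe("https://htdhealth.com/")` would issue live requests, so results depend on the WAF configuration at the time of the call. A User-Agent probe only shows how the server treats a claimed identity; it does not reproduce a crawler's real IP range.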
Cold-Knowledge Gap
When queried cold, the LLM described HTD Health as having an "HTD Pulse" platform for real-time patient monitoring, an "FDA-cleared digital therapeutics" portfolio, a "Health Tech Studio" model, and a founding story involving "former clinicians and engineers from Columbia University." None of these claims appear anywhere on htdhealth.com. The site describes a strategy-and-technology consultancy that has shipped 250+ projects across care delivery, MedTech, SaaS, and payors — founded in 2016 by Zach Markin (chemical engineering background, not clinical). The LLM hallucinated a product portfolio and origin story that the site does not corroborate. This gap means AI-generated descriptions of HTD Health in zero-shot contexts are likely inaccurate.
Content & Schema Posture
The homepage uses WebPage, BreadcrumbList, and WebSite schema with SearchAction — standard Yoast SEO output. No Organization, ProfessionalService, LocalBusiness, or Service schema is present, which is a missed signal for a consultancy with five office locations (NYC, Nashville, Portland, Warsaw, Lodz). The site has a substantial glossary (~2,900 words of healthcare-tech definitions) and a blog with 50+ articles dating back to 2022, but no FAQPage, no comparison tables, and no structured data for the case studies. The /homepage-copy-test/ page is indexed and in the sitemap — a duplicate of the homepage with a different canonical URL, creating a thin-content indexed page.
External Signals
External search results for HTD Health are sparse: no Reddit threads, independent review sites, or press coverage surfaced in search. The only credibility signals are on the site itself: the client logos on the homepage (Boston Children's Hospital, 1upHealth, Athenahealth, Johnson & Johnson, Butterfly Network, Zus Health, CLEAR) and five case studies. The testimonials from named executives at Boston Children's Hospital, Pip Care, 1upHealth, 20/20 On-Site, and Fitzroy Health are the strongest third-party endorsements, but none of them can be linked to or verified from off-domain sources.
Duplicate Content & Stale Pages
Two "About Us" pages exist: /about-us/ (515 words, updated Feb 2025) and /about-us-htd/ (232 words, updated Jan 2025). Both are indexed with separate canonicals and different content — the former mentions 250+ projects and 5 offices, the latter mentions 160+ projects and 2 offices. This fragmentation dilutes the brand story AI crawlers encounter. The /vive-2023/ page (a conference landing page from 2023) is still in the sitemap. The /glossary/ page has a meta description that reads "A privacy policy for HTD" — a copy-paste error from the privacy policy template.
Findings
No AI-specific robots.txt rules despite domain verification with AI providers High
The site has openai-domain-verification, anthropic-domain-verification, and apple-domain-verification TXT records, but robots.txt only has a single User-agent: * rule with no mention of GPTBot, ClaudeBot, or other AI crawlers. llms.txt returns 404.
What to change: Add explicit allow/disallow rules for GPTBot, ClaudeBot, Google-Extended, and other AI crawlers in robots.txt. Create an llms.txt file at the root with a summary of the site and links to key pages.
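One way to draft and sanity-check such rules before deploying them is sketched below. It renders a robots.txt with an explicit group per AI crawler and then verifies it with Python's standard `urllib.robotparser`. The `build_robots_txt` and `can_crawl` helpers are hypothetical, and `/wp-admin/` is only a placeholder path, not a rule the site is known to need.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens named in this report; extend the list as needed.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "Google-Extended", "PerplexityBot"]


def build_robots_txt(agents, blocked_paths=(), sitemap_url=None):
    """Render robots.txt with one explicit group per agent plus a catch-all group."""
    lines = []
    for agent in agents:
        lines.append(f"User-agent: {agent}")
        # Disallow lines go first so first-match and longest-match parsers agree.
        lines.extend(f"Disallow: {path}" for path in blocked_paths)
        lines.append("Allow: /")
        lines.append("")
    # Catch-all group: an empty Disallow permits everything, as in the current file.
    lines += ["User-agent: *", "Disallow:", ""]
    if sitemap_url:
        lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines)


def can_crawl(robots_txt, agent, url):
    """Check a drafted robots.txt with the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots = build_robots_txt(
    AI_CRAWLERS,
    blocked_paths=["/wp-admin/"],  # placeholder path, an assumption
    sitemap_url="https://htdhealth.com/sitemap_index.xml",
)
```

Explicit groups do not change behavior for crawlers that were already allowed; their value is that the policy becomes a stated decision that can later be tightened per crawler.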
Cold LLM knowledge fabricates product portfolio and origin story High
When queried cold, an LLM described HTD Health as having an 'HTD Pulse' platform for real-time patient monitoring, FDA-cleared digital therapeutics, a 'Health Tech Studio' model, and a founding by former clinicians and engineers from Columbia University. None of these claims appear on the site. The actual site describes a consultancy founded in 2016 by Zach Markin with a chemical engineering background.
What to change: Publish a clear, detailed About page and case studies that explicitly state the company's history, services, and product offerings to ground AI knowledge in accurate content.
No Organization or Service schema for a multi-office consultancy Medium
The homepage uses WebPage, BreadcrumbList, and WebSite schema but lacks Organization, ProfessionalService, LocalBusiness, or Service schema. The site lists five office locations (NYC, Nashville, Portland, Warsaw, Lodz) and offers strategy, design, and engineering services, but these are not marked up.
What to change: Add Organization schema with office locations, and Service schema for each service line (strategy, design, engineering).
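A minimal sketch of the markup, built as a Python dictionary and serialized to JSON-LD, might look like the following. The company facts (name, founding year, founder, office cities, service lines) come from this report; the `https://` URL is an assumption, street addresses are unknown and deliberately left out, and the `organization_jsonld` helper is hypothetical.

```python
import json

OFFICE_CITIES = ["New York", "Nashville", "Portland", "Warsaw", "Lodz"]
SERVICE_LINES = ["Strategy", "Design", "Engineering"]


def organization_jsonld(name, url, founding_year, founder, cities, services):
    """Build a schema.org Organization object with offices and offered services."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "foundingDate": founding_year,
        "founder": {"@type": "Person", "name": founder},
        # One PostalAddress per office; only the city is known from the report.
        "address": [{"@type": "PostalAddress", "addressLocality": c} for c in cities],
        # Each service line is expressed as an Offer wrapping a Service.
        "makesOffer": [
            {"@type": "Offer", "itemOffered": {"@type": "Service", "name": s}}
            for s in services
        ],
    }


data = organization_jsonld(
    "HTD Health", "https://htdhealth.com/", "2016", "Zach Markin",
    OFFICE_CITIES, SERVICE_LINES,
)
# Wrap the JSON in the script tag that would be placed in the page head.
script_tag = '<script type="application/ld+json">' + json.dumps(data, indent=2) + "</script>"
```

Because the site's existing schema is Yoast output, the same fields could probably be set through the plugin's own organization settings rather than hand-written markup; that route was not examined here.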
Two indexed About Us pages with conflicting content Medium
Both /about-us/ (515 words, 250+ projects, 5 offices) and /about-us-htd/ (232 words, 160+ projects, 2 offices) are indexed with separate canonicals. This dilutes the brand story and gives crawlers two conflicting sets of company facts.
What to change: Consolidate into a single About page with accurate, consistent information. Redirect /about-us-htd/ to /about-us/.
Stale conference landing page from 2023 still in sitemap Low
The /vive-2023/ page, a conference landing page from 2023, remains in the sitemap and is indexed, providing outdated content.
What to change: Remove the page from the sitemap and either update it for the current year or redirect it to a relevant current page.
Duplicate homepage copy test page indexed and in sitemap Medium
The /homepage-copy-test/ page is a duplicate of the homepage with a different canonical URL, creating a thin-content indexed page.
What to change: Remove the page from the sitemap and add a noindex tag, or redirect it to the canonical homepage.
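Leftover test pages and year-stamped event pages like the two above can be caught with a periodic sitemap scan. The sketch below parses a `urlset` sitemap and flags suspicious paths. The regular expression is a heuristic assumption, and the sample XML is a hand-written stand-in rather than the site's real sitemap (whose index at /sitemap_index.xml would first need to be expanded into its child sitemaps).

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Heuristic: slug segments named copy/test/draft, or a slug ending in a 20xx year.
SUSPECT = re.compile(r"(?:^|[-/])(?:copy|test|draft)(?:[-/]|$)|-20\d{2}/?$")


def suspect_urls(sitemap_xml):
    """Return sitemap URLs whose path looks like a test page or a dated event page."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", SITEMAP_NS) if el.text]
    return [u for u in locs if SUSPECT.search(urlparse(u).path)]


# Stand-in sitemap for illustration; not fetched from the site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://htdhealth.com/about-us/</loc></url>
  <url><loc>https://htdhealth.com/homepage-copy-test/</loc></url>
  <url><loc>https://htdhealth.com/vive-2023/</loc></url>
</urlset>"""

flagged = suspect_urls(SAMPLE)
```

Flagged URLs are candidates for review only; a dated slug can be legitimate, for example an annual report that should stay published.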
Glossary page has incorrect meta description from privacy policy Low
The /glossary/ page's meta description reads 'A privacy policy for HTD', a copy-paste error from the privacy policy template.
What to change: Update the meta description to accurately describe the glossary content.
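Copy-paste errors of this kind can be detected automatically by comparing each page's title with its meta description. The sketch below uses Python's standard `html.parser` and flags a page when no significant title word appears in the description. The word-length threshold and the sample HTML (including its title text) are assumptions for illustration.

```python
from html.parser import HTMLParser


class HeadExtractor(HTMLParser):
    """Collect the <title> text and the meta description from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def description_mismatch(html):
    """True when the description is missing or shares no long word with the title."""
    parser = HeadExtractor()
    parser.feed(html)
    if parser.description is None:
        return True
    title_words = {w.lower() for w in parser.title.split() if len(w) > 3}
    desc = parser.description.lower()
    return not any(w in desc for w in title_words)


# Hypothetical page head reproducing the reported error.
BAD = '<head><title>Glossary - HTD Health</title><meta name="description" content="A privacy policy for HTD"></head>'
GOOD = '<head><title>Glossary - HTD Health</title><meta name="description" content="Glossary of healthcare technology terms"></head>'
```

The overlap test is deliberately crude and will produce false positives on pages whose descriptions paraphrase the title, so it works best as a first-pass filter before manual review.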
Near-zero external search presence and backlinks Medium
Web searches for HTD Health, its products, and client names returned no independent results about the company. No Reddit threads, review sites, or press coverage were found. The only credibility signals are the client logos on the homepage, which are on-domain rather than external.
What to change: Develop a PR and content marketing strategy to earn backlinks and mentions from reputable healthcare and tech publications.
Bytespider crawler blocked by Cloudflare Low
Bytespider (ByteDance's crawler) receives a 403 from Cloudflare, while all other major AI bots are allowed. This may limit visibility in ByteDance's AI products.
What to change: If visibility in ByteDance ecosystems is desired, allow Bytespider in Cloudflare WAF rules.
No FAQPage or comparison schema on content pages Low
The site has a glossary and blog but no FAQPage schema or comparison tables with structured data, missing opportunities for rich results.
What to change: Add FAQPage schema to relevant blog posts and glossary terms where appropriate.
What's working
- All major AI crawlers except Bytespider are allowed and receive full HTML — GPTBot, ClaudeBot, OAI-SearchBot, Google-Extended, PerplexityBot, and others receive a 200 with the full 318KB HTML payload, identical to a browser. No JS-rendering risk.
- Strong security headers and Cloudflare protection — The site uses Cloudflare with HSTS and security headers, providing good security posture without blocking legitimate bots.
- Substantial glossary with ~2,900 words of healthcare-tech definitions — The /glossary/ page contains extensive, unique content that can serve as a knowledge resource for AI crawlers.
- Credible client logos and testimonials from named executives — The homepage features logos of Boston Children's Hospital, 1upHealth, Athenahealth, Johnson & Johnson, and others, plus testimonials from named executives at Boston Children's Hospital, Pip Care, 1upHealth, 20/20 On-Site, and Fitzroy Health.
- Five case studies available on site — The /resources/case-studies/ page lists five case studies, providing detailed project examples that can be used for AI grounding.
- Blog with 50+ articles dating back to 2022 — The site has a blog with over 50 articles, providing a steady stream of content that can be indexed by AI crawlers.
- Domain verified with OpenAI, Anthropic, and Apple — DNS TXT records show openai-domain-verification, anthropic-domain-verification, and apple-domain-verification, indicating proactive enrollment in AI ecosystems.
- Sitemap available and contains 80 URLs — The sitemap at /sitemap_index.xml returns 200 and lists 80 URLs, helping crawlers discover content.