AI Site Grade

rethinkfirst.com — AI Site Grade

RethinkFirst's AI Code of Ethics page is noindexed and blocked from GPTBot by Cloudflare rate-limiting, while the site lacks AI-bot directives in robots.txt and has zero JSON-LD schema on sub-pages.

RethinkFirst's AI visibility is undermined by a noindexed AI ethics page, content fragmented across separate sub-brand domains, stale knowledge in LLMs, and missing schema on all audited sub-pages.

Findings: 7
Evidence checks: 24
Completed: 30 May 2026

Analysis


The AI Code of Ethics page — the site's most AI-relevant page — is noindex, nofollow and blocked from GPTBot by Cloudflare rate-limiting

RethinkFirst's robots.txt contains zero AI-bot directives: it does not mention GPTBot, ClaudeBot, PerplexityBot, Google-Extended, or any other AI crawler. Its only rules are a wildcard group (User-agent: *) disallowing /wp-admin/ and a long list of entries for scraper tools from the early 2000s. Access is therefore open by default, and compare_bot_access on the homepage confirms it: GPTBot, ClaudeBot, OAI-SearchBot, PerplexityBot, Google-Extended, and ChatGPT-User all receive a 200 with the same 227KB HTML response as a browser. The sole exception on the homepage is Bytespider (ByteDance), which gets a 403 from Cloudflare. The homepage result does not hold for every page: the AI Code of Ethics page returns a 429 rate-limit error to GPTBot. The site runs on Cloudflare with HSTS preload, strict security headers, and Azure DNS.
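The default-open behaviour described above can be reproduced offline with Python's standard urllib.robotparser. This is a minimal sketch, not the audit's own tooling: CURRENT_ROBOTS is a simplified stand-in for the live file (the scraper entries are omitted), EXPLICIT_ROBOTS is one hypothetical way to name the AI crawlers, and the /ai-code-of-ethics/ path is an assumed placeholder for the real URL.

```python
from urllib.robotparser import RobotFileParser

# Simplified stand-in for the live robots.txt: a wildcard group only.
CURRENT_ROBOTS = """\
User-agent: *
Disallow: /wp-admin/
"""

# Hypothetical revision that states the AI-crawler policy explicitly.
# Disallow is listed before Allow so that first-match parsers (such as
# Python's) and longest-match parsers reach the same answer.
EXPLICIT_ROBOTS = CURRENT_ROBOTS + """
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Google-Extended
Disallow: /wp-admin/
Allow: /
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def access_table(robots_text, path, bots=AI_BOTS):
    """Return {bot: allowed?} for one path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


if __name__ == "__main__":
    for label, text in [("current", CURRENT_ROBOTS), ("explicit", EXPLICIT_ROBOTS)]:
        print(label, access_table(text, "/ai-code-of-ethics/"))
```

Both versions allow the same paths, so explicit groups do not change what robots.txt permits; they only make the policy deliberate. A 429 or 403 issued at the Cloudflare layer is independent of robots.txt and has to be addressed there.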

Cold-Knowledge Gap

Asked cold, a frontier LLM describes RethinkFirst as a "behavioral health technology company" focused on "mental health and substance use treatment" with "tools for clinical decision support, patient engagement, and analytics." This is stale and partially wrong. The actual site positions itself around neurodiversity, ABA therapy, autism, and behavioral health across four verticals: employers (RethinkCare), educators (RethinkEd), providers (RethinkBH), and health plans (RethinkFutures). The cold model knows nothing about the sub-brand architecture, the 650M+ clinical data points claim, the 99% prediction precision, or the AI Code of Ethics. It also incorrectly recalls the company was "originally known as Rethink Behavioral Health" — the site says it was founded in 2007 as Rethink, focused on autism treatment training.

Schema Posture

The homepage carries two JSON-LD blocks: a WebSite and an Organization schema. The Organization schema includes sameAs links to Facebook, X, Instagram, and YouTube, plus a ContactPoint with phone number. However, no sub-page has any JSON-LD schema at all — the solutions page, about page, AI Code of Ethics page, team page, news page, and resources page all return zero schema types. There are no FAQPage, Article, Product, BreadcrumbList, or Person schemas anywhere on the site. The AI Code of Ethics page, despite being the most AI-relevant content on the domain, carries a <meta name="robots" content="noindex, nofollow, noimageindex, nosnippet"> tag — meaning Google and other search engines are explicitly told not to index or show snippets of the company's own AI ethics framework.
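A check of this kind can be approximated with the standard library alone. The sketch below is an assumption about how such an audit might work, not the grader's actual implementation: HeadAudit and audit are hypothetical names, and only top-level and @graph types are collected (nested types such as ContactPoint are ignored).

```python
import json
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collects meta-robots directives and JSON-LD @type values from one page."""

    def __init__(self):
        super().__init__()
        self.robots = []
        self.schema_types = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots += [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self._collect_types(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped rather than reported

    def _collect_types(self, node):
        if isinstance(node, list):
            for item in node:
                self._collect_types(item)
        elif isinstance(node, dict):
            declared = node.get("@type")
            if isinstance(declared, str):
                self.schema_types.append(declared)
            elif isinstance(declared, list):
                self.schema_types.extend(declared)
            self._collect_types(node.get("@graph", []))


def audit(html):
    """Summarise indexability and schema types for one HTML document."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    return {
        "noindex": "noindex" in parser.robots,
        "robots": parser.robots,
        "schema_types": parser.schema_types,
    }
```

Fed the meta tag quoted above, audit would report noindex as true with an empty schema_types list; a page carrying WebSite and Organization blocks would report both types.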

Content Fragmentation and Stale Pages

The site maintains a live /careers/ page and a /careers-old/ page: both indexed, both in the sitemap, both returning 200. The old careers page still lists "over 350 people" while the about page says "almost 500 team members." The llms.txt file is present and well-structured, pointing to the four sub-brand domains (rethinkcare.com, rethinked.com, rethinkbehavioralhealth.com, rethinkfutures.com). The resources it links to are hosted on those separate domains, which creates a fragmented content graph that AI crawlers must traverse across multiple domains to fully understand the offering. The sitemap index contains 8 sub-sitemaps with over 108 URLs, but many resource URLs point to the external sub-brand domains rather than the main domain.
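The share of off-domain URLs in a sitemap can be quantified with a short script. The following is a sketch under stated assumptions: the function names are hypothetical, the sample XML is invented for illustration (only the /careers/ and /careers-old/ paths come from this audit), and the www. prefix is assumed.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def normalise(host):
    """Treat www.example.com and example.com as the same host."""
    host = (host or "").lower()
    return host[4:] if host.startswith("www.") else host


def hosts_in_sitemap(xml_text):
    """Count <loc> entries per (normalised) hostname in one sitemap file."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text]
    return Counter(normalise(urlparse(loc).hostname) for loc in locs)


def off_domain_share(counts, main_host):
    """Fraction of sitemap URLs that live on a host other than main_host."""
    total = sum(counts.values())
    off = sum(n for host, n in counts.items() if host != normalise(main_host))
    return off / total if total else 0.0


# Invented sample data, for illustration only.
SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.rethinkfirst.com/careers/</loc></url>
  <url><loc>https://www.rethinkfirst.com/careers-old/</loc></url>
  <url><loc>https://www.rethinkcare.com/example-resource/</loc></url>
  <url><loc>https://www.rethinked.com/example-resource/</loc></url>
</urlset>
"""

if __name__ == "__main__":
    counts = hosts_in_sitemap(SAMPLE)
    print(counts, off_domain_share(counts, "rethinkfirst.com"))
```

Running the same count over each of the 8 sub-sitemaps would show which ones contribute most to the cross-domain spread.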

Findings

  1. AI Code of Ethics page is noindex, nofollow, and blocked from GPTBot (severity: High)

    The AI Code of Ethics page, the site's most AI-relevant content, carries a meta robots tag of 'noindex, nofollow, noimageindex, nosnippet' and returns a 429 rate-limit error for GPTBot. Search engines are therefore told not to index the company's own AI ethics framework, and GPTBot cannot fetch it at all.

    What to change: Remove the noindex, nofollow meta tag from the AI Code of Ethics page and whitelist GPTBot in Cloudflare to allow access.

  2. Robots.txt contains zero AI-bot directives (severity: High)

    The robots.txt file has no rules for GPTBot, ClaudeBot, PerplexityBot, Google-Extended, or any other AI crawler, leaving AI access unmanaged despite Cloudflare rate-limiting affecting GPTBot on key pages.

    What to change: Add explicit allow/disallow directives for major AI crawlers (GPTBot, ClaudeBot, Google-Extended, PerplexityBot) in robots.txt.

  3. Zero JSON-LD schema on any sub-page (severity: High)

    Only the homepage has JSON-LD schema (WebSite and Organization). All other pages — solutions, about, AI Code of Ethics, team, news, resources — lack any structured data, missing opportunities for rich results and AI understanding.

    What to change: Add appropriate JSON-LD schema types (e.g., Article, FAQPage, BreadcrumbList, Person) to all sub-pages.

  4. Cold LLM knowledge is stale and incomplete (severity: Medium)

    A frontier LLM describes RethinkFirst as a general behavioral health company, missing the neurodiversity/ABA focus, sub-brand architecture, 650M+ clinical data points, 99% prediction precision, and AI Code of Ethics. The model also incorrectly recalls the company was originally named 'Rethink Behavioral Health'.

    What to change: Improve on-page content to clearly articulate the sub-brand structure, key differentiators, and AI ethics stance; consider expanding the existing llms.txt with detailed summaries.

  5. Duplicate live and old careers pages with conflicting team size claims (severity: Medium)

    Both /careers/ and /careers-old/ return 200 and are indexed. The old page states 'over 350 people' while the about page says 'almost 500 team members', creating inconsistency.

    What to change: Remove or redirect /careers-old/ to /careers/ and ensure team size numbers are consistent across the site.

  6. Content fragmented across multiple domains (severity: Medium)

    The llms.txt file points to four separate domains (rethinkcare.com, rethinked.com, rethinkbehavioralhealth.com, rethinkfutures.com), forcing AI crawlers to traverse multiple domains to understand the full offering, which dilutes authority and complicates indexing.

    What to change: Consolidate key content onto the main domain or ensure cross-domain linking and schema to signal entity relationships.

  7. Bytespider (ByteDance) blocked by Cloudflare (severity: Low)

    Bytespider receives a 403 from Cloudflare, preventing any access to the site. While less critical than other AI bots, this blocks potential visibility in ByteDance's AI products.

    What to change: Allow Bytespider access if the site wishes to be included in ByteDance's AI training or search products.
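For the schema and cross-domain recommendations in findings 3 and 6, the JSON-LD could be generated along the following lines. This is a sketch, not a prescribed implementation: the SITE origin, the https scheme on the sub-brand domains, the breadcrumb path, and the choice of subOrganization to model the sub-brands are all assumptions to confirm against the real site.

```python
import json

SITE = "https://www.rethinkfirst.com"  # assumed canonical origin

# Sub-brand names and domains as listed in this audit; the scheme is assumed.
SUB_BRANDS = {
    "RethinkCare": "https://rethinkcare.com",
    "RethinkEd": "https://rethinked.com",
    "RethinkBH": "https://rethinkbehavioralhealth.com",
    "RethinkFutures": "https://rethinkfutures.com",
}


def organization_jsonld():
    """Organization node that links the four sub-brands to the parent entity."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": SITE + "/#organization",
        "name": "RethinkFirst",
        "url": SITE,
        "subOrganization": [
            {"@type": "Organization", "name": name, "url": url}
            for name, url in SUB_BRANDS.items()
        ],
    }


def breadcrumb_jsonld(trail):
    """BreadcrumbList for a sub-page; trail is a list of (name, path) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": SITE + path}
            for i, (name, path) in enumerate(trail, start=1)
        ],
    }


def script_tag(doc):
    """Wrap a JSON-LD document in the script element that goes in <head>."""
    return '<script type="application/ld+json">' + json.dumps(doc, indent=2) + "</script>"


if __name__ == "__main__":
    print(script_tag(organization_jsonld()))
    # The path below is a placeholder, not the page's confirmed URL.
    print(script_tag(breadcrumb_jsonld([("Home", "/"), ("AI Code of Ethics", "/ai-code-of-ethics/")])))
```

Page-specific types such as Article, FAQPage, or Person would follow the same pattern, with properties drawn from each page's actual content.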

What's working

  • llms.txt file is present and well-structured — The site hosts an llms.txt file that lists four sub-brand domains, providing a starting point for AI crawlers to discover the broader content ecosystem.
  • Homepage has WebSite and Organization JSON-LD schema — The homepage includes valid JSON-LD for WebSite and Organization, with sameAs links to social profiles and a contact point, helping search engines understand the brand.
  • Major AI bots receive 200 on homepage — GPTBot, ClaudeBot, OAI-SearchBot, PerplexityBot, Google-Extended, and ChatGPT-User all get a 200 response with full HTML content on the homepage, indicating no blanket blocking.
  • AI Code of Ethics page has substantial content — The AI Code of Ethics page contains 917 words of detailed policy content, which is valuable for AI crawlers if access and indexing issues are resolved.
  • Sitemap index with multiple sub-sitemaps — The sitemap index contains 8 sub-sitemaps with over 108 URLs, providing a comprehensive map of the site for crawlers.
