basis.com — AI Site Grade

Basis.com returns 403 Forbidden to standard browser requests while serving rich, rendered HTML to most AI crawlers, an inverted access model in which only machines can read the site. It also suffers from stale sitemap data, a placeholder schema description, and inconsistent access across AI bots.

Findings: 11
Evidence checks: 44
Completed: 30 May 2026

Analysis

Basis.com: A JS-Rendered Site That Blocks All Human Visitors While Welcoming AI Crawlers

Basis.com serves a rendered, content-rich homepage to AI crawlers: Google-Extended, ClaudeBot, PerplexityBot, OAI-SearchBot, and Applebot-Extended all receive 200 with ~124KB of HTML. Every standard browser request, including the Wayback Machine's own fetch, receives 403 Forbidden. In effect this is a human-blocking strategy rather than a bot-blocking one, and it creates an inversion where only machines can read the site.
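
The per-crawler responses described in this analysis can be summarized mechanically. The following is a minimal sketch, not the grader's actual method: the `observed` mapping only restates status codes quoted in this report (nothing is fetched live), and "standard browser" is a placeholder label rather than a real user-agent string.

```python
from collections import defaultdict

# Status codes as quoted in this report; nothing here is fetched live.
# "standard browser" is a placeholder label, not a real user-agent string.
observed = {
    "Google-Extended": 200,
    "ClaudeBot": 200,
    "PerplexityBot": 200,
    "OAI-SearchBot": 200,
    "Applebot-Extended": 200,
    "GPTBot": 429,
    "Bytespider": 403,
    "standard browser": 403,
}


def classify(status: int) -> str:
    """Map an HTTP status code to a coarse access outcome."""
    if 200 <= status < 300:
        return "served"
    if status == 429:
        return "rate-limited"
    if status in (401, 403):
        return "blocked"
    return "other"


def group_by_outcome(statuses: dict[str, int]) -> dict[str, list[str]]:
    """Group user-agent labels by outcome, preserving input order."""
    groups: dict[str, list[str]] = defaultdict(list)
    for agent, status in statuses.items():
        groups[classify(status)].append(agent)
    return dict(groups)


groups = group_by_outcome(observed)
for outcome, agents in groups.items():
    print(f"{outcome}: {', '.join(agents)}")
```

Grouping this way puts the browser label and Bytespider in the same "blocked" bucket and leaves GPTBot alone in "rate-limited", which is the inconsistency the findings describe.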

Crawler Access

The robots.txt is permissive (User-agent: * with an empty Disallow:), contains no AI-specific rules, sets Crawl-delay: 10, and references the sitemap. The llms.txt path returns 403 rather than 404, so it is actively blocked, not merely absent. The sitemap index is accessible to AI bots and reveals thousands of URLs across the post-sitemap, page-sitemap, news-sitemap, and landing_page-sitemap files. Access is inconsistent across AI crawlers, however: GPTBot receives a 429 (rate-limited) and Bytespider is blocked with a 403. The site runs on WordPress (WP Engine) behind Cloudflare, with a JS-heavy frontend that requires rendering. The raw HTML from bot fetches is truncated at the GTM script tag, which suggests that the visible text content is loaded client-side.
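
To illustrate what those robots.txt directives mean for a crawler, here is a small sketch using Python's standard `urllib.robotparser`. The file content is reconstructed from the description above, and the `example.com` URLs are hypothetical placeholders, since the report does not quote the real sitemap URL.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the directives described in this report. The Sitemap
# URL is a hypothetical placeholder, not the site's real sitemap location.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Crawl-delay: 10
Sitemap: https://example.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# An empty Disallow means every path is allowed for every user agent.
for agent in ("GPTBot", "ClaudeBot", "Bytespider"):
    allowed = parser.can_fetch(agent, "https://example.com/")
    delay = parser.crawl_delay(agent)
    print(f"{agent}: allowed={allowed}, crawl_delay={delay}")

print("sitemaps:", parser.site_maps())
```

Under these rules GPTBot and Bytespider are permitted at the directive level, so the 429 and 403 they receive are not explained by robots.txt and presumably come from another layer, such as server or CDN rules.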

Content & Schema

The homepage (via Wayback snapshot) identifies Basis as "The Intelligent Operating System for Autonomous Advertising", a unified platform for programmatic, search, social, and CTV media buying. The company was formerly Centro and rebranded to Basis Technologies in 2020. JSON-LD schema is present with Organization, WebSite, WebPage, and BreadcrumbList types, and it correctly names the entity as "Basis Technologies" with a Chicago address, phone number, and social links. The page contains strong answer-format signals: a comparison table of Forrester TEI metrics (35% productivity increase, 43% time reduction), customer testimonials, and a channel-by-channel breakdown. However, the schema description field contains a generic placeholder ("serves industries including healthcare, retail, and finance") that does not match the site's actual advertising-technology positioning.
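
A mismatch like the placeholder description can be caught with a simple automated check. The sketch below assumes a hypothetical page fragment (`SAMPLE_HTML`) built around the description quoted above, and a hypothetical list of expected positioning terms. It extracts JSON-LD blocks with the standard library and flags descriptions that mention none of those terms.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


# Hypothetical fragment for illustration; only the name and description
# strings are taken from the report.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Basis Technologies",
 "description": "serves industries including healthcare, retail, and finance"}
</script>
</head><body></body></html>
"""

# Assumed positioning terms; adjust to the vocabulary the site actually uses.
EXPECTED_TERMS = ("advertising", "programmatic", "media buying")


def off_topic_descriptions(html: str, terms=EXPECTED_TERMS) -> list[tuple]:
    """Return (type, description) pairs whose description mentions no term."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    flagged = []
    for block in extractor.blocks:
        # Some generators nest nodes under "@graph"; otherwise use the block.
        for node in block.get("@graph", [block]):
            description = node.get("description")
            if description and not any(t in description.lower() for t in terms):
                flagged.append((node.get("@type"), description))
    return flagged


flagged = off_topic_descriptions(SAMPLE_HTML)
print(flagged)
```

The keyword test is deliberately crude. It is meant as a regression guard against template defaults slipping into production markup, not as a measure of description quality.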

Cold-Knowledge Gap

Without consulting the site, the LLM correctly identifies Basis Technologies as a programmatic advertising platform (formerly Centro) serving agencies and brands, and it knows about the 2020 rebrand and the Forrester Wave recognition. This aligns well with the site's actual content, so no major gap exists. The prior knowledge is accurate and current, likely because the site's content is readily available to AI crawlers despite being invisible to humans.

External Signals

The DNS TXT records reveal anthropic-domain-verification, adobe-idp-site-verification, and multiple google-site-verification tokens, indicating active engagement with AI and enterprise platforms. The site has 1,589 Wayback captures dating to 1997, reflecting a long web history. The sitemap reveals stale content: many blog posts and news items have lastmod dates of 1970-01-01 (Unix epoch default), and the news section contains articles from 2010-2013 referencing the "Centro" brand, creating a fragmented historical record for AI crawlers.
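
The epoch-default lastmod values are easy to detect programmatically. This is a minimal sketch, assuming a standard sitemaps.org `urlset` document; the sample XML and its `example.com` URLs are hypothetical and stand in for the real sub-sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample; the URLs and the recent date are illustrative only.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/news/old-item</loc>
    <lastmod>1970-01-01T00:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://example.com/blog/recent-post</loc>
    <lastmod>2025-11-03T14:20:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>
"""


def epoch_lastmod_urls(xml_text: str) -> list[str]:
    """Return <loc> values whose <lastmod> falls on the Unix epoch date."""
    root = ET.fromstring(xml_text)
    flagged = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS)
        if lastmod.strip().startswith("1970-01-01"):
            flagged.append(loc.strip())
    return flagged


stale = epoch_lastmod_urls(SAMPLE_SITEMAP)
print(stale)
```

Entries without a lastmod element are left alone, since the sitemap protocol treats the field as optional. Only the 1970-01-01 values actively misstate freshness.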

Findings

  1. All human visitors receive 403 Forbidden (severity: High)

    Basis.com returns a 403 Forbidden to every standard browser request, including the Wayback Machine, while serving full HTML to AI crawlers. This blocks human access entirely.

    What to change: Remove the blanket 403 block for human user-agents, or implement a proper bot detection mechanism that does not block all browsers.

  2. llms.txt returns 403 instead of 404 (severity: Medium)

    The llms.txt file at /llms.txt returns a 403 Forbidden, not a 404, indicating active blocking of this AI-friendly resource.

    What to change: Serve a valid llms.txt file with a summary of the site's content for AI crawlers, or return a 404 if not implemented.

  3. GPTBot receives 429 rate limit on homepage (severity: Medium)

    GPTBot is rate-limited (429) when accessing the homepage and sitemap, while other AI bots like Google-Extended get 200. This creates inconsistent AI crawler access.

    What to change: Remove rate limiting for GPTBot or ensure consistent access policies across all AI crawlers.

  4. Bytespider blocked with 403 (severity: Medium)

    Bytespider receives a 403 Forbidden, blocking this AI crawler entirely.

    What to change: Allow Bytespider access if the site wants to be indexed by that crawler, or disallow it explicitly in robots.txt so that the policy is declared rather than enforced only by a 403.

  5. Content loaded client-side via JavaScript (severity: High)

    The raw HTML from bot fetches is truncated at the GTM script tag, meaning the visible text content is loaded client-side. AI crawlers that do not execute JavaScript may see incomplete content.

    What to change: Implement server-side rendering (SSR) or pre-rendering to ensure all content is available in the initial HTML response.

  6. Sitemap entries with Unix epoch default lastmod dates (severity: Medium)

    Many sitemap entries have lastmod dates of 1970-01-01, indicating missing or incorrect metadata. This can confuse AI crawlers about content freshness.

    What to change: Update sitemap entries with accurate lastmod dates, or remove the field if not maintained.

  7. News section contains articles from 2010-2013 under old brand name (severity: Low)

    The news sitemap includes articles from 2010-2013 referencing the 'Centro' brand, creating a fragmented historical record for AI crawlers.

    What to change: Remove or update outdated news articles, or redirect old Centro URLs to current Basis content.

  8. Schema description contains generic placeholder text (severity: Medium)

    The JSON-LD schema description field contains a generic placeholder ('serves industries including healthcare, retail, and finance') that does not match the actual advertising-technology positioning.

    What to change: Update the schema description to accurately reflect Basis Technologies' advertising automation platform.

  9. Sitemap index returns 403 to human browsers (severity: Medium)

    The sitemap index and sub-sitemaps return 403 Forbidden to standard browsers, though they are accessible to Google-Extended. This prevents human discovery of the site structure.

    What to change: Allow public access to sitemap files so that humans and all crawlers can discover the site structure.

  10. No llms.txt file available (severity: Low)

    The site does not serve a valid llms.txt file, missing an opportunity to provide a structured summary for AI crawlers.

    What to change: Create and serve an llms.txt file with a concise summary of the site's content and key pages.

  11. No indexed pages found in web search results (severity: High)

    Multiple web searches for 'site:basis.com' and related queries returned zero results, indicating poor search engine indexing despite AI crawler access.

    What to change: Ensure the site is accessible to search engine crawlers and that content is indexable; consider submitting sitemaps to search consoles.
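
The client-side rendering finding can be spot-checked without a browser by measuring how much visible text the initial HTML response contains. The sketch below is one possible approach under stated assumptions: both sample documents are hypothetical, and treating script, style, noscript, template, and title content as non-visible is a simplification.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that appears outside non-rendered elements."""

    # Elements whose text is not part of the visible page body.
    SKIP = {"script", "style", "noscript", "template", "title"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical responses: a JavaScript shell and a server-rendered page.
SHELL = (
    "<html><head><title>Home</title><script>window.dataLayer=[];</script>"
    '</head><body><div id="app"></div><script src="/bundle.js"></script>'
    "</body></html>"
)
RENDERED = (
    "<html><head><title>Home</title></head><body><h1>Example headline</h1>"
    "<p>Body copy present in the initial response.</p></body></html>"
)

for name, html in (("shell", SHELL), ("rendered", RENDERED)):
    text = visible_text(html)
    print(f"{name}: {len(text)} visible characters")
```

A response that is large in bytes but yields little or no visible text is a sign that crawlers which do not execute JavaScript will see an empty page, which is the case server-side rendering or pre-rendering addresses.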

What's working

  • Permissive robots.txt with no AI-specific blocks — The robots.txt allows all user-agents with no disallowed paths, and includes a sitemap reference, ensuring AI crawlers are not blocked at the directive level.
  • Full HTML content served to major AI crawlers — Google-Extended, ClaudeBot, PerplexityBot, OAI-SearchBot, and Applebot-Extended all receive a 200 response with ~124KB of HTML, providing rich content for AI indexing.
  • JSON-LD schema with Organization, WebSite, WebPage, and BreadcrumbList — The homepage includes structured data with correct entity name, address, phone, and social links, helping AI crawlers understand the organization.
  • Comparison table and customer testimonials provide answer-format content — The page contains a Forrester TEI comparison table with specific metrics (35% productivity increase, 43% time reduction) and customer testimonials, which are strong signals for AI-generated answers.
  • LLM prior knowledge accurately reflects the site's positioning — The LLM correctly identifies Basis Technologies as a programmatic advertising platform (formerly Centro) with accurate details about the 2020 rebrand and Forrester Wave recognition, indicating good alignment between site content and AI knowledge.
  • Multiple domain verification tokens for AI and enterprise platforms — DNS TXT records include anthropic-domain-verification, adobe-idp-site-verification, and google-site-verification tokens, indicating active engagement with AI and enterprise platforms.
  • 1,589 Wayback captures dating to 1997 — The site has a long web history with extensive Wayback Machine captures, providing a rich historical record that can aid AI understanding of the brand's evolution.
  • Sitemap index and sub-sitemaps accessible to Google-Extended — The sitemap index and sub-sitemaps (page, post, news) are accessible to Google-Extended, revealing thousands of URLs for indexing.
