AI Site Grade
rates.ca — AI Site Grade
Rates.ca blocks all AI crawlers with Cloudflare challenges, leaving a 78KB llms.txt as the sole content bridge for AI models.
Rates.ca's Cloudflare challenge blocks every major AI crawler, its llms.txt is the only AI-accessible content, and the site suffers from brand identity fragmentation and zero structured data.
- Findings: 10
- Evidence checks: 44
- Completed: 30 May 2026
Analysis
Cloudflare Challenge Blocks Every AI Crawler — llms.txt Is the Only Content Bridge
Every major AI crawler tested (GPTBot, ClaudeBot, Google-Extended, PerplexityBot, OAI-SearchBot, ChatGPT-User, Bytespider, Applebot-Extended) receives a 403 Cloudflare challenge page at https://rates.ca and on every subpage tested. The robots.txt contains no AI-bot rules at all, only a generic User-agent: * blocklist for Drupal admin paths and static assets. The site is built on Drupal, sits behind Cloudflare, and renders entirely via JavaScript: Wayback Machine snapshots show rich heading structures (FAQ sections, comparison tables, award rankings), but the raw HTML contains fewer than 100 words of visible text. The sitemap.xml also returns 403.
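The crawler check described above can be approximated with a short script. This is a minimal sketch, not the tool used for this report: the function names, the 10-second timeout, and the abbreviated user-agent strings are assumptions (real crawlers send longer strings). Google-Extended and Applebot-Extended are robots.txt control tokens rather than separate fetchers, so the sketch leaves them out. Cloudflare may also decide on signals other than the User-Agent header, so a script run from a different network can see different results.

```python
import urllib.error
import urllib.request

# Abbreviated, illustrative user-agent tokens (assumption: real strings are longer).
AI_USER_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot",
    "OAI-SearchBot", "ChatGPT-User", "Bytespider",
]


def fetch_status(url, user_agent, timeout=10.0):
    """Return the HTTP status code received for url with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (including a 403 challenge) arrive as HTTPError.
        return err.code


def probe(url, user_agents, fetch=fetch_status):
    """Map each user agent to the status code it receives for url."""
    return {agent: fetch(url, agent) for agent in user_agents}


def blocked_agents(results):
    """Return the user agents that received a 403, sorted alphabetically."""
    return sorted(agent for agent, status in results.items() if status == 403)
```

Calling `blocked_agents(probe("https://rates.ca", AI_USER_AGENTS))` would list the agents that were refused at the time of the request; the `fetch` parameter exists so the logic can be exercised without network access.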
llms.txt as the Sole AI Content Channel
The site has a 78KB llms.txt at https://rates.ca/llms.txt that returns 200 and contains an exhaustive directory of 200+ URLs covering car insurance (by city, province, coverage type, and car make), home insurance, mortgage rates, credit cards, business insurance, and travel insurance. Each entry includes a one-sentence description. This file is the only machine-readable content bridge for AI models. The DNS records include an anthropic-domain-verification TXT record, which shows the domain has been verified with Anthropic and suggests some Anthropic integration. However, every link in the llms.txt points to a page that returns 403 to crawlers, so an AI model reading the file can describe what the site offers but cannot fetch any page content to verify or enrich that description.
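One way to confirm that the pages listed in an llms.txt are actually reachable is to extract its links and request each one. The sketch below assumes the file uses Markdown-style entries of the form `- [Title](URL): description`, as in the llms.txt proposal; the exact layout of the rates.ca file is not reproduced here, and the function names are hypothetical. The status lookup is passed in as a callable so the parsing can be checked offline.

```python
import re

# Matches "[Title](https://...)" with an optional ": description" suffix.
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)(?::\s*(.*))?")


def parse_llms_links(text):
    """Extract title, URL, and description from each link line in an llms.txt."""
    entries = []
    for line in text.splitlines():
        match = LINK_PATTERN.search(line)
        if match:
            title, url, description = match.groups()
            entries.append({
                "title": title,
                "url": url,
                "description": (description or "").strip(),
            })
    return entries


def unreachable(entries, fetch_status):
    """Return the URLs whose status, as reported by fetch_status(url), is not 200."""
    return [entry["url"] for entry in entries if fetch_status(entry["url"]) != 200]
```

If every entry comes back in the `unreachable` list, the file describes content that crawlers cannot read, which is the situation reported here.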
Cold-Knowledge Gap: Brand Identity Mismatch
Cold LLM knowledge describes rates.ca as a "Canadian online comparison platform" launched in 2014, rebranded from RateSupermarket.ca, and owned by the same parent as Ratehub.ca. The site itself brands as RATESDOTCA (visible in the Wayback snapshot's H1, "Get a better rate.", and meta description). The llms.txt header says "Rates.ca", but the homepage and subpages consistently use "RATESDOTCA" as the brand name. This dual identity creates fragmentation: an AI model relying on cold knowledge knows "rates.ca" as a Ratehub sibling, while the site's own content calls itself "RATESDOTCA", a name that appears nowhere in the cold knowledge. The cold knowledge also mentions a "Rates Rewards" program and lead-sharing complaints, neither of which appears in the llms.txt or Wayback content.
Schema and Structured Data Absence
The Wayback snapshots of the homepage, auto insurance page, mortgage page, and credit cards page all show zero JSON-LD schema of any type. Despite having FAQ sections (the mortgage page has 8 FAQ items), comparison tables (credit card awards, insurance company rankings), and clear answer-format signals, no FAQPage, Product, Organization, or WebSite schema is present. The llms.txt is the only structured content the site exposes to AI — and it is a flat text list with no schema markup, no hierarchy beyond headings, and no machine-readable metadata about pricing, coverage areas, or partner providers.
External Signal Void
The search tool returned zero results for any query involving "rates.ca", "RATESDOTCA", or related terms across multiple search attempts — no Reddit threads, no reviews, no press mentions, no blog citations. This may reflect a search tool limitation, but combined with the Cloudflare wall and the absence of any external backlink profile visible in the investigation, the site appears to have minimal off-domain footprint that AI models could surface.
Findings
Cloudflare challenge blocks every major AI crawler (High)
All tested AI crawlers (GPTBot, ClaudeBot, Google-Extended, PerplexityBot, OAI-SearchBot, ChatGPT-User, Bytespider, Applebot-Extended) receive a 403 Cloudflare challenge page at the homepage and all subpages. The robots.txt contains no AI-bot rules, only a generic blocklist for Drupal admin paths.
What to change: Allow AI crawlers through Cloudflare by creating a bypass rule or serving static HTML versions of key pages. Add explicit Allow directives for AI bots in robots.txt.
llms.txt links point to pages that return 403 to crawlers (High)
The llms.txt lists over 200 URLs with descriptions, but every linked page returns a 403 Cloudflare challenge when accessed by AI crawlers. AI models can read the file but cannot fetch actual page content to verify or enrich the descriptions.
What to change: Ensure that pages referenced in llms.txt are accessible to AI crawlers, either by allowing them through Cloudflare or by serving static HTML versions.
Sitemap returns 403, blocking crawler discovery (High)
The sitemap.xml at rates.ca returns a 403 Cloudflare challenge, preventing search engine and AI crawlers from discovering the site's URL structure.
What to change: Allow access to sitemap.xml for all crawlers, or serve it from a path that is not blocked.
No JSON-LD schema on any tested page (High)
The homepage, auto insurance page, mortgage page, and credit cards page all contain zero JSON-LD schema of any type, despite having FAQ sections, comparison tables, and clear answer-format signals.
What to change: Add relevant JSON-LD schema types (FAQPage, Product, Organization, WebSite) to all key pages to improve AI understanding and rich result eligibility.
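As an illustration of the FAQPage markup recommended above, the following sketch builds a JSON-LD block from question and answer pairs. The helper names are hypothetical and any questions passed in are placeholders, not the site's actual FAQ content; the property names (`mainEntity`, `acceptedAnswer`, `text`) come from the schema.org vocabulary.

```python
import json


def faq_jsonld(pairs):
    """Serialize (question, answer) pairs as a schema.org FAQPage JSON-LD document."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(jsonld):
    """Wrap JSON-LD in a script element for embedding in a page's HTML."""
    # Escape "</" so text inside the JSON cannot close the script element early.
    safe = jsonld.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + safe + "\n</script>"
```

The same pattern extends to Product or WebSite types by changing `@type` and the properties supplied; the markup only helps crawlers once the page itself is served to them.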
Brand identity fragmentation between rates.ca and RATESDOTCA (Medium)
Cold LLM knowledge identifies the site as rates.ca, a Ratehub sibling, but the site's own content brands as RATESDOTCA. The llms.txt header uses 'Rates.ca' while the homepage and subpages consistently use 'RATESDOTCA'. This dual-identity creates fragmentation for AI models.
What to change: Align brand identity across all channels: use a single consistent brand name (RATESDOTCA or Rates.ca) in llms.txt, homepage, and subpages.
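Where both names must remain in use, one option is to declare them together in Organization markup so that machines can treat them as the same entity. This is a sketch under assumptions: the helper is hypothetical, and which name is primary is a decision for the site owner. `alternateName` is a standard schema.org property.

```python
import json


def organization_jsonld(name, alternate_names, url):
    """Build Organization JSON-LD that ties a primary brand name to its variants."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "alternateName": list(alternate_names),
        "url": url,
    }
    return json.dumps(data, indent=2)


# Example choice of primary name; the reverse assignment is equally valid.
ORGANIZATION = organization_jsonld("RATESDOTCA", ["Rates.ca"], "https://rates.ca")
```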
JavaScript-rendered content invisible to crawlers (High)
The site renders entirely via JavaScript: raw HTML returns fewer than 100 words of visible text, while Wayback snapshots show rich heading structures. This prevents AI crawlers from indexing the actual content.
What to change: Implement server-side rendering or static HTML snapshots for key pages to ensure content is accessible to crawlers.
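The "fewer than 100 words of visible text" measurement can be reproduced roughly by counting words in the raw HTML while ignoring script, style, and similar elements. The sketch below uses only the standard library; the class name, the set of skipped tags, and the 100-word threshold in the usage note are assumptions, not the method used for this report.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Count words in HTML text nodes, skipping elements that are not displayed."""

    SKIPPED = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than zero while inside a skipped element
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())


def visible_word_count(html):
    """Return the number of whitespace-separated words in displayed text."""
    parser = VisibleTextCounter()
    parser.feed(html)
    parser.close()
    return parser.words
```

Running `visible_word_count` on the server response before and after enabling server-side rendering gives a quick check: a count that stays under roughly 100 words suggests crawlers still see only an application shell.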
No external backlinks or mentions found in search (Medium)
Multiple web searches for 'rates.ca', 'RATESDOTCA', and related terms returned zero results. This may reflect a limitation of the search tool, but it is consistent with a minimal off-domain footprint for AI models to surface.
What to change: Build external backlinks and mentions through PR, partnerships, and content marketing to improve AI visibility.
Robots.txt lacks AI bot directives (Medium)
The robots.txt file contains no rules for AI crawlers, only a generic User-agent: * blocklist for Drupal admin paths and static assets. This leaves AI crawlers without explicit guidance.
What to change: Add explicit Allow rules for AI crawlers (e.g., GPTBot, ClaudeBot) covering key pages, and consider a Crawl-delay directive, bearing in mind that Crawl-delay is non-standard and not every crawler honors it.
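A draft rule set can be checked before deployment with Python's standard `urllib.robotparser`. The rules below are hypothetical, not the site's current robots.txt, and example.com stands in for the real host. Python's parser applies the first matching rule in a group, so the Disallow line is listed before the broader Allow line. Note that robots.txt only expresses crawl preferences; it does not lift a Cloudflare challenge.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft: named AI crawlers may fetch everything except /admin/.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /admin/
Allow: /

User-agent: *
Disallow: /admin/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, user_agent, url):
    """Report whether user_agent may fetch url under the parsed rules."""
    return parser.can_fetch(user_agent, url)
```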
llms.txt is a flat text list with no hierarchy or metadata (Medium)
The llms.txt file is a flat text list of URLs with one-sentence descriptions. It has no hierarchy beyond headings, no schema markup, and no machine-readable metadata about pricing, coverage areas, or partner providers.
What to change: Enhance llms.txt with hierarchical structure, metadata (e.g., pricing, coverage areas), and consider using llms-full.txt for detailed content.
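A grouped llms.txt can be generated from structured data rather than maintained by hand. The sketch below follows the layout in the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of Markdown links with short notes); the function name and any section names, URLs, or notes passed to it are placeholders.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt document from a mapping of section heading to links.

    Each link is a (name, url, note) tuple and becomes "- [name](url): note".
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Metadata such as coverage area or provider count would go in the note after each link, since the proposal defines no dedicated fields for it.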
Cold knowledge lacks RATESDOTCA brand identity (Medium)
Cold LLM knowledge describes rates.ca as a Ratehub sibling, but the site's own content brands as RATESDOTCA, a name that appears nowhere in the cold knowledge. This mismatch may confuse AI models.
What to change: Ensure consistent brand identity across all online presences and update external listings to reflect the primary brand name.
What's working
- llms.txt is accessible and contains 200+ URLs: The site has a 78KB llms.txt at /llms.txt that returns 200 and contains an exhaustive directory of 200+ URLs covering all major product categories, providing a machine-readable content bridge for AI models.
- Anthropic domain verification TXT record present: DNS records include an anthropic-domain-verification TXT record, which shows the domain has been verified with Anthropic and suggests some Anthropic integration.
- Wayback snapshots show rich content structure: Archived snapshots reveal rich heading structures, FAQ sections, comparison tables, and award rankings, indicating the site has valuable content that could be made AI-accessible.