groupon.com — AI Site Grade
Groupon's robots.txt blocks major AI crawlers while Cloudflare serves them full HTML, and a fabricated 'SumUp acquisition' narrative persists in LLM knowledge with no structured brand-verification content to correct it.
Groupon's AI visibility is undermined by contradictory crawler access, missing schema on key pages, and a hallucinated acquisition story that the site does nothing to counter.
- Findings: 9
- Evidence checks: 22
- Completed: 30 May 2026
Analysis
Groupon AI-Visibility Audit
Groupon's robots.txt explicitly blocks GPTBot, ClaudeBot, Google-Extended, Applebot-Extended, and Bytespider with Disallow: /. Yet every one of those bots receives a 200 status with full HTML content (1.8 MB) from Cloudflare, approximately 2x the browser baseline size. The robots.txt prohibition is therefore the only barrier, and it is purely advisory: a crawler that chooses not to honor it still gets the full page.
Crawler Access
The robots.txt uses Cloudflare's managed content-signal framework (Content-Signal: search=yes,ai-train=no) for the wildcard User-agent: * but then individually disallows every major AI crawler. This creates a contradictory posture: the wildcard rule grants Allow: / with an AI-training opt-out, while the specific bot rules say Disallow: /. In practice, compare_bot_access shows all AI bots receive the same Cloudflare-served page as a browser — no 403, no UA-based blocking at the edge. The llms.txt returns a 404 (Groupon's generic error page with noindex,nofollow). The sitemap index is massive (30+ sub-sitemaps, 75,000+ URLs in the first three alone), covering deals, goods, getaways, and local categories, but no AI-friendly content map exists.
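To make the contradiction concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is a reconstruction from the directives described above, not a copy of Groupon's live file, and the `Mozilla/5.0` agent is a stand-in for an ordinary browser. The parser ignores directives it does not recognize, such as `Content-Signal`.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the directives described in this audit; NOT the live file.
ROBOTS_TXT = """\
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "Applebot-Extended", "Bytespider"]


def robots_verdicts(robots_txt: str, agents: list[str],
                    url: str = "https://www.groupon.com/") -> dict[str, bool]:
    """Return {agent: allowed?} according to robots.txt alone."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


verdicts = robots_verdicts(ROBOTS_TXT, AI_BOTS + ["Mozilla/5.0"])
```

Under these assumptions every named AI bot is disallowed while the wildcard group allows everything else. A 200 response to those same agents therefore means the policy exists only on paper and nothing enforces it at the edge.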
Cold-Knowledge Gap
A frontier LLM queried cold about Groupon describes it as a "daily deal" pioneer founded in 2008 by Andrew Mason, notes it was "acquired by SumUp in 2024 for $126 million," and characterizes it as struggling with "declining revenue and customer trust." No evidence of a SumUp acquisition exists anywhere on Groupon's site or in web search results. The investor relations page (investor.groupon.com) presents Groupon as an independent publicly traded company with quarterly earnings, a board of directors, and SEC filings. The "About" page (groupon.com/articles/about) lists recent news items from 2024-2026 including a CEO announcement and Q3 2024 earnings — all consistent with an independent public company. The cold model's "acquired by SumUp" narrative is fabricated, and the site does nothing to correct this hallucination because it lacks any structured brand-verification content that AI engines could reliably retrieve.
Schema Posture
The homepage carries Organization and WebSite schema with SearchAction and app store links. Individual deal pages are richer — the architecture boat tour page includes ProductGroup, FAQPage, TouristAttraction, BreadcrumbList, and Review schema with aggregate ratings. However, the merchant landing page (/merchant) and the about page (/articles/about) contain zero JSON-LD schema. The merchant page is a critical acquisition funnel — "Join Over 1 Million Merchants" — yet has no Organization, Product, or FAQPage markup. The about page has no schema at all despite being the canonical source for brand positioning.
External Signals
The DNS TXT records include an anthropic-domain-verification token, showing that Groupon has verified its domain with Anthropic, which suggests the company is aware of AI crawler presence. Multiple google-site-verification tokens (7 distinct entries) and an apple-domain-verification record indicate broad platform verification. The site runs on Cloudflare (A records point to 104.18.12.36 and 104.18.13.36) with x-frame-options: DENY, SAMEORIGIN and strict-transport-security headers.
Surprising Findings
The "About" page redirects from about.groupon.com to groupon.com/articles/about with a broken canonical URL: https://pull.production.service./articles/about — a staging-domain reference leaked into production. The page also contains a duplicate H1 structure (three <h1> elements) and a "Groupon Rebrands as G-SPOT" news item dated April 1, 2025, which is an April Fools' joke sitting alongside legitimate earnings announcements with no disclaimer. The cold LLM's hallucinated "SumUp acquisition" story is the most dangerous gap: Groupon's actual site presents a healthy, independent public-company narrative, but the model's prior is a fictional acquisition-and-decline story that live retrieval may not fully override given the robots.txt barriers.
Findings
Robots.txt disallows all major AI crawlers while Cloudflare serves full HTML (High)
Groupon's robots.txt explicitly disallows GPTBot, ClaudeBot, Google-Extended, Applebot-Extended, and Bytespider with Disallow: /, yet all receive a 200 status with full HTML content (1.8 MB) from Cloudflare. The prohibition is only honored by compliant crawlers, creating a contradictory posture.
What to change: Remove the Disallow: / rules for AI crawlers from robots.txt, or implement UA-based blocking at the edge to enforce the intended access policy.
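If the intent is to block, the second option could look roughly like the sketch below. It is a hypothetical illustration of user-agent matching, not Cloudflare's rule syntax or API; `edge_status` and `BLOCKED_UA_TOKENS` are names invented here. Google-Extended and Applebot-Extended are left out on the assumption, based on Google's and Apple's crawler documentation, that they are robots.txt control tokens rather than user-agent strings sent in requests, so an edge rule has nothing to match for them.

```python
# Tokens taken from the robots.txt Disallow rules described in this finding.
BLOCKED_UA_TOKENS = ("gptbot", "claudebot", "bytespider")


def edge_status(user_agent: str, blocked_tokens=BLOCKED_UA_TOKENS) -> int:
    """Hypothetical edge rule: 403 for blocked crawler user agents, else 200."""
    ua = user_agent.lower()
    return 403 if any(token in ua for token in blocked_tokens) else 200


bot_status = edge_status("Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)")
browser_status = edge_status("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")
```

User-agent strings are easy to spoof, so a rule like this only stops crawlers that identify themselves honestly. It does at least make the served response agree with the stated policy.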
llms.txt returns 404 with noindex,nofollow (Medium)
The llms.txt endpoint returns a 404 error page with noindex,nofollow, providing no AI-friendly content map or guidance for language models.
What to change: Create a valid llms.txt file that lists key resources for AI crawlers, such as the sitemap index and about page.
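As a starting point, the sketch below renders a small llms.txt. It assumes the layout from the informal llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of Markdown links). The about and investor relations URLs come from this audit; the sitemap URL and all summary text are placeholders, and `build_llms_txt` is a name invented here.

```python
def build_llms_txt(title: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt document: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


llms_txt = build_llms_txt(
    title="Groupon",
    summary="Placeholder summary; replace with approved brand copy.",
    sections={
        "Company": [
            ("About Groupon", "https://www.groupon.com/articles/about",
             "Brand positioning and news"),
            ("Investor Relations", "https://investor.groupon.com",
             "Earnings and SEC filings"),
        ],
        "Discovery": [
            # Assumed location; the audit does not state the sitemap index URL.
            ("Sitemap index", "https://www.groupon.com/sitemap.xml",
             "Sub-sitemaps for deals, goods, getaways, and local"),
        ],
    },
)
```

The rendered text would be served as plain text at /llms.txt. Whether a given AI crawler reads the file is not guaranteed, since llms.txt is a convention rather than a standard.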
Cold LLM knowledge fabricates a SumUp acquisition that does not exist (High)
A frontier LLM queried cold about Groupon claims it was acquired by SumUp in 2024 for $126 million, but no evidence of this acquisition exists on Groupon's site or in web search results. The investor relations and about pages present Groupon as an independent public company.
What to change: Add structured data (e.g., Organization schema with founding date, stock ticker) and a clear brand narrative on the about page to correct hallucinated narratives.
Merchant landing page has zero JSON-LD schema (Medium)
The merchant page (/merchant) contains no structured data markup despite being a critical acquisition funnel with claims like 'Join Over 1 Million Merchants'.
What to change: Add Organization, Product, and FAQPage schema to the merchant page to improve AI understanding and visibility.
About page has no structured data (Medium)
The about page (/articles/about) contains zero JSON-LD schema despite being the canonical source for brand positioning and company information.
What to change: Add Organization schema with founding date, description, and social profiles to the about page.
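One way to produce that markup is sketched below. It uses schema.org's Corporation type, a subtype of Organization that adds `tickerSymbol`, so the block can also state public-company status. Every value is a placeholder to confirm with Groupon before publishing, and `organization_jsonld` is a name invented here.

```python
import json


def organization_jsonld(name: str, url: str, founding_date: str, ticker: str,
                        description: str, same_as: list[str]) -> str:
    """Return a <script> tag holding schema.org Corporation markup."""
    data = {
        "@context": "https://schema.org",
        "@type": "Corporation",  # subtype of Organization that supports tickerSymbol
        "name": name,
        "url": url,
        "foundingDate": founding_date,
        "tickerSymbol": ticker,
        "description": description,
        "sameAs": same_as,
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = organization_jsonld(
    name="Groupon",
    url="https://www.groupon.com",
    founding_date="2008",  # placeholder: confirm against company records
    ticker="XNAS GRPN",  # placeholder: exchange code plus symbol, confirm before use
    description="Placeholder; replace with approved brand copy.",
    same_as=["https://investor.groupon.com"],  # add verified social profile URLs here
)
```

Structured data gives retrieval-based systems an unambiguous statement to quote. It does not guarantee that a model's prior about the company will be overridden.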
About page has broken canonical URL pointing to staging domain (Medium)
The about page's canonical URL is set to 'https://pull.production.service./articles/about', a staging-domain reference leaked into production.
What to change: Fix the canonical URL to point to the correct production URL: https://www.groupon.com/articles/about.
About page contains duplicate H1 elements (Low)
The about page has three <h1> elements, which is a structural HTML issue that can confuse crawlers and degrade SEO.
What to change: Ensure only one <h1> per page; use <h2> and <h3> for subheadings.
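Both about-page markup issues (the canonical host and the H1 count) are easy to regression-test once fixed. The sketch below uses only Python's standard library. The HTML fragment is a hypothetical reproduction of the two reported defects, not the real page source, and `AboutPageAudit` and `audit` are names invented here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AboutPageAudit(HTMLParser):
    """Collect the canonical URL and count <h1> elements."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.h1_count = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "h1":
            self.h1_count += 1
        elif tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")


def audit(html: str, expected_host: str = "www.groupon.com") -> list[str]:
    """Return a list of human-readable problems; empty means both checks pass."""
    parser = AboutPageAudit()
    parser.feed(html)
    problems = []
    host = urlparse(parser.canonical or "").hostname
    if host != expected_host:
        problems.append(f"canonical host is {host!r}, expected {expected_host!r}")
    if parser.h1_count != 1:
        problems.append(f"found {parser.h1_count} <h1> elements, expected 1")
    return problems


# Hypothetical fragment reproducing the two reported defects.
SAMPLE = """
<head><link rel="canonical" href="https://pull.production.service./articles/about"></head>
<body><h1>About</h1><h1>News</h1><h1>Press</h1></body>
"""
problems = audit(SAMPLE)
```

Run against the fragment above, `audit` reports both defects. A page with the corrected canonical URL and a single `<h1>` returns an empty list.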
April Fools' joke appears alongside legitimate earnings announcements without disclaimer (Low)
The about page includes a 'Groupon Rebrands as G-SPOT' news item dated April 1, 2025, which is an April Fools' joke, mixed with real earnings announcements and no disclaimer.
What to change: Add a clear disclaimer or separate April Fools' content from legitimate news to avoid confusion for AI crawlers.
No AI-friendly content map despite large sitemap (Medium)
The sitemap index contains 30+ sub-sitemaps, with 75,000+ URLs in the first three alone, but there is no dedicated AI-friendly content map or llms.txt to guide language models.
What to change: Create an llms.txt file and consider a smaller, curated sitemap for AI crawlers.
What's working
- Domain verified with Anthropic via DNS TXT record — Groupon has an anthropic-domain-verification TXT record, indicating proactive domain verification with Anthropic for AI crawler trust.
- Multiple platform verification tokens present — DNS records include 7 google-site-verification tokens and an apple-domain-verification record, showing broad platform verification.
- Deal pages include rich structured data — Individual deal pages feature ProductGroup, FAQPage, TouristAttraction, BreadcrumbList, and Review schema with aggregate ratings, enhancing AI understanding.
- Homepage includes Organization and WebSite schema — The homepage carries Organization and WebSite schema with SearchAction and app store links, providing basic brand information to crawlers.
- Cloudflare security headers in place — The site uses Cloudflare with x-frame-options: DENY, SAMEORIGIN and strict-transport-security headers, indicating good security posture.