AI Site Grade

instacart.com — AI Site Grade

Instacart's AI-visibility infrastructure lags behind its AI-native strategy: no llms.txt, no sitemap, no AI-bot directives in robots.txt, and stale cold knowledge.

Instacart positions itself as "grocery infrastructure for agentic AI", yet it publishes no llms.txt, no XML sitemap, and no AI-bot directives in robots.txt, and LLM cold knowledge of the company remains 2-3 years out of date.

Findings: 8
Evidence checks: 23
Completed: 30 May 2026

Analysis

Instacart's homepage and blog show a company that is already deeply integrated with AI ecosystems (Claude, Gemini) and that positions itself as "grocery infrastructure for agentic AI". The site's technical AI-visibility setup has not kept pace: there is no `llms.txt`, no sitemap, and no AI-bot directives in `robots.txt`.

Crawler Access

All major AI crawlers tested (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, Applebot-Extended, Bytespider, anthropic-ai) receive a 200 status with ~1MB of full HTML content from the homepage, served via AWS CloudFront + nginx. No UA-based blocking exists. The robots.txt (last updated Feb 2026) contains rules for Googlebot, Bingbot, and other search crawlers but names no AI-specific user-agents, so no Disallow directive targets any AI crawler. Requests for llms.txt and sitemap.xml both return a 404. With no XML sitemap at the standard path, AI crawlers must discover pages through link traversal alone.
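The robots.txt part of this check can be reproduced offline. The following is a minimal sketch, assuming the file's text has already been downloaded; it uses Python's standard `urllib.robotparser`. The `SAMPLE` text is a hypothetical stand-in, not Instacart's actual robots.txt, and the helper names `named_user_agents` and `audit_ai_access` are illustrative.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the crawler-access check above.
AI_USER_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "Applebot-Extended", "Bytespider", "anthropic-ai",
]


def named_user_agents(robots_txt: str) -> set[str]:
    """Return the lowercase user-agent tokens that appear in a User-agent line."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def audit_ai_access(robots_txt: str, path: str = "/") -> dict[str, dict[str, bool]]:
    """For each AI crawler, report whether it is named and whether `path` is allowed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    named = named_user_agents(robots_txt)
    return {
        agent: {
            "explicit_group": agent.lower() in named,
            "allowed": parser.can_fetch(agent, path),
        }
        for agent in AI_USER_AGENTS
    }


# Hypothetical robots.txt used only for illustration.
SAMPLE = """\
User-agent: Googlebot
Disallow: /store/checkout

User-agent: *
Disallow: /accounts/
"""

report = audit_ai_access(SAMPLE)
for agent, result in report.items():
    print(agent, result)
```

In a file shaped like this sample, every AI crawler falls through to the wildcard group: access is allowed, but nothing is stated about AI crawlers explicitly, which matches the situation described above.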

Cold-Knowledge Gap

The LLM's cold knowledge (what the model reports from training data alone, without retrieval) describes Instacart as a "grocery delivery and pickup platform" founded in 2012, IPO'd in 2023, with mentions of worker classification lawsuits and tipping controversies. This is stale by 2-3 years. The actual site reveals a fundamentally different company: Instacart now brands itself as "the leading grocery technology company" and "grocery infrastructure for agentic AI." The blog documents integrations with Claude (April 2026) and Gemini (May 2026), an Instaleap acquisition for global expansion, $1B+ in ads revenue (2025), Caper AI-powered smart carts, FoodStorm prepared-food ordering, Storefront Pro enterprise e-commerce, and a Physical AI partnership with NVIDIA. The cold model knows nothing about these enterprise and AI-native pivots.

Schema Posture

The homepage carries a rich FAQPage JSON-LD schema with 6 well-structured Q&A entries covering delivery mechanics, pricing, replacements, and contactless delivery. The /grocery-delivery page adds a BreadcrumbList schema. The /instacart-plus page also uses FAQPage. However, no Organization, WebSite, LocalBusiness, or Product schemas appear on the main consumer-facing pages. The blog uses Article schema on individual posts (e.g., the Claude integration post has full Article, WebPage, BreadcrumbList, Organization, and Person schemas). The enterprise/retail pages carry zero schema markup.
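To illustrate the missing Organization and WebSite markup, here is a minimal sketch that builds a linked JSON-LD graph with Python's `json` module. The name, URL, and description come from this report; the `sameAs` entry is a hypothetical placeholder, and the function names are illustrative. It is not Instacart's markup.

```python
import json


def build_site_jsonld(name: str, url: str, description: str, same_as: list[str]) -> str:
    """Return a JSON-LD string with linked Organization and WebSite nodes."""
    org_id = f"{url}#organization"
    graph = {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "Organization",
                "@id": org_id,
                "name": name,
                "url": url,
                "description": description,
                "sameAs": same_as,
            },
            {
                "@type": "WebSite",
                "@id": f"{url}#website",
                "url": url,
                "name": name,
                # Point the site at the Organization node defined above.
                "publisher": {"@id": org_id},
            },
        ],
    }
    return json.dumps(graph, indent=2)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in the script element used to embed it in a page."""
    safe = jsonld.replace("</", "<\\/")  # avoid closing the script element early
    return f'<script type="application/ld+json">\n{safe}\n</script>'


jsonld = build_site_jsonld(
    name="Instacart",
    url="https://www.instacart.com/",
    description="The leading grocery technology company",
    same_as=["https://example.com/instacart-profile"],  # hypothetical placeholder
)
print(as_script_tag(jsonld))
```

Linking the two nodes through `@id` lets one Organization definition be reused from other pages instead of being repeated in full.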

Content & Structure

The homepage is a JS-rendered React SPA but serves ~956 words of visible text to crawlers, including a full FAQ section. The heading structure is flat: one H1 ("Order groceries for delivery or pickup today") followed by dozens of H2/H3 elements. Category pages (e.g., /categories/316-food) contain 2,600+ words with buying guides and FAQ content. The deals page (/store/hub/deals_tab) returns 0 words of visible text — a pure JS shell that renders nothing to crawlers.
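Word counts and heading outlines like these can be approximated from raw HTML without executing JavaScript. The sketch below uses the standard `html.parser` module; it is a rough approximation of "visible text" (it only skips a few non-rendered tags), and the two HTML samples are hypothetical, not pages fetched from instacart.com.

```python
from html.parser import HTMLParser

# Tags whose text content is not rendered as page copy.
SKIPPED_TAGS = {"script", "style", "template", "title"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class VisibleTextParser(HTMLParser):
    """Count words outside skipped tags and record heading levels in order."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = 0
        self.heading_levels = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in HEADING_TAGS:
            self.heading_levels.append(int(tag[1]))

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words += len(data.split())


def analyze(html: str) -> VisibleTextParser:
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser


def skipped_levels(levels: list[int]) -> list[tuple[int, int]]:
    """Return (previous, current) pairs where a heading jumps more than one level down."""
    return [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]


# Hypothetical samples: a client-rendered shell and a server-rendered fragment.
JS_SHELL = (
    '<html><head><title>Deals</title>'
    '<script>window.__DATA__ = {"a": 1};</script></head>'
    '<body><div id="root"></div></body></html>'
)
RENDERED = (
    "<body><h1>Order groceries</h1><h3>How it works</h3>"
    "<p>Pick a store and add items.</p></body>"
)

shell = analyze(JS_SHELL)
page = analyze(RENDERED)
print(shell.words, page.words, skipped_levels(page.heading_levels))
```

A page that yields zero words under a check like this is the "pure JS shell" case: everything a reader sees is produced by scripts that a non-rendering crawler never runs.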

External Signals

The DNS TXT records reveal an openai-domain-verification token and an anthropic-domain-verification-6m2fwg token, confirming the company has actively verified its domain with both OpenAI and Anthropic for API/partnership use. The blog explicitly states Instacart is "available in Claude" and integrated with Gemini. The newsroom shows recent partnerships with Costco (France/Spain launch), Ace Hardware, ALDI U.S., and Fareway, plus the Instaleap acquisition expanding into Europe and Latin America.

Findings

No llms.txt file published (High)
The llms.txt endpoint returns a 404, meaning AI crawlers have no structured guidance for discovering key pages or understanding site content.
What to change: Publish an llms.txt file listing key pages (homepage, blog, newsroom, enterprise pages) with brief descriptions.

No XML sitemap available (High)
The sitemap.xml returns a 404, forcing AI crawlers to discover pages solely through link traversal, which can miss deep or new content.
What to change: Generate and submit an XML sitemap covering all important pages, including blog posts, category pages, and enterprise content.

Robots.txt lacks AI crawler directives (High)
The robots.txt file contains rules for Googlebot, Bingbot, and other search crawlers but mentions no AI-specific user-agents (GPTBot, ClaudeBot, etc.), leaving their access unmanaged.
What to change: Add explicit directives for AI crawlers (GPTBot, ClaudeBot, PerplexityBot, etc.) to allow or disallow specific paths as needed.

Cold knowledge is 2-3 years out of date (High)
The LLM's cold knowledge describes Instacart as a grocery delivery platform with IPO and worker lawsuits, missing the company's pivot to 'grocery technology company' and AI integrations with Claude, Gemini, and NVIDIA.
What to change: Publish authoritative content (blog posts, press releases, enterprise pages) that clearly states the company's current positioning and AI partnerships, and ensure it is indexed and linked from the homepage.

Deals page renders as empty JS shell (High)
The /store/hub/deals_tab page returns 0 words of visible text to crawlers, meaning AI crawlers cannot extract any content from it.
What to change: Implement server-side rendering or pre-rendering for the deals page to ensure content is visible to crawlers.

No Organization or WebSite schema on main pages (Medium)
The homepage and key consumer pages lack Organization, WebSite, LocalBusiness, or Product schemas, reducing structured data signals for AI crawlers.
What to change: Add Organization and WebSite JSON-LD schemas to the homepage and key landing pages.

Enterprise pages carry zero schema markup (Medium)
Pages like /instacart-retail and /tech-innovation have no structured data, missing opportunities to convey business relationships and technology capabilities to AI crawlers.
What to change: Add Organization, WebSite, and relevant schema types (e.g., SoftwareApplication for tech products) to enterprise pages.

Flat heading structure on homepage (Low)
The homepage uses one H1 followed by dozens of H2/H3 elements without a clear hierarchical outline, which can confuse crawlers about content importance.
What to change: Restructure headings to follow a logical hierarchy (H1 > H2 > H3) with appropriate nesting.
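
For the two missing discovery files in the findings above, the sketch below shows one way the files could be generated. It assumes the llms.txt layout proposed at llmstxt.org (an H1 title, a blockquote summary, then H2 sections of links) and the sitemaps.org 0.9 schema. The page paths are the ones named in this report; the link descriptions, section names, and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE = "https://www.instacart.com"


def build_llms_txt(site_name: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt document from {section: [(title, url, note), ...]}."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def build_sitemap(urls: list[str]) -> str:
    """Render a minimal XML sitemap with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Paths are taken from this report; the notes are illustrative placeholders.
sections = {
    "Consumer": [
        ("Grocery delivery", f"{BASE}/grocery-delivery", "How delivery and pickup work"),
        ("Food category", f"{BASE}/categories/316-food", "Buying guides and FAQ"),
    ],
    "Enterprise": [
        ("Retail", f"{BASE}/instacart-retail", "Offerings for retailers"),
        ("Tech innovation", f"{BASE}/tech-innovation", "Technology overview"),
    ],
}

llms_txt = build_llms_txt(
    "Instacart",
    "The leading grocery technology company.",
    sections,
)
sitemap_xml = build_sitemap(
    [f"{BASE}/"] + [url for links in sections.values() for _, url, _ in links]
)
print(llms_txt)
print(sitemap_xml)
```

A production sitemap would normally be generated from the site's own route or content index rather than a hand-written list, and could add `lastmod` values where they are reliably known.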

What's working

All major AI crawlers receive full HTML content — The homepage returns 200 status with ~1MB of HTML to all tested AI crawlers, with no UA-based blocking.
Rich FAQPage JSON-LD on homepage — The homepage includes a well-structured FAQPage schema with 6 Q&A entries covering key service details.
Blog posts use comprehensive Article schema — Individual blog posts (e.g., Claude integration) include Article, WebPage, BreadcrumbList, Organization, and Person schemas.
Domain verified with OpenAI and Anthropic — DNS TXT records include openai-domain-verification and anthropic-domain-verification tokens, confirming active verification for API/partnership use.
Category pages contain substantial text content — Pages like /categories/316-food have 2,600+ words with buying guides and FAQ content, providing rich material for AI crawlers.
BreadcrumbList schema on /grocery-delivery — The /grocery-delivery page includes a BreadcrumbList schema, aiding crawlers in understanding site structure.
FAQPage schema on Instacart+ page — The /instacart-plus page also uses FAQPage schema, providing structured Q&A about the subscription service.
Newsroom and blog provide current company updates — The newsroom and blog contain recent announcements about partnerships, acquisitions, and AI integrations, helping to update cold knowledge over time.
