AI Site Grade
olsamgroup.com — AI Site Grade
Olsam Group's sitemap poisons AI crawlers with 22 dead news URLs while the live site offers thin content and zero portfolio proof.
Olsam Group's sitemap lists 22 dead news pages that return 404, wasting AI crawl budget, while the live site lacks portfolio details, case studies, and structured data that AI engines need to surface rich answers.
- Findings: 12
- Evidence checks: 31
- Completed: 30 May 2026
Analysis
Olsam Group — AI-Visibility Audit
The site's sitemap lists 22 URLs in the /news/ section, and every one of them returns HTTP 404. AI crawlers that follow the sitemap are sent to dead links: every news article and academy page it advertises is gone, leaving only the homepage, about page, process page, contact page, and their Chinese-language mirrors as live content.
Crawler Access
All major AI bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, Bytespider, Applebot-Extended) receive HTTP 200 and the same full HTML that a browser receives. No UA-based blocking exists. The robots.txt is a bare Yoast-generated catch-all (User-agent: * with no disallows) that does not mention any AI bot by name. No llms.txt exists (the path returns 404). The site sits behind Cloudflare with LiteSpeed cache and PHP 8.2, and the homepage is aggressively cached (Cache-Control: public, max-age=604800, which is 7 days). The 367KB page weight includes a full JS/CSS payload, but the server-rendered HTML contains all visible text, so there is no JS-rendering risk for AI crawlers.
Content Graveyard in the Sitemap
The sitemap_index.xml feeds two sub-sitemaps (news-sitemap.xml, academy-sitemap.xml) that together list over 30 URLs under /news/ and /academy/. Every URL tested returns HTTP 404 with noindex, including forbes-olsam-group/, business-insider-olsam-group/, amazon-aggregator-olsam/, olsam-acquires-dwarfs/, olsam-announces-new-investment/, meet-olsam-a-u-k-based-acquirer/, and were-buying-great-assets-at-20-to-30-lower-prices/. These URLs were still listed when the sitemap was parsed for this audit, so any AI crawler that follows the sitemap wastes crawl budget on a dead press archive. The only live pages are the homepage, /our-story/, /the-process/, /contact-us/, /privacy-policy/, /cookie-policy/, /thank-you/, and their /cn/ Chinese equivalents.
Cold-Knowledge Gap
The LLM knows Olsam as a UK-based Amazon FBA aggregator that raised $300M+ from BlackRock and Victory Park Capital, focuses on home/garden/sports/outdoor brands, and faced layoffs and restructuring in 2023. The site itself says nothing about portfolio brands, funding amounts, investors, layoffs, or any specific financial figures. The homepage mentions "As Seen In" with logos (Forbes, TechCrunch, Business Insider, Sifted) but the corresponding press pages are all 404. The /our-story/ page describes the team's Amazon and M&A backgrounds but omits any mention of the $165M+ funding rounds that external press covered. The site presents a polished acquisition funnel (the "6x6 Process") but offers zero proof of portfolio performance, case studies, or brand names — a gap the LLM's cold knowledge fills with generic aggregator-industry context instead.
Schema and Answer Signals
JSON-LD schema is present via Yoast but limited to WebPage, WebSite, Organization, BreadcrumbList, and ImageObject types. No Product, FAQPage, Review, Article, or ItemList schema exists anywhere. The homepage uses H1 headings for value propositions ("Why Olsam", "We Love Ecommerce Businesses That Have:") but the content is a 259-word thin page with no FAQ, no comparison tables, and no structured data that would help AI engines extract a rich answer snippet. The /the-process/ page has a clear 6-step numbered structure that could be marked up as HowTo schema but is plain HTML only. The contact form collects revenue brackets and sales-channel percentages — a lead-gen form, not an answer surface.
Findings
Sitemap lists 22 dead news URLs that return 404 High
The sitemap advertises 22 URLs under /news/ and /academy/ that all return HTTP 404 with noindex. AI crawlers following the sitemap waste crawl budget on a dead press archive.
What to change: Remove all 404 URLs from the sitemap. Either restore the press pages or delete them from the sitemap to stop poisoning AI crawlers.
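One way to find the entries to remove is to read each sub-sitemap and check the status of every listed URL. The following is a minimal Python sketch, not the audit's own tooling: the function names are hypothetical, the full URLs in the sample are illustrative (the report only gives path fragments), and the demonstration uses a stubbed status table so it runs offline.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_urls(xml_text: str) -> List[str]:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text.strip())
    return [loc.text.strip() for loc in root.iter(f"{{{SITEMAP_NS}}}loc") if loc.text]


def http_status(url: str) -> int:
    """Send a HEAD request and return the final status code (0 on network failure)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:
        return 0


def split_live_dead(
    urls: Iterable[str], status_of: Callable[[str], int] = http_status
) -> Tuple[List[str], List[str]]:
    """Partition URLs into (live, dead); only HTTP 200 counts as live."""
    live: List[str] = []
    dead: List[str] = []
    for url in urls:
        (live if status_of(url) == 200 else dead).append(url)
    return live, dead


# Offline demonstration: illustrative URLs and a stubbed status lookup.
SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://olsamgroup.com/our-story/</loc></url>
  <url><loc>https://olsamgroup.com/news/forbes-olsam-group/</loc></url>
</urlset>
"""
FAKE_STATUS = {
    "https://olsamgroup.com/our-story/": 200,
    "https://olsamgroup.com/news/forbes-olsam-group/": 404,
}
live, dead = split_live_dead(sitemap_urls(SAMPLE), FAKE_STATUS.get)
print("keep:", live)
print("remove:", dead)
```

Against the real site, calling split_live_dead without the second argument would use http_status and issue one HEAD request per URL.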
No llms.txt file published Medium
The site returns 404 for /llms.txt, missing an opportunity to guide AI crawlers to key pages and provide a structured summary for LLM consumption.
What to change: Publish an llms.txt file that lists the most important pages (homepage, about, process, contact) and provides a brief summary of the company for AI assistants.
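As a rough illustration, an llms.txt file is usually a short Markdown document: a title, a one-paragraph summary, and a list of links. The Python sketch below generates one. The helper name, the https scheme, the summary placeholder, and the per-page notes are assumptions; the page paths come from the live pages listed in this report.

```python
from typing import Iterable, Tuple


def build_llms_txt(name: str, summary: str, pages: Iterable[Tuple[str, str, str]]) -> str:
    """Render a minimal llms.txt: H1 title, blockquote summary, link list."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, url, note in pages:
        lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"


BASE = "https://olsamgroup.com"  # assumed canonical origin
llms_txt = build_llms_txt(
    "Olsam Group",
    "PLACEHOLDER: one-paragraph company summary written by the site owner.",
    [
        ("Home", f"{BASE}/", "Overview and acquisition criteria"),
        ("Our Story", f"{BASE}/our-story/", "Team background"),
        ("The Process", f"{BASE}/the-process/", "The 6-step acquisition process"),
        ("Contact", f"{BASE}/contact-us/", "Seller enquiry form"),
    ],
)
print(llms_txt)
```

The summary is left as a placeholder on purpose: it should state facts the company is willing to publish, not text inferred from third-party coverage.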
Robots.txt does not name any AI bots Low
The robots.txt is a bare Yoast catch-all with no disallow rules and no explicit rules for GPTBot, ClaudeBot, or other AI crawlers. Nothing is blocked, but the file also gives AI crawlers no signal about which sections to prioritize or skip.
What to change: Add explicit allow/disallow rules for major AI bots, plus crawl-delay for bots that honor it, to manage crawl budget and signal which sections are important.
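A named group for AI bots can be drafted and then checked before deployment with Python's standard urllib.robotparser. This is a sketch under assumptions: the choice to disallow /thank-you/, the crawl delay of 5 seconds, and the full sitemap URL are illustrative, and build_robots_txt is a hypothetical helper. Crawl-delay is not part of the Robots Exclusion Protocol standard (RFC 9309), so whether a given crawler honors it is up to that crawler.

```python
from typing import Iterable, Optional
from urllib.robotparser import RobotFileParser

# Bot names as listed in the Crawler Access section of this report.
AI_BOTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "Bytespider", "Applebot-Extended",
]


def build_robots_txt(
    bots: Iterable[str],
    disallow: Iterable[str],
    sitemap_url: str,
    crawl_delay: Optional[int] = None,
) -> str:
    """Render one shared group for the named bots, then a catch-all group."""
    lines = [f"User-agent: {bot}" for bot in bots]
    lines += [f"Disallow: {path}" for path in disallow]
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    lines += ["", "User-agent: *", "Disallow:", "", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(
    AI_BOTS,
    disallow=["/thank-you/"],  # assumed low-value page, for illustration only
    sitemap_url="https://olsamgroup.com/sitemap_index.xml",  # assumed location
    crawl_delay=5,
)

# Verify the draft behaves as intended before publishing it.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "https://olsamgroup.com/the-process/"))
print(parser.can_fetch("GPTBot", "https://olsamgroup.com/thank-you/"))
print(parser.crawl_delay("ClaudeBot"))
```

Listing several User-agent lines above one rule set keeps the file short; bots not named fall through to the catch-all group and remain unrestricted.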
No portfolio brands, case studies, or financial proof on site High
The site describes an acquisition process but provides zero portfolio brand names, case studies, or financial figures. The LLM's cold knowledge fills in generic aggregator context, but the site itself offers no evidence of success.
What to change: Add a portfolio page with brand names, case studies, and key metrics (funding, revenue growth) to provide concrete proof points for AI crawlers and human visitors.
Homepage is thin at 259 words with no FAQ or structured data Medium
The homepage contains only 259 words, lacks FAQ content, and has no structured data beyond basic WebPage/Organization schema. This limits the site's ability to appear in AI-generated answer snippets.
What to change: Expand the homepage with an FAQ section addressing common seller questions, and add FAQPage schema to increase chances of appearing in AI answer boxes.
Process page lacks HowTo schema despite clear step structure Medium
The /the-process/ page presents a clear 6-step numbered process but uses only plain HTML. Adding HowTo schema would help AI engines extract and present the steps in search results.
What to change: Add HowTo schema markup to the 6-step process page, including step-by-step instructions and estimated duration.
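The markup could look roughly like the JSON-LD produced by the Python sketch below. The step names, step text, and duration are placeholders because this report does not record the page's actual wording; howto_jsonld is a hypothetical helper. The property names (step, HowToStep, position, totalTime) are from schema.org.

```python
import json
from typing import Iterable, Optional, Tuple


def howto_jsonld(
    name: str, steps: Iterable[Tuple[str, str]], total_time: Optional[str] = None
) -> str:
    """Build schema.org HowTo JSON-LD from (step name, step text) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {"@type": "HowToStep", "position": i, "name": step_name, "text": text}
            for i, (step_name, text) in enumerate(steps, start=1)
        ],
    }
    if total_time is not None:
        data["totalTime"] = total_time  # ISO 8601 duration, e.g. "P6W"
    return json.dumps(data, indent=2)


# Placeholder content: replace with the six steps as written on /the-process/.
placeholder_steps = [
    (f"Step {n} title (placeholder)", f"Step {n} description (placeholder)")
    for n in range(1, 7)
]
howto = howto_jsonld("The 6x6 Process", placeholder_steps, total_time="P6W")
print(howto)
```

The output would be embedded in the page inside a script element of type application/ld+json. The "P6W" duration is an assumed value and should be replaced with a figure the company actually commits to.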
No Article or NewsArticle schema on any page Low
The site uses Yoast schema but does not include Article or NewsArticle types, even on pages that could be considered articles (e.g., press pages, if they existed). This reduces the chance of appearing in Google's Top Stories or AI news summaries.
What to change: Add Article or NewsArticle schema to any content pages that are time-sensitive or news-related.
Zero external search results for the domain or brand High
Multiple web searches for 'olsamgroup.com', 'Olsam Group aggregator', and related queries returned zero results. The site has minimal external visibility, which limits AI crawler discovery and authority signals.
What to change: Build a backlink strategy through PR, guest posting, and partnerships to increase domain authority and AI crawler discovery.
All press pages referenced on homepage return 404 High
The homepage displays 'As Seen In' logos for Forbes, TechCrunch, Business Insider, and Sifted, but every corresponding press page URL returns 404. This undermines credibility and wastes crawl budget.
What to change: Either restore the press pages with actual content or remove the broken links from the homepage and sitemap.
No FAQ or Q&A content anywhere on the site Medium
The site has no FAQ page or Q&A sections that could be marked up with FAQPage schema. This misses opportunities to appear in AI-generated answer boxes for common seller questions.
What to change: Create an FAQ page addressing common questions from sellers (e.g., valuation, process, timeline) and mark it up with FAQPage schema.
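A small Python sketch of the FAQPage markup follows. The questions echo the topics named above (valuation, process, timeline), but their wording and all answers are placeholders, and faq_jsonld is a hypothetical helper. The structure (mainEntity, Question, acceptedAnswer, Answer) follows schema.org.

```python
import json
from typing import Iterable, Tuple


def faq_jsonld(pairs: Iterable[Tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


# Placeholder questions and answers; the real text must match the visible FAQ.
faq = faq_jsonld([
    ("How is my business valued?", "PLACEHOLDER answer on valuation."),
    ("What does the acquisition process involve?", "PLACEHOLDER answer on process."),
    ("How long does a sale take?", "PLACEHOLDER answer on timeline."),
])
print(faq)
```

The marked-up questions and answers should mirror text that is visible on the page, so the FAQ content needs to be written first and the schema generated from it.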
Homepage cache TTL of 604800 seconds may serve stale content to crawlers Low
The homepage is served with Cache-Control: max-age=604800 (1 week), so caches may keep serving a stale copy to AI crawlers for up to a week after the page changes. Because the site's content is largely static, this is a minor concern.
What to change: Reduce cache TTL to 86400 (1 day) or implement cache invalidation on content updates to ensure crawlers see fresh content.
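The arithmetic behind these TTL values can be checked with a few lines of Python. This is a minimal sketch with a hypothetical helper name; it handles only the max-age directive and is not a full Cache-Control parser.

```python
from typing import Optional

SECONDS_PER_DAY = 86400


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control header value."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "max-age" and value.isdigit():
            return int(value)
    return None


current = max_age_seconds("public, max-age=604800")
proposed = max_age_seconds("public, max-age=86400")
print(current / SECONDS_PER_DAY)   # 7.0 days
print(proposed / SECONDS_PER_DAY)  # 1.0 day
```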
No Product or Review schema despite being an e-commerce aggregator Low
The site does not use Product or Review schema, which would be relevant for showcasing portfolio brands or seller testimonials. This limits rich snippet potential.
What to change: Add Product schema for any portfolio brands and Review schema for seller testimonials if available.
What's working
- All major AI bots receive full HTML content with no blocking — GPTBot, ClaudeBot, PerplexityBot, and other AI crawlers all receive HTTP 200 with the same full HTML as a browser. No UA-based blocking or cloaking exists.
- Server-rendered HTML contains all visible text — The homepage and other pages are server-rendered with all visible text in the HTML, so AI crawlers can parse content without executing JavaScript.
- Yoast provides basic WebPage, WebSite, Organization, and BreadcrumbList schema — The site includes standard Yoast JSON-LD schema for WebPage, WebSite, Organization, and BreadcrumbList, providing a baseline for search engines.
- Process page has a clear 6-step structure that is easy for crawlers to parse — The /the-process/ page presents a well-structured 6-step numbered list that AI crawlers can easily extract, even without schema markup.
- Contact form collects revenue and sales channel data for lead qualification — The contact form asks for revenue brackets and sales-channel percentages, which helps qualify leads but does not directly aid AI visibility.
- Site uses Cloudflare for security and performance — Cloudflare provides DDoS protection and CDN caching, which improves site reliability and load times for both humans and crawlers.
- Sitemap is present and well-structured (though contains dead URLs) — The sitemap_index.xml exists and references two sub-sitemaps, which is a good practice for crawl efficiency, despite the dead URLs.
- Chinese-language mirror pages exist for key pages — The site has /cn/ versions of the homepage, about, process, and contact pages, which can help reach Chinese-speaking audiences and AI crawlers.