AI Site Grade
lingoace.com — AI Site Grade
LingoAce's core program pages render zero visible text to AI crawlers, undermining an otherwise mature llms.txt and schema strategy.
LingoAce has strong llms.txt and JSON-LD schema, but three of four core program pages are empty JavaScript shells to AI crawlers, and the site lacks external review signals.
- Findings: 10
- Evidence checks: 32
- Completed: 30 May 2026
Analysis
LingoAce AI-Visibility Audit
The site has a mature llms.txt and rich JSON-LD schema across every page, yet three of its four core program pages (Chinese, English, Math) return zero words of visible text to crawlers that do not execute JavaScript. Each is a JavaScript shell whose content appears only after client-side execution.
Crawler Access
All 11 AI bot UAs tested (including GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User, Bytespider, Applebot-Extended, anthropic-ai, and Perplexity-User) receive a 200 status with an identical byte payload (~99KB homepage, ~76KB program pages) behind Cloudflare. No UA-based blocking exists. However, the identical byte size across bots and browser is deceptive: the homepage yields only 28 words of extracted text, and /programs/learn-chinese/, /programs/learn-english/, and /programs/online-math-classes-for-kids/ yield 0 words. The site appears to be a single-page application (likely Next.js or similar) in which core product content is rendered client-side. AI crawlers that do not execute JavaScript see empty shells.
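The word counts above can be approximated with a small script. The sketch below is a minimal illustration, not the audit's actual extractor (which is not specified): it assumes "visible text" means any text outside script, style, noscript, template, and title elements in the raw HTML, with no JavaScript executed. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text that sits outside elements a reader never sees."""

    # Assumption: these tags are treated as non-visible for counting purposes.
    SKIPPED_TAGS = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_word_count(html: str) -> int:
    """Count whitespace-separated words in the initial HTML, without running JS."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())
```

Run against the raw response body of a page, a result of 0 would correspond to the empty-shell behavior described above: the bytes are delivered, but none of them are readable text.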
llms.txt and robots.txt
The /llms.txt file is a standout — present, well-structured, listing 30+ core pages with descriptions, metadata, and versioning (last updated 2026-03-05). This is an advanced signal few competitors deploy. The /robots.txt allows all AI bots (no Disallow for any AI user-agent) and blocks only /api/, /404/, and preview paths. The sitemap index covers five locale-specific sitemaps (US, ZH, SG, TH, ID), confirming a sophisticated multi-region strategy.
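A robots.txt posture like this can be re-checked with Python's standard-library parser. The sketch below uses a hypothetical robots.txt that resembles what the audit describes (the real file's exact contents are not reproduced here) and an illustrative subset of the user-agents tested.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt modeled on the audit's description, not the real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Disallow: /404/
"""

# Illustrative subset of the AI user-agents named in the audit.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def allowed_agents(robots_txt: str, path: str, agents=AI_USER_AGENTS) -> dict:
    """Map each user-agent to whether robots.txt lets it fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}
```

With the sample file, every agent is allowed on a program path such as /programs/learn-chinese/ and refused on anything under /api/.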
Cold-Knowledge Gap
The LLM's prior knowledge describes LingoAce as "primarily teaching Mandarin Chinese to children aged 4-15" with "over $100 million in funding" and "300,000+ students across 100+ countries." The site itself claims 22M+ lessons taught, 7,000+ teachers, 180+ countries, and positions itself as a 2025 Forbes China Influential Brand and GSV 150 honoree. The cold model knows nothing about the Math or English programs, the Ace Academy offline campuses (San Jose, Scarsdale, Great Neck, Somerset), or the Forbes/GSV accolades. The model also recalls "parent complaints on Trustpilot about inconsistent teacher quality" — a reputational signal the site does not address.
Schema Posture
Every page carries a rich EducationalOrganization + WebSite + WebPage graph with sameAs links to 9 platforms, three office addresses, and an aggregateRating (4.5/5 from 1,387 reviews). The Chinese program page adds a Course schema with hasCourseInstance stages and a FAQPage. The blog uses BlogPosting with datePublished/dateModified. The teachers page uses Person schema with knowsLanguage. This is among the most complete schema deployments seen on an edtech site — but the FAQPage JSON-LD on the FAQ page is truncated in extraction, suggesting the payload may exceed crawler size budgets.
External Signals
The site links to Trustpilot, but Trustpilot blocks automated fetches (403). DuckDuckGo searches for reviews, Reddit threads, and press coverage returned zero results — a striking absence suggesting either low organic off-domain footprint or search engine limitations. The DNS shows Cloudflare (A/AAAA), Microsoft 365 for mail, and Amazon SES for transactional email — a standard enterprise stack with no AI-specific infrastructure signals.
Findings
Core program pages render as empty JavaScript shells to crawlers (High)
The /programs/learn-chinese/, /programs/learn-english/, and /programs/online-math-classes-for-kids/ pages return 200 status but yield zero words of visible text when fetched without JavaScript execution. AI crawlers that do not render JavaScript see no content.
What to change: Implement server-side rendering (SSR) or static generation for these pages so that core content is included in the initial HTML payload. Alternatively, use dynamic rendering to serve pre-rendered content to AI crawlers.
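If dynamic rendering is chosen over SSR, the server has to decide per request which HTML variant to send. The sketch below shows only that decision, as a plain substring match on the User-Agent header. The token list is an illustrative subset of the bots named in this audit, and the function names and variant labels are hypothetical.

```python
# Illustrative subset of crawler tokens from the audit; not an exhaustive list.
AI_CRAWLER_TOKENS = (
    "GPTBot",
    "ClaudeBot",
    "PerplexityBot",
    "OAI-SearchBot",
    "ChatGPT-User",
    "Bytespider",
)


def is_ai_crawler(user_agent: str) -> bool:
    """Case-insensitive substring match of known crawler tokens in the UA header."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_CRAWLER_TOKENS)


def choose_variant(user_agent: str) -> str:
    """Return which HTML variant a server would send for this request."""
    return "prerendered" if is_ai_crawler(user_agent) else "client_rendered"
```

The pre-rendered variant should carry the same content a browser user sees after JavaScript runs. SSR or static generation avoids the need for this branching altogether, which is why it is listed first.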
Homepage yields only 28 words of extractable text (High)
The homepage returns a 99KB payload but only 28 words of visible text are extracted, indicating heavy reliance on client-side rendering for key messaging.
What to change: Ensure the homepage includes meaningful text content in the initial HTML, such as taglines, value propositions, and key statistics, to improve AI crawler understanding.
No external review signals found in search results (Medium)
DuckDuckGo searches for LingoAce reviews, Reddit discussions, and press coverage returned zero results. The Trustpilot page returns a 403 error to automated fetches, so its review content could not be verified. This absence limits the off-domain signals available to AI models.
What to change: Encourage and amplify customer reviews on platforms like Trustpilot, G2, and Reddit. Trustpilot's 403 response to automated fetches is outside the site's control, so prioritize review sources that crawlers can read. Publish press releases and case studies to build an off-domain footprint.
LLM prior knowledge missing Math, English programs and recent accolades (Medium)
The LLM's prior knowledge lacks awareness of LingoAce's Math and English programs, offline Ace Academy campuses, and recent accolades like Forbes China Influential Brand and GSV 150. This limits AI-generated descriptions of the full offering.
What to change: Ensure all program pages contain server-rendered text describing the Math and English curricula. Publish press releases and update Wikipedia or Crunchbase entries to include recent accolades and offline campuses.
FAQPage JSON-LD may be truncated due to payload size (Medium)
The FAQ page's JSON-LD schema extraction was incomplete, suggesting the structured data payload may exceed crawler size budgets, potentially causing loss of FAQ content in AI consumption.
What to change: Review the FAQPage JSON-LD and consider splitting it into multiple smaller blocks or using a linked data approach to avoid truncation.
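Before restructuring the FAQ schema, it helps to measure each JSON-LD block and confirm that it parses in full. The sketch below is a minimal check under stated assumptions: it reads JSON-LD from script elements of type application/ld+json, reports each block's size in bytes, and marks a block invalid when its JSON does not parse (for example because it was cut off). The names are hypothetical, and no particular crawler size budget is assumed.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Gathers the raw text of every application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def audit_jsonld(html: str) -> list:
    """Return one {'bytes': int, 'valid': bool} entry per JSON-LD block."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    report = []
    for raw in collector.blocks:
        try:
            json.loads(raw)
            valid = True
        except json.JSONDecodeError:
            valid = False
        report.append({"bytes": len(raw.encode("utf-8")), "valid": valid})
    return report
```

Comparing the byte counts for the FAQ page against other pages would show whether its structured data is unusually large, and an invalid entry would point to truncation at the source rather than in the extraction tool.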
Teachers page yields only 15 words of extractable text (Medium)
The /teachers/ page returns 200 status but only 15 words of visible text, indicating client-side rendering for teacher profiles.
What to change: Implement server-side rendering for the teachers page to include teacher names, bios, and qualifications in the initial HTML.
Learning experience page yields only 36 words of extractable text (Medium)
The /learning-experience/ page returns 200 status but only 36 words of visible text, suggesting client-side rendering for key content.
What to change: Ensure the learning experience page includes descriptive text in the initial HTML payload.
Blog page yields only 63 words of extractable text (Low)
The /blog/ page returns 200 status but only 63 words of visible text, indicating client-side rendering for blog listings.
What to change: Implement server-side rendering for the blog listing page to include post titles and excerpts.
Pricing page yields only 120 words of extractable text (Low)
The /pricing/ page returns 200 status but only 120 words of visible text, suggesting client-side rendering for pricing details.
What to change: Ensure pricing tiers and plan details are included in the initial HTML.
Differences page yields only 234 words of extractable text (Low)
The /differences/ page returns 200 status but only 234 words of visible text, which is relatively low for a content page.
What to change: Consider adding more textual content to the differences page to better convey the unique teaching methodology.
What's working
- llms.txt file is present, well-structured, and lists 30+ core pages — The /llms.txt file is a mature AI visibility signal, listing 30+ core pages with descriptions and metadata, last updated March 2026. This is an advanced practice few competitors deploy.
- Every page carries rich EducationalOrganization and WebSite JSON-LD schema — Pages include a comprehensive schema graph with sameAs links, office addresses, aggregateRating (4.5/5 from 1,387 reviews), and program-specific Course schema. This is among the most complete schema deployments seen on an edtech site.
- robots.txt allows all AI bots and blocks only non-content paths — The robots.txt file has no disallow rules for any AI user-agent, ensuring full crawler access to content pages. Only /api/, /404/, and preview paths are blocked.
- Sitemap index covers five locale-specific sitemaps — The sitemap index includes sitemaps for US, ZH, SG, TH, and ID locales, indicating a sophisticated multi-region content strategy that helps AI crawlers discover localized content.
- Blog posts use BlogPosting schema with dates — The blog page and individual posts include BlogPosting schema with datePublished and dateModified, helping AI models understand content freshness.
- Teachers page uses Person schema with knowsLanguage — The /teachers/ page includes Person schema with knowsLanguage property, helping AI models understand teacher qualifications and language expertise.
- No user-agent based blocking for any AI bot — All 11 AI bot user-agents tested receive 200 status with identical byte payloads, confirming no UA-based blocking exists.
- About Us page contains 443 words of extractable text — The /about-us/ page provides substantial text content (443 words) that is visible to crawlers, including company background and accolades.