AI Site Grade

firstcommand.com — AI Site Grade

First Command's /sitemap.xml and /llms.txt both serve the site's 404 HTML page, so AI crawlers get neither a URL list nor content guidance.

First Command's site has a broken sitemap and llms.txt, no AI crawler controls, and a JS-dependent coaching center that hides rich content from AI bots.

Findings: 10
Evidence checks: 28
Completed: 30 May 2026

Analysis

First Command: A Next.js site with a broken sitemap, no AI crawler controls, and a coaching center that serves AI bots a JS shell

The site's /sitemap.xml and /llms.txt both return the Next.js 404 page (the app shell, as HTML) with an HTTP 200 status, a pattern commonly called a soft 404. Search engines and AI crawlers that request these files get no URL list and no AI-oriented content guidance, so two of the most basic discovery mechanisms fail outright.
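A soft 404 of this kind can be caught by comparing what a discovery file should look like against what actually came back. The following is a minimal sketch, not the grader's own check: the function name `looks_like_soft_404` and the sample responses are illustrative assumptions, not captured output from firstcommand.com.

```python
def looks_like_soft_404(status, content_type, body, expected_type):
    """Return True when a 200 response does not contain the expected kind of file.

    expected_type is the media type the endpoint should serve, for example
    "application/xml" for /sitemap.xml or "text/plain" for /llms.txt.
    """
    if status != 200:
        return False  # a real error status is an honest failure, not a soft 404

    # Drop parameters such as "; charset=utf-8" before comparing media types.
    media_type = content_type.split(";")[0].strip().lower()

    acceptable = {expected_type.lower()}
    if expected_type.lower() == "application/xml":
        acceptable.add("text/xml")  # also commonly used for XML sitemaps

    # An HTML document where XML or plain text is expected is the telltale sign.
    starts_like_html = body.lstrip().lower().startswith(("<!doctype html", "<html"))
    return media_type not in acceptable or starts_like_html


# Hypothetical responses shaped like the ones described above.
html_shell = "<!DOCTYPE html><html><head></head><body>404</body></html>"
print(looks_like_soft_404(200, "text/html; charset=utf-8", html_shell, "application/xml"))
print(looks_like_soft_404(200, "application/xml", "<?xml version='1.0'?><urlset/>", "application/xml"))
```

A check like this only needs the status line, the Content-Type header, and the first bytes of the body, so it can run against any endpoint without rendering JavaScript.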

Crawler Access

All major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, anthropic-ai, Bytespider, Applebot-Extended) receive a full 200 response identical to browser traffic, served from Microsoft-IIS/10.0 on Azure (CNAME wwwfirstcommand.azurewebsites.net). There is no UA-based blocking, no Cloudflare challenge, and no WAF. However, the robots.txt (last updated March 2022) contains no AI-bot-specific rules, only a catch-all User-agent: * group that blocks query-parameter URLs, /legacy/, and /component-library/. The /llms.txt endpoint returns the site's 404 page rendered as HTML (81KB of Next.js JS bundles) rather than plain text, making it useless for AI content guidance.
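To make the robots.txt gap concrete, the sketch below uses Python's standard `urllib.robotparser` to compare a catch-all file with one that names AI crawlers explicitly. `CURRENT_ROBOTS` is a simplified stand-in for the file described above (the query-parameter rules are omitted), and the helper names, bot list, and paths passed in are assumptions for illustration rather than a recommended policy.

```python
from urllib.robotparser import RobotFileParser

# Simplified stand-in for the existing file (assumption: the real file also
# carries query-parameter rules, which are left out here).
CURRENT_ROBOTS = """\
User-agent: *
Disallow: /legacy/
Disallow: /component-library/
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def named_user_agents(robots_txt):
    """Collect every User-agent token in the file, lowercased."""
    agents = set()
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def with_ai_groups(robots_txt, bots, disallow_paths):
    """Prepend one explicit group per AI bot to an existing robots.txt."""
    groups = []
    for bot in bots:
        lines = [f"User-agent: {bot}"] + [f"Disallow: {p}" for p in disallow_paths]
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n\n" + robots_txt


def can_fetch(robots_txt, agent, path):
    """Ask the standard-library parser whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


PROPOSED_ROBOTS = with_ai_groups(CURRENT_ROBOTS, AI_BOTS, ["/legacy/", "/component-library/"])

print(sorted(named_user_agents(CURRENT_ROBOTS)))
print(sorted(named_user_agents(PROPOSED_ROBOTS)))
print(can_fetch(PROPOSED_ROBOTS, "GPTBot", "/coaching-center/"))
print(can_fetch(PROPOSED_ROBOTS, "GPTBot", "/legacy/old-page"))
```

In this sketch the named groups simply repeat the catch-all rules, so crawl behavior does not change; the point is that each AI crawler then has its own group where a different policy can be stated later.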

Sitemap and Content Discovery

The /sitemap.xml URL returns HTTP 200 but serves the Next.js 404 page (full HTML shell, canonical to /404/) instead of actual XML. As a result, Google, Bing, and every AI crawler that relies on the sitemap finds zero URLs there. The list_known_urls tool could discover only 19 URLs from homepage links, so the site's full page inventory is opaque to automated discovery. The Coaching Center (/coaching-center/) is a client-side rendered page that returns only 7 words of visible text ("News & Articles Categories All Load More") from a plain GET, so AI crawlers that don't execute JavaScript see an empty shell. Individual article pages like the active-vs-passive-investing piece (4,760 words, proper NewsArticle schema) render fully server-side and are well-structured, but crawlers that cannot render the JS-dependent index page have no obvious path to them.
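The "7 words of visible text" measurement can be approximated by stripping scripts and styles from the raw HTML and counting what remains. The sketch below uses only the standard library; the `VisibleText` class and the `SHELL` markup are hypothetical, written to echo the seven words quoted above rather than copied from the live page.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect words a non-JS crawler would see, ignoring script-like tags."""

    SKIP = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def visible_words(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.words


# Hypothetical client-rendered shell: the article list arrives later via JS.
SHELL = """<!doctype html><html><body>
<div id="__next"><h1>News &amp; Articles</h1>
<span>Categories</span><button>All</button><button>Load More</button></div>
<script>window.__DATA__ = {"articles": "loaded after hydration"}</script>
</body></html>"""

words = visible_words(SHELL)
print(len(words), " ".join(words))
```

Running the same count on a server-rendered article page and on the index page gives a quick, JavaScript-free comparison of how much content each exposes.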

Cold-Knowledge Gap

The LLM's prior knowledge about First Command is substantially more detailed and more critical than what the site communicates. The model knows about the company's 1958 founding by a retired Air Force officer, its Fort Worth headquarters, its focus on military families, and its product suite (TSP, VA loans, BAH budgeting). But it also recalls historical regulatory controversy — a 2004 SEC fine for misleading variable annuity sales to military families, and ongoing consumer-advocate scrutiny over high-commission whole life insurance products pushed to junior enlisted personnel. The site itself makes zero mention of any regulatory history, criticism, or controversy. The "Reviews & Ratings" page lists awards (Military Friendly, BBB A+, WalletHub) but contains no third-party client reviews, no Trustpilot or ConsumerAffairs integration, and no acknowledgment of the company's mixed reputation.

Schema Posture

The homepage carries a minimal Organization schema with founding date, description, and social profiles. The FAQ page has a well-formed FAQPage schema with 13 questions and answers. Individual blog articles use NewsArticle schema with datePublished and author. However, the banking page has a broken FAQPage schema in which one question has an empty acceptedAnswer object. No Product, Service, LocalBusiness, or FinancialService schema exists anywhere, even though the site offers banking, insurance, investing, and financial planning services. No BreadcrumbList schema is present on any page. Outside the homepage, the Organization schema on most pages is a bare stub with only a url property.
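An empty acceptedAnswer is easy to detect mechanically before a page ships. The sketch below parses a JSON-LD string and lists the questions that lack answer text; the function name `empty_faq_answers` and the sample questions are hypothetical and are not taken from the banking page. It assumes mainEntity is a list, which is the usual shape for FAQPage markup.

```python
import json


def empty_faq_answers(json_ld):
    """Return the names of FAQPage questions whose acceptedAnswer has no text."""
    data = json.loads(json_ld)
    if data.get("@type") != "FAQPage":
        return []
    problems = []
    for question in data.get("mainEntity", []):
        answer = question.get("acceptedAnswer") or {}
        if not str(answer.get("text", "")).strip():
            problems.append(question.get("name", "<unnamed question>"))
    return problems


# Hypothetical markup: the second question reproduces the defect described above.
SAMPLE = json.dumps({
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Example question with an answer",
            "acceptedAnswer": {"@type": "Answer", "text": "Example answer text."},
        },
        {
            "@type": "Question",
            "name": "Example question with an empty answer",
            "acceptedAnswer": {},
        },
    ],
})

print(empty_faq_answers(SAMPLE))
```

A check of this shape could run in a build step or content pipeline so that a question is either published with its answer or left out of the markup.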

External Signals

External search results for First Command reviews are remarkably sparse — DuckDuckGo returned zero results for queries combining "First Command" with "reviews," "military," and "Reddit." The site links to Glassdoor for employee reviews but does not surface any independent client-review platforms. The company's external presence is dominated by its own award badges (Military Friendly, BBB, VETS Indexes) rather than organic third-party discourse.

Findings

Sitemap returns 404 HTML page instead of XML (High)
The /sitemap.xml URL returns HTTP 200 but serves the Next.js 404 page (HTML shell) instead of valid XML, making zero URLs discoverable by search engines and AI crawlers.
What to change: Replace the sitemap endpoint with a dynamically generated XML sitemap listing all public URLs, and ensure it returns Content-Type: application/xml.

llms.txt returns 404 HTML page instead of plain text (High)
The /llms.txt endpoint returns a 404 page rendered as HTML (81KB of Next.js JS bundles) rather than plain text, providing no AI content guidance.
What to change: Serve a plain-text llms.txt file at /llms.txt with a list of AI-relevant resources and content summaries.

Robots.txt has no AI-bot-specific rules (Medium)
The robots.txt (last updated March 2022) contains only a catch-all User-agent: * group blocking query-parameter URLs, /legacy/, and /component-library/. No AI crawlers (GPTBot, ClaudeBot, etc.) are named or given any directives.
What to change: Add explicit rules for AI crawlers (e.g., GPTBot, ClaudeBot) to allow or disallow crawling of specific sections. A crawl-delay directive is optional: it is a non-standard extension that some crawlers, including Googlebot, ignore.

Coaching Center index page is a client-side JS shell (High)
The /coaching-center/ page returns only 7 words of visible text from a plain GET; AI crawlers that do not execute JavaScript see an empty shell, hiding the rich article index.
What to change: Server-side render the coaching center index page with full article links and metadata so that non-JS crawlers can discover all articles.

Only 19 URLs discoverable from homepage links (Medium)
Automated discovery via homepage links found only 19 URLs, indicating poor internal linking and reliance on client-side navigation, which leaves the full page inventory opaque to crawlers.
What to change: Add a comprehensive sitemap and improve internal linking with static HTML links to all important pages.

No FinancialService or Product schema on service pages (Medium)
Despite offering banking, insurance, investing, and financial planning, the site has no Product, Service, LocalBusiness, or FinancialService schema on any page. The Organization schema is a bare stub on most pages.
What to change: Add FinancialService, Product, and Service schemas to relevant pages, and enrich the Organization schema with full details.

Banking page has broken FAQPage schema with empty acceptedAnswer (Medium)
The banking page includes a FAQPage schema where one question has an empty acceptedAnswer object, which may cause validation errors or be ignored by search engines.
What to change: Remove the empty FAQ entry or provide a proper answer text.

No BreadcrumbList schema on any page (Low)
No page on the site includes BreadcrumbList structured data, which helps search engines understand site hierarchy and can enhance search result snippets.
What to change: Add BreadcrumbList schema to every page, describing its path back to the homepage.

External review presence is very sparse (Medium)
Web searches for First Command reviews on Reddit and other platforms returned zero results, indicating little organic third-party discourse and a potential gap in trust signals.
What to change: Encourage satisfied clients to leave reviews on independent platforms (Trustpilot, ConsumerAffairs) and link to those profiles from the site.

Site omits any mention of regulatory history or criticism (Low)
The site's "Reviews & Ratings" page lists awards but contains no third-party client reviews, no Trustpilot integration, and no acknowledgment of the company's mixed reputation or past SEC fine.
What to change: Consider adding a balanced section that addresses past controversies and current improvements to build trust.
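For the sitemap finding at the top of this list, the fix amounts to emitting a valid urlset document instead of the HTML shell. The following is a minimal sketch using the standard library: the `build_sitemap` helper is hypothetical, the paths are ones named elsewhere in this report, and the https://www.firstcommand.com base URL is an assumed canonical host. A production sitemap would be generated from the site's real page inventory.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, paths):
    """Return a sitemap document (bytes) listing base_url + path for each path."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + path
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Paths mentioned in this report; the host is an assumption.
PATHS = [
    "/coaching-center/",
    "/frequently-asked-questions/",
    "/our-reputation/reviews-and-ratings/",
]

sitemap_bytes = build_sitemap("https://www.firstcommand.com", PATHS)
print(sitemap_bytes.decode("utf-8"))
```

Whatever serves these bytes should also send Content-Type: application/xml, as the finding notes, so that crawlers do not mistake the response for another HTML page.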

What's working

All major AI crawlers receive full 200 responses: GPTBot, ClaudeBot, PerplexityBot, and others are not blocked by robots.txt or server-side restrictions, so they can crawl the site freely.
Individual article pages render server-side with NewsArticle schema: Articles like the active-vs-passive-investing piece (4,760 words) are fully server-side rendered and include proper NewsArticle schema with datePublished and author.
FAQ page has well-formed FAQPage schema with 13 questions: The /frequently-asked-questions/ page includes a valid FAQPage schema with 13 questions and answers, giving crawlers clean question-and-answer pairs. Note that since 2023 Google has shown FAQ rich results only for a limited set of government and health sites, so the benefit is machine readability rather than rich snippets.
Homepage has Organization schema with founding date and social profiles: The homepage includes a minimal Organization schema that provides founding date, description, and social profile links, aiding entity recognition.
Reviews page lists credible third-party awards: The /our-reputation/reviews-and-ratings/ page displays awards from Military Friendly, BBB A+, and WalletHub, providing external validation signals.
