southeastwater.com.au — AI Site Grade

South East Water's news archive is entirely broken: every article URL returns a 404. The news listing page still links to these dead paths and the sitemap omits news articles altogether, so AI crawlers cannot reach any news content.

South East Water's site has a completely broken news archive (all article URLs 404), zero structured data, and no AI-specific crawler controls, severely limiting AI visibility.

Findings: 11
Evidence checks: 36
Completed: 30 May 2026

Analysis

South East Water's news archive is entirely broken: every individual article URL returns a 404. The news listing page still links to these dead paths and the sitemap omits news articles entirely, so AI crawlers that follow links from the listing page hit a wall of Apache Sling errors and have no alternative route to the content.

Crawler Access

All major AI crawlers tested (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User, Bytespider, Applebot-Extended, anthropic-ai) receive a full 200 response with the same payload size (432,745 bytes) as a browser baseline. No UA-based blocking exists. The robots.txt uses a single User-agent: * rule allowing / and disallowing only /search, /sew/, /archive/, /support-pages/, and error paths. No AI-specific directives are present. The site runs on Apache with Adobe Experience Manager (AEM) / Apache Sling and an Apache Dispatcher cache layer, with no CDN or WAF (no Cloudflare, no Akamai). No llms.txt exists (404). The sitemap lists 544 URLs but contains zero news article URLs, only top-level section pages and AEM short-path redirectors.
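
The robots.txt behaviour described above can be checked offline. The sketch below is illustrative only: ROBOTS_TXT is a reconstruction from the rules summarised in this report, the error-path rules are omitted because they are not listed here, and the line order is an assumption. Python's standard urllib.robotparser applies the first matching rule, so the Disallow lines are placed ahead of Allow: /.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the rules summarised in this report.
# The error-path rules are not listed in the report, so they are left out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /sew/
Disallow: /archive/
Disallow: /support-pages/
Allow: /
"""

AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "OAI-SearchBot", "ChatGPT-User", "Bytespider",
    "Applebot-Extended", "anthropic-ai",
]

BASE = "https://southeastwater.com.au"


def build_parser(text):
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def access_matrix(parser, agents, paths):
    """Map each user agent to {path: allowed?} under the parsed rules."""
    return {
        agent: {path: parser.can_fetch(agent, BASE + path) for path in paths}
        for agent in agents
    }


def has_named_group(text, agent):
    """True if a User-agent line names this crawler explicitly."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "user-agent" and value.strip().lower() == agent.lower():
            return True
    return False


parser = build_parser(ROBOTS_TXT)
matrix = access_matrix(parser, AI_CRAWLERS, ["/about-us/news/", "/search"])
```

Under these assumed rules every listed crawler falls through to the wildcard group, so each is allowed on /about-us/news/ and blocked on /search, and has_named_group returns False for all of them.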

Content & Schema Posture

Every page examined — homepage, about, strategy, careers, faults, water meters, recycled water — contains zero JSON-LD schema of any type. No Organization, WebSite, NewsArticle, FAQPage, or BreadcrumbList markup exists. The homepage has a single H1 ("Healthy Water. For Life.") and uses H3 for section headings with no H2 hierarchy. The recycled water page is the strongest content piece at ~700 words with a clear list of use-cases and cost comparison (38% cheaper than drinking water), but it lacks FAQ schema despite having question-like phrasing. The digital meters page has an FAQ section but no FAQPage schema. The news listing page shows 40+ article headlines and dates but every single article detail URL returns a 404 — the AEM content paths are missing or unpublished.
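
To make the missing markup concrete, here is a minimal sketch of how Organization and FAQPage JSON-LD could be produced. The function names are hypothetical helpers, the question and answer text must come from the real page content, and the social profile URLs are left as a parameter because this report does not list them.

```python
import json


def organization_jsonld(name, url, same_as=()):
    """Build a schema.org Organization object; same_as is optional."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
    }
    if same_as:
        data["sameAs"] = list(same_as)
    return data


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Serialise to a JSON-LD script element for the page head."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so text inside the data cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values for illustration only.
org = organization_jsonld("South East Water", "https://southeastwater.com.au/")
faq = faq_jsonld([("Placeholder question?", "Placeholder answer.")])
```

In an AEM setup the equivalent output would more likely be emitted by a page component or template than by a standalone script; the structure of the JSON is the part that carries over.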

Cold-Knowledge Gap

The LLM knows South East Water as a state-owned utility serving ~1.8 million people in Melbourne's southeast, with Class A recycled water schemes and the world's largest sewer-mining facility. It also recalls regulatory scrutiny over sewage spills into Port Phillip Bay and the Mornington Peninsula, and fines — a reputational signal entirely absent from the site's news feed, which instead leads with feel-good stories about water stations, operator awards, and Aboriginal artwork. The site's news section contains a "Joint statement on the McCrae mediation" and "response to the McCrae landslide Inquiry report" — but these are buried among 40+ entries and the article pages themselves are 404s, so an AI crawler cannot read them.

External Signals

External searches returned zero indexed results for queries about South East Water's sewage spills, fines, or digital meter milestones. This suggests the site's news content is not indexed by search engines, possibly because the 404s prevent indexing. The DNS reveals a successfactors-site-verification TXT record (SAP SuccessFactors HR platform) and Proofpoint email handling. The site has active social media presences (Facebook, Instagram, LinkedIn, YouTube, Twitter/X) but no structured data linking them.

Surprising Findings

The entire news article archive is broken: every article URL matching /about-us/news/[slug]/ returns a 404 with an Apache Sling resource-not-found error. The news listing page renders article titles and dates that link to these dead URLs. The sitemap excludes news articles entirely, so crawlers cannot discover them via the sitemap either. The site uses a dual URL structure, with clean paths like /accounts-and-billing/water-meters/digital-meters/ alongside AEM short paths like /content/sew/au/en/digimeters that redirect, creating a fragmented URL ecosystem. The homepage canonical points to /home/ (a separate path, not a trailing-slash variant of the root) rather than the bare domain.
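
The listing-page and canonical checks described above can be reproduced with a short script. This is a sketch under stated assumptions: the class and function names are hypothetical, the slug used in the usage note is a placeholder rather than a real article, and the HEAD-based status check assumes the server answers HEAD the same way as GET.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

NEWS_PREFIX = "/about-us/news/"


class LinkCollector(HTMLParser):
    """Collect same-host news article links and the canonical URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.host = urlparse(base_url).netloc
        self.article_urls = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a" and attr.get("href"):
            url = urljoin(self.base_url, attr["href"])
            parsed = urlparse(url)
            # An article sits below the listing path; skip the listing itself.
            is_article = (
                parsed.netloc == self.host
                and parsed.path.startswith(NEWS_PREFIX)
                and parsed.path != NEWS_PREFIX
            )
            if is_article and url not in self.article_urls:
                self.article_urls.append(url)
        elif tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")


def http_status(url, timeout=10.0):
    """Return the HTTP status for a URL, treating error codes as data."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def dead_links(html, base_url, fetch=http_status):
    """List article URLs on the page whose status is 400 or above."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [url for url in collector.article_urls if fetch(url) >= 400]
```

Passing the HTML of /about-us/news/ to dead_links would list every article link that fails. The fetch parameter lets the status check be swapped for a stub, which keeps the link extraction testable without network access.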

Findings

  1. Entire news article archive returns 404 errors High

    Every individual news article URL on the site returns a 404 Apache Sling error. The news listing page still links to these dead paths, so AI crawlers following links from the listing page hit a wall of errors.

    What to change: Publish or restore the missing news article pages at their canonical URLs, or remove the broken links from the news listing page until the articles are restored.

  2. Zero JSON-LD structured data on any page High

    No page on the site contains JSON-LD schema markup. Missing Organization, WebSite, NewsArticle, FAQPage, BreadcrumbList, and other common schema types that help AI crawlers understand and cite content.

    What to change: Add JSON-LD structured data for Organization, WebSite, NewsArticle (on news pages), FAQPage (on FAQ sections), and BreadcrumbList across the site.

  3. No llms.txt file for AI crawler guidance Medium

    The site does not provide an llms.txt file, missing an opportunity to guide AI crawlers to key content and provide structured context.

    What to change: Create an llms.txt file at the root listing key pages and providing a brief site summary for AI crawlers.

  4. No AI-specific robots.txt directives Medium

    The robots.txt uses a single User-agent: * rule and does not name any AI crawlers (GPTBot, ClaudeBot, etc.), missing the chance to manage crawl behavior for AI bots.

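    What to change: Add named groups for AI crawlers in robots.txt that explicitly allow or disallow paths. Because a crawler that matches a named group ignores the User-agent: * group, repeat the existing Disallow rules in each named group. Crawl-delay is a non-standard directive that not every crawler honours, so treat it as optional.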

  5. News articles excluded from sitemap High

    The sitemap lists 544 URLs but contains zero news article URLs, preventing crawlers from discovering news content via sitemap.

    What to change: Include all published news article URLs in the sitemap to ensure crawlers can discover them.

  6. Dual URL structure creates fragmentation Medium

    The site uses both clean paths (e.g., /accounts-and-billing/water-meters/digital-meters/) and AEM short paths (e.g., /content/sew/au/en/digimeters) that redirect, creating a fragmented URL ecosystem that can confuse crawlers and dilute link equity.

    What to change: Consolidate to a single canonical URL structure, preferably the clean paths, and ensure all redirects are 301s to the canonical version.

  7. Homepage canonical URL points to /home/ instead of bare domain Low

    The homepage canonical URL is set to https://southeastwater.com.au/home/ rather than the site root, which can cause indexing confusion.

    What to change: Set the homepage canonical to https://southeastwater.com.au/ (the site root, with no /home/ path).

  8. Missing H2 heading hierarchy on homepage Low

    The homepage uses H1 and H3 headings but no H2 headings, creating a poor heading structure that reduces content clarity for crawlers and accessibility tools.

    What to change: Add H2 headings to section titles on the homepage to create a proper heading hierarchy.

  9. FAQ sections lack FAQPage schema Medium

    Pages like the digital meters page contain FAQ-style content but lack FAQPage structured data, reducing the chance that the questions and answers are extracted cleanly for search rich results or AI-generated answers.

    What to change: Add FAQPage JSON-LD schema to pages with question-and-answer content.

  10. Reputational content buried in broken news archive High

    The site's news section contains responses to regulatory scrutiny (the joint statement on the McCrae mediation and the response to the McCrae landslide Inquiry report), but these are buried among 40+ entries and the article pages themselves are 404s, so AI crawlers cannot access them to provide balanced context.

    What to change: Fix the broken news articles and consider featuring important regulatory responses more prominently on the site.

  11. No external search results found for key queries Medium

    External searches for South East Water's sewage spills, fines, and digital meter milestones returned zero indexed results, suggesting the site's news content is not indexed by search engines, possibly because the 404s prevent indexing.

    What to change: Ensure news articles are properly published and indexed, and consider building backlinks to key content.
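
For finding 3, an llms.txt file could be generated along the following lines. This is a sketch, not a prescribed format: llms.txt is a community proposal rather than a ratified standard, render_llms_txt is a hypothetical helper, and the summary text and link notes are placeholders to be replaced with wording the organisation approves. The two URLs are paths mentioned in this report.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt body: H1 title, blockquote summary, link sections.

    sections maps a heading to a list of (label, url, note) tuples;
    note may be an empty string.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for label, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{label}]({url}){suffix}")
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


BASE = "https://southeastwater.com.au"

# Placeholder summary and notes; only the two paths come from this report.
llms_txt = render_llms_txt(
    "South East Water",
    "Water utility serving Melbourne's southeast.",
    {
        "Key pages": [
            ("Digital meters",
             f"{BASE}/accounts-and-billing/water-meters/digital-meters/",
             "includes an FAQ section"),
            ("News", f"{BASE}/about-us/news/", ""),
        ],
    },
)
```

The resulting text would be served at /llms.txt. Listing the news section there is only useful once the article pages themselves resolve.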

What's working

  • All major AI crawlers allowed access — All tested AI crawlers (GPTBot, ClaudeBot, PerplexityBot, etc.) receive a full 200 response with identical content to a browser, with no UA-based blocking.
  • Recycled water page provides detailed, useful content — The recycled water page is ~700 words with clear use-cases and cost comparison (38% cheaper than drinking water), offering substantive content that AI crawlers can use to answer queries.
  • Digital meters page includes FAQ-style content — The digital meters page contains a FAQ section with questions and answers, providing structured information that could be enhanced with schema.
  • Sitemap exists with 544 URLs — The site has a sitemap listing 544 URLs, helping crawlers discover top-level pages.
  • News listing page provides article headlines and dates — The news listing page at /about-us/news/ contains 40+ article headlines and dates, giving crawlers a summary of recent news despite the broken detail pages.
  • Active social media profiles on multiple platforms — The site links to active Facebook, Instagram, LinkedIn, YouTube, and Twitter/X profiles, providing external signals and engagement channels.
