AI Site Grade
newstarthighland.org — AI Site Grade
New Start Highland's robots.txt uses a non-standard format that no major AI crawler understands, and the site lacks all structured data, leaving AI models with stale, incomplete knowledge.
New Start Highland has permissive crawler access but zero schema markup, a non-standard robots.txt, and a cold-knowledge gap that omits key programs like Highland Foodbank.
- Findings: 10
- Evidence checks: 20
- Completed: 30 May 2026
Analysis
New Start Highland: AI-Visibility Audit
The site's /robots.txt uses a non-standard "content signal" format that no major AI crawler natively understands, so AI bots cannot read or automatically enforce the policy it states. At the same time, every AI crawler tested receives a full 200 response with identical content. The result is an access posture that is permissive but unstructured.
Crawler Access
All eleven user-agents tested (ten AI bots: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User, Bytespider, Applebot-Extended, anthropic-ai, and Perplexity-User, plus a browser baseline) receive identical 200 responses from Cloudflare with the same 258,932-byte payload. No UA-based blocking, no JS shell, no thin-content redirect. The robots.txt contains zero standard User-agent / Disallow directives; instead it publishes a custom "content signal" framework (search/ai-input/ai-train flags) that no major crawler parses. llms.txt returns a 404 (Craft CMS 404 page, 24 KB). The sitemap.xml is present, valid, and contains 64 URLs including blog posts, service pages, and shop pages.
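The identical-response check described above can be reproduced with a small comparison routine. The following is a minimal sketch, not the audit's actual tooling: `fingerprint`, `responses_identical`, and the sample data are hypothetical, and the HTTP fetching is left out so the comparison logic runs offline.

```python
import hashlib


def fingerprint(status: int, body: bytes) -> tuple[int, int, str]:
    """Reduce a response to (status, byte length, SHA-256 of the body)."""
    return status, len(body), hashlib.sha256(body).hexdigest()


def responses_identical(responses: dict[str, tuple[int, bytes]]) -> bool:
    """Return True when every user-agent received the same status and body."""
    prints = {fingerprint(status, body) for status, body in responses.values()}
    return len(prints) == 1


# Hypothetical sample: in a real run each value would come from an HTTP
# request sent with the corresponding User-Agent header.
sample = {
    "GPTBot": (200, b"<html>same page</html>"),
    "ClaudeBot": (200, b"<html>same page</html>"),
    "browser-baseline": (200, b"<html>same page</html>"),
}

# Same sample with one crawler blocked, to show the check failing.
blocked = dict(sample, Bytespider=(403, b"Forbidden"))
```

Comparing a hash of the body rather than only its length avoids treating two different pages of equal size as the same response.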
Schema Posture
Every page examined — homepage, about-us, our-impact, shops, FAQs, blog posts, Highland Foodbank — contains zero JSON-LD schema of any type. No Organization, LocalBusiness, Charity, FAQPage, Article, BreadcrumbList, or WebSite schema is present. The homepage has no canonical tag. The site runs on Craft CMS behind Cloudflare with no visible security headers (no CSP, no HSTS).
Cold-Knowledge Gap
The LLM's prior knowledge about New Start Highland is stale and partially inaccurate. It describes a "Furniture for All" scheme that does not appear on the site, mentions a "community cafe" (The Yard Cafe exists but is not prominently described), and references a "2023 funding shortfall and public appeal" — no trace of this exists on the current site, which instead reports a record year in 2023 with 20% retail turnover growth and 80 staff. The model knows the charity is Inverness-based and focused on poverty/homelessness, but misses the Highland Foodbank operation (5,000 parcels/year), the Unique Ness upcycled interiors brand, the bike refurbishment program, and the Net Zero commitment. The site's own impact metrics (164,526 service uses, 9,249 items diverted from landfill) are absent from the model's knowledge.
Content & Structure
The homepage delivers 830 words of substantive text with a clear H1 ("Transforming Highland lives"), impact statistics, and testimonial quotes. The FAQs page uses genuine Q&A format but lacks FAQPage schema. Blog posts are dated from December 2023 through May 2026, indicating active publishing. The Highland Foodbank page is the richest content page (1,061 words) with a clear three-step referral process, centre locations, and a current "most needed items" list for May-June 2026. However, every page shares the identical meta description ("Helping people facing crisis, with furniture, training and housing support..."), which is a missed opportunity for differentiated search snippets.
External Signals
External search returns zero indexed results for the charity's name — a striking absence that suggests low off-domain citation density. The site links to Facebook, Instagram, LinkedIn, JustGiving, and a Northward studio credit, but no press coverage, review sites, or third-party articles surfaced in search. The blog references a "Big Issue Top 100 Changemaker" award for the charity's leader, but no external corroboration appeared in search results.
Findings
Robots.txt uses non-standard content signal format ignored by AI crawlers (High)
The robots.txt contains zero standard User-agent/Disallow directives, instead publishing a custom 'content signal' framework that no major AI crawler parses. As a result, crawlers cannot read or enforce the site's intended AI policy.
What to change: Replace the custom content signal format with standard User-agent and Disallow directives for each AI crawler, or pair it with per-crawler groups (for example, a Google-Extended group) that express any training opt-out in directives crawlers already honor.
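To illustrate why a signal-only file has no effect on a standard parser, the sketch below runs two robots.txt bodies through Python's built-in `urllib.robotparser`. Both bodies are assumptions: the exact syntax of the live file is not quoted in this report, and the `GPTBot` rule is a hypothetical policy used only for illustration, not a recommendation.

```python
from urllib.robotparser import RobotFileParser

SITE = "https://newstarthighland.org/"

# Assumed shape of a signal-only file; the syntax on the live site may differ.
SIGNAL_ONLY = """\
# Content signals
Content-Signal: search=yes, ai-input=yes, ai-train=no
"""

# Hypothetical policy written with standard groups (illustration only).
STANDARD = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(robots_txt: str, agent: str, url: str = SITE) -> bool:
    """Ask a standard robots.txt parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

With the signal-only body the parser finds no groups and allows every agent, whereas the standard body blocks only the named crawler and leaves all others allowed.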
llms.txt returns 404, missing AI content guidance (Medium)
The site lacks an llms.txt file, a proposed convention (not a formal standard) for giving AI crawlers a curated list of content pages. The 404 response means no guidance is given to AI bots about which pages to prioritize.
What to change: Create an llms.txt file listing key pages such as about-us, how-we-help, and impact pages to guide AI crawlers.
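A file of this kind could be generated from a short page list. The sketch below follows the layout described in the llms.txt proposal (an H1 title, a blockquote summary, then a linked list under an H2). The function name, the URL paths, and the page notes are assumptions for illustration and should be replaced with the site's real URLs.

```python
def build_llms_txt(title: str, summary: str,
                   pages: list[tuple[str, str, str]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, one link list."""
    lines = [f"# {title}", "", f"> {summary}", "", "## Key pages", ""]
    for name, url, note in pages:
        lines.append(f"- [{name}]({url}): {note}")
    return "\n".join(lines) + "\n"


# Hypothetical paths; confirm each against the live sitemap before publishing.
pages = [
    ("About us", "https://newstarthighland.org/about-us", "Who the charity is"),
    ("Our impact", "https://newstarthighland.org/our-impact", "Impact metrics"),
    ("FAQs", "https://newstarthighland.org/faqs", "Common questions"),
]

llms_txt = build_llms_txt(
    "New Start Highland",
    "Inverness-based charity focused on poverty and homelessness.",
    pages,
)
```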
No JSON-LD schema on any page (High)
Every page examined (homepage, about-us, our-impact, shops, FAQs, blog posts, Highland Foodbank) contains zero JSON-LD schema of any type. Missing Organization, LocalBusiness, Charity, FAQPage, Article, BreadcrumbList, and WebSite schema.
What to change: Add JSON-LD structured data: Organization schema on the homepage, NGO schema on about-us (schema.org has no dedicated Charity type), FAQPage schema on /faqs, Article schema on blog posts, and LocalBusiness schema on shop pages.
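As a starting point for the homepage markup, the sketch below builds an organization-level JSON-LD block and wraps it in a script tag. It uses the schema.org `NGO` type as an assumption, since that is a schema.org organization subtype often applied to charities. The helper name is hypothetical and the `sameAs` URLs are placeholders, not the charity's real profile addresses.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Return a <script> tag holding organization-level JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "NGO",          # assumption: closest schema.org type
        "name": name,
        "url": url,
        "sameAs": same_as,       # links to official off-site profiles
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder profile URLs; substitute the real social and donation pages.
snippet = organization_jsonld(
    "New Start Highland",
    "https://newstarthighland.org/",
    ["https://example.org/facebook-profile", "https://example.org/linkedin-profile"],
)
```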
Homepage lacks canonical tag (Low)
The homepage has no canonical tag, which can lead to duplicate content issues if the site is accessible via multiple URLs (e.g., with or without www).
What to change: Add a self-referencing canonical tag to the homepage.
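Whether a page declares a canonical URL can be checked with the standard-library HTML parser. This is a minimal sketch with hypothetical names (`CanonicalFinder`, `find_canonical`); the sample markup is invented for the example and is not copied from the site.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> encountered."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Invented sample pages: one with a self-referencing canonical, one without.
with_tag = '<head><link rel="canonical" href="https://newstarthighland.org/" /></head>'
without_tag = "<head><title>Home</title></head>"
```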
LLM knowledge is stale and omits key programs (High)
The LLM's prior knowledge about New Start Highland is outdated and incomplete. It mentions a 'Furniture for All' scheme and a 2023 funding shortfall that do not appear on the current site, while missing the Highland Foodbank, Unique Ness upcycled interiors brand, bike refurbishment program, and Net Zero commitment.
What to change: Publish a dedicated 'About' or 'Our Work' page with comprehensive, up-to-date information and ensure it is well-linked and included in the sitemap. Consider adding an llms.txt file to guide AI crawlers to this content.
All pages share the same meta description (Medium)
Every page on the site uses the identical meta description: 'Helping people facing crisis, with furniture, training and housing support...'. This is a missed opportunity for differentiated search snippets and may reduce click-through rates.
What to change: Write unique meta descriptions for each page that summarize the specific content of that page.
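Shared descriptions are easy to detect once each page's meta description has been collected. The sketch below groups pages by description and reports any description used more than once. The function name, paths, and description strings are hypothetical sample data, not the site's actual values.

```python
from collections import defaultdict


def duplicate_descriptions(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each repeated description to the sorted list of paths that use it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path, description in pages.items():
        # Normalise whitespace and case so trivial differences do not hide reuse.
        groups[description.strip().lower()].append(path)
    return {d: sorted(paths) for d, paths in groups.items() if len(paths) > 1}


# Hypothetical crawl result: two pages share a description, one is unique.
crawl = {
    "/": "Shared site-wide description",
    "/faqs": "Shared site-wide description ",
    "/shops": "Where to find our shops",
}
report = duplicate_descriptions(crawl)
```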
Zero external search results for the charity name (High)
Web searches for 'New Start Highland' return zero indexed results, indicating very low off-domain citation density. No press coverage, review sites, or third-party articles surfaced.
What to change: Encourage local press coverage, check that the charity's entries in directories such as the Scottish Charity Register (maintained by OSCR) are complete and link back to the site, and seek backlinks from partner organizations.
FAQs page lacks FAQPage schema (Medium)
The /faqs page uses genuine Q&A format but does not include FAQPage structured data, which would enable rich results in search and AI-generated answers.
What to change: Add FAQPage JSON-LD schema to the /faqs page with each question and answer.
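The FAQPage structure nests each question as a `Question` entity with an `acceptedAnswer`. The sketch below builds that structure from question-and-answer pairs. The helper name is hypothetical and the sample pair is a placeholder, since the report does not quote the site's FAQ text.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Placeholder content; use the real questions and answers from /faqs.
faq = faq_jsonld([("Example question?", "Example answer.")])
faq_json = json.dumps(faq, indent=2)
```

The answer text in the markup should match the text visible on the page.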
No CSP or HSTS security headers (Low)
The site lacks Content Security Policy and HTTP Strict Transport Security headers, which are not directly related to AI visibility but indicate a lack of security best practices that could affect trust signals.
What to change: Implement CSP and HSTS headers to improve security posture.
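A header check of this kind reduces to a case-insensitive lookup on the response headers. The sketch below is illustrative only: the function name and the sample header sets are hypothetical, and actual header values should be chosen for the site rather than copied from here.

```python
REQUIRED_HEADERS = ("content-security-policy", "strict-transport-security")


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return the required security headers absent from a response."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    return [name for name in REQUIRED_HEADERS if name not in present]


# Hypothetical responses: one with neither header, one with both.
bare = {"Content-Type": "text/html", "Server": "cloudflare"}
hardened = {
    "Content-Security-Policy": "default-src 'self'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
}
```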
Blog posts lack Article schema (Medium)
Blog posts, such as the record year announcement, do not include Article or NewsArticle structured data, which would help search engines and AI understand the content type and publication details.
What to change: Add Article schema to blog posts with headline, datePublished, author, and image.
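The four fields named above map directly onto schema.org `Article` properties. The sketch below assembles them, treating the author as an organization, which is an assumption. The helper name and every sample value (headline, date, image URL) are placeholders rather than details of a real post.

```python
import json
from datetime import date


def article_jsonld(headline: str, published: date,
                   author_name: str, image_url: str) -> dict:
    """Build a schema.org Article object for a blog post."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),  # ISO 8601, e.g. 2024-01-15
        "author": {"@type": "Organization", "name": author_name},
        "image": image_url,
    }


# Placeholder values; take the real ones from the CMS entry for each post.
post = article_jsonld(
    "Example headline",
    date(2024, 1, 15),
    "New Start Highland",
    "https://example.org/image.jpg",
)
post_json = json.dumps(post, indent=2)
```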
What's working
- All AI crawlers receive full 200 responses with identical content — Every AI bot tested (GPTBot, ClaudeBot, PerplexityBot, etc.) receives a 200 response with the same full HTML payload. No blocking, no JS shell, no thin-content redirect.
- Sitemap.xml is present and contains 64 URLs — The sitemap is valid and includes blog posts, service pages, and shop pages, helping crawlers discover content.
- Homepage delivers 830 words of substantive text with clear H1 — The homepage has a clear H1 ('Transforming Highland lives'), impact statistics, and testimonial quotes, providing rich content for AI models.
- Blog is actively published with posts through May 2026 — Blog posts are dated from December 2023 to May 2026, indicating ongoing content creation that can attract AI attention.
- Highland Foodbank page is content-rich with 1,061 words — The foodbank page provides a clear three-step referral process, centre locations, and a current 'most needed items' list, offering valuable structured information.
- Site links to Facebook, Instagram, LinkedIn, and JustGiving — External social media and donation platform links are present, providing off-site signals and potential backlink sources.
- Impact page publishes specific metrics (164,526 service uses, 9,249 items diverted) — The 'Our Impact' page includes concrete numbers that can be used by AI to generate accurate summaries.
- Site uses Cloudflare CDN for performance and security — Cloudflare provides caching and DDoS protection, improving site reliability and load times for crawlers.