Haide · free tool
GEO Checklist
67 checks for getting cited in ChatGPT, Claude, Perplexity, and Google AI Overviews. Five tiers in order of leverage. Pick your industry to reweight every check by what actually moves the needle in your vertical — drawn from the OppAlerts study of 144 industries. Each check has a diagnostic step and a fix, so a non-SEO can hand items to the right person on the team without translation. Free, no email, progress saves in your browser.
How AI reads your page
Tier 1 · 11 checks
The writing rules behind AI citations. What gets quoted, what gets skipped, and where on the page the quotable lines need to live. Drawn from an analysis of 1.2 million ChatGPT answers in 2026.
Third-party trust signals
Tier 2 · 16 checks
Outside voices count more than your own. Reviews, comparisons, YouTube, Reddit, the Knowledge Graph. Without these, an AI model has no record that you exist beyond your own marketing pages.
Crawlability and machine-readability
Tier 3 · 16 checks
Whether AI crawlers can reach the page and parse what they find. Cloudflare has blocked the major AI bots by default since July 2025. Schema declares who you are. Speed and rendering decide whether the page is even read.
Measurement and tracking
Tier 4 · 12 checks
Google Analytics hides AI traffic. Search Console has a hidden regex filter that surfaces conversational queries. Server logs see what analytics never will. Build a baseline, watch the trend, make decisions from real numbers.
Industry calibration and evidence interpretation
Tier 5 · 12 checks
The dominant signal is different in every industry. Wikidata correlates +0.87 with citations in homeowners insurance and −0.80 in senior home care. Reddit is positive in oil-change chains, negative in pest control. Before applying tiers I–IV uniformly, look up your vertical and reweight. Drawn from the OppAlerts LLM Ranking Factors study — 13 signals × 144 industries × ChatGPT 5.4.
What to skip
Patterns widely promoted as GEO best practice that don't hold up under measurement. Ignore them — and flag anyone selling them as a service.
Don't create an llms.txt file
No AI system currently uses it (April 2026)
Why
llms.txt is a proposed standard, similar in spirit to robots.txt, for telling AI crawlers what content to prioritise. Despite widespread advocacy, no major AI platform uses it as of April 2026. Google representatives have publicly said AI systems don't read it. SE Ranking's analysis of nearly 300,000 domains found no correlation between llms.txt presence and AI citation frequency: adoption sits at 10.13%, while the measured effect is zero. Of the top 50 AI-cited domains, only one implements it. It's a time sink that produces nothing, and it signals to technical readers that the team is chasing hype rather than measuring outcomes.
How to check
- Check yourdomain.com/llms.txt. If it exists, decide whether to leave it or delete it (it does no harm, but it doesn't help either).
- More importantly: if someone on your team proposes creating one, point them to this item.
- If an agency or consultant is selling llms.txt as a GEO service, treat it as a red flag about their methodology.
Source: SE Ranking 300K-domain study + public Google guidance + RESONEO research
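The first check can be scripted. The sketch below is one possible approach, not part of the checklist itself: the function names and the User-Agent string are hypothetical, and a 200 response only shows that something is served at that path (some sites return a soft 404 page with status 200).

```python
from urllib.parse import urlsplit
import urllib.request


def llms_txt_url(site: str) -> str:
    """Build the /llms.txt URL for a bare domain or a full URL."""
    if "://" not in site:
        site = "https://" + site  # assumption: default to HTTPS
    parts = urlsplit(site)
    return f"{parts.scheme}://{parts.netloc}/llms.txt"


def has_llms_txt(site: str, timeout: float = 10.0) -> bool:
    """Return True if the site answers 200 for /llms.txt (needs network access)."""
    request = urllib.request.Request(
        llms_txt_url(site),
        headers={"User-Agent": "llms-txt-check/0.1"},  # hypothetical UA string
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors such as 404, DNS failures, TLS errors and timeouts.
        return False
```

Calling `has_llms_txt("yourdomain.com")` answers the yes/no question. Whether to keep or delete an existing file remains a judgement call.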
Why this checklist exists
The opportunity review Haide runs before any GEO engagement.
Every quarter, three or four data sources move the picture of how AI search actually works: Kevin Indig's analysis of 1.2 million ChatGPT answers, RESONEO's reverse engineering of GPT-5 and Claude, Cloudflare's report on which crawlers reach which sites, and our own server-log work with clients across SaaS and eCommerce. Each source produces a few specific, actionable rules. None of them, on their own, are a checklist.
We assembled the 67 checks because that's what we needed internally — a single document that survives the next platform shift and tells a non-specialist exactly what to do, why, and in what order. Tier 5 (the 10-minute industry calibration) sets weighting, Tier 3 fixes the technical foundation, Tier 1 covers how AI reads the page, Tier 4 makes progress measurable, and Tier 2 carries the slowest but highest-leverage compounding work.
Made public for the same reason Signal Check is public: seeing the checklist applied to your own site is more useful than any pitch deck we could write. Use it on your domain. Use it on a competitor's. Print the PDF and pin it above the desk of whoever owns the website.
Sources & methodology
Every check traces back to a study, a patent, or a server log.
Each item in this checklist carries an evidence badge — ChatGPT-specific for ChatGPT-specific data, Google AI Overviews for Google AI Overviews / AI Mode work, Multi-LLM for findings tested across multiple systems, and Universal for foundational best practices that apply regardless of which model crawls the page. Below are the studies, datasets, and field-testing programmes the 67 checks are drawn from. Hover any badge on a check to see the title; the source link sits inside each item's expanded view too.
- Kevin Indig: Growth Memo 2026 · ChatGPT-specific
Statistical analysis of 1.2 million ChatGPT answers identifying the writing patterns, opening structures, and on-page features that get quoted versus skipped.
- Scope: 1.2M ChatGPT answers, 2026
- Informs: Tier I openers, headings, lists, citation anchors (T1.1–T1.9, T2.1, T2.2)
- OppAlerts: LLM Ranking Factors · ChatGPT-specific
Spearman correlation between 13 web signals (Wikidata, Reddit, Common Crawl, Backlink Authority, etc.) and ChatGPT 5.4 recommendation patterns across 144 industries. Source for the entire industry-calibration layer and the Wikidata/Reddit sign-check rules.
- Scope: 13 signals × 144 industries × ChatGPT 5.4, March 2026
- Informs: Tier V (T5.1–T5.12) and the per-industry picker that reweights every check
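For readers unfamiliar with the statistic: Spearman correlation ranks both variables and then measures how closely the two rankings agree, from −1 (opposite order) to +1 (same order). The sketch below is a plain-Python illustration of that calculation, not the study's own code, and the sample numbers are invented for the example.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Invented sample: a signal score per brand and how often each brand was recommended.
signal_score = [1, 2, 3, 4, 5]
recommendations = [5, 6, 7, 8, 7]
print(round(spearman(signal_score, recommendations), 2))
```

A value near +0.87 means brands that rank high on the signal also tend to rank high on recommendations in that industry. A value near −0.80 means the opposite ordering. Neither proves causation.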
- RESONEO: AIO/AIM Inspector deep-dive · Google AI Overviews
Reverse-engineering of Google AI Overviews and AI Mode citation behaviour, plus GPT-5.4 site: query analysis. Identifies the specific schema, heading, and freshness signals Google's AI surfaces respond to.
- Scope: Google AI Overviews + AI Mode + GPT-5.4, Feb–March 2026
- Informs: T1.10, T2.4 (Reddit Posts), T2.7 (Knowledge Graph), T2.13, T3.7, T3.10, T4.1
- AirOps × Indig: 2026 State of AI Search · ChatGPT-specific
Cross-platform survey of AI-search citation patterns and traffic conversion rates. Used to calibrate measurement targets and the citation-source breakdown.
- Scope: Multi-LLM survey, 2026 (ChatGPT-weighted sample)
- Informs: T1.5 (citation source breakdown), T4.1 (AI traffic baseline)
- Cloudflare AI Audit (July 2025) · Multi-LLM
Default-block policy for AI bots across the Cloudflare network, plus the published roster of 13 known AI crawlers (GPTBot, ClaudeBot, Google-Extended, Perplexity-User, Applebot-Extended, etc.).
- Scope: 13-bot crawler roster, July 2025 onward
- Informs: T3.1 (Cloudflare bot management), T3.2 (robots.txt), T3.11 (full bot roster)
Official docs for ClaudeBot, Claude-User, anthropic-ai, GPTBot, ChatGPT-User, Google-Extended, Perplexity-User, Applebot-Extended, and the Apple/Amazon/ByteDance/Meta bots. Source-of-truth for which bots to allow versus block per use case.
- Scope: Multi-LLM crawler infrastructure, 2024–2026
- Informs: T3.3 (allow ClaudeBot for grounded answers), T3.4 (per-bot rules)
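Per-bot rules are easy to get wrong, so it helps to test what a robots.txt actually declares for each crawler. The sketch below uses Python's standard `urllib.robotparser`. The bot list is a subset of the names mentioned above, and the sample robots.txt and URL are invented for illustration. Note that robots.txt is only a declared policy: a CDN or firewall rule can still block a bot that robots.txt allows.

```python
from urllib.robotparser import RobotFileParser

# Subset of the crawler tokens named in this section.
AI_BOTS = [
    "GPTBot",
    "ChatGPT-User",
    "ClaudeBot",
    "Google-Extended",
    "Perplexity-User",
    "Applebot-Extended",
]


def bot_access(robots_txt: str, url: str, bots=AI_BOTS) -> dict:
    """Return {bot: allowed?} for one URL, based on the robots.txt text alone."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


# Invented example: GPTBot is blocked site-wide, everyone else is allowed.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for bot, allowed in bot_access(SAMPLE_ROBOTS_TXT, "https://example.com/pricing").items():
        print(f"{bot:20} {'allowed' if allowed else 'blocked'}")
```

To check a live site, fetch its robots.txt first and pass the text in. Keeping the parsing separate from the fetching makes the rule logic testable without network access.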
- Google patent US12013887B2: Information Gain · Google AI Overviews
Filed mechanism for ranking content based on contribution of new information beyond what's already in the index. Underpins why pages that repeat existing content get suppressed in AI Overviews.
- Scope: Google ranking systems
- Informs: T1.11 (Information Gain: write what's not already on the web)
The structured-data vocabulary and Google's rich-results documentation. Schema is recommended on this checklist only for classic Google rich results (FAQ, Article, BreadcrumbList, LocalBusiness) — Ahrefs' Aug 2025–Mar 2026 causal study (1,885 pages adding schema vs 4,000 controls) found no measurable effect on AI citations.
- Scope: Classic Google rich results, NOT AI visibility
- Informs: T3.5 (FAQ schema), T3.6 (Article schema), T3.7 (LocalBusiness)
Causal study comparing 1,885 pages that added JSON-LD vs 4,000 matched controls. Result: ChatGPT +2.2% (null), Google AI Mode +2.4% (null), Google AI Overviews −4.6% (significant decline). Five major AI systems extracted only visible HTML during retrieval. Schema is for classic search rich results, not AI citations.
- Scope: 5 AI systems, 5,885 pages, Aug 2025–Mar 2026
- Informs: Why this checklist does NOT recommend schema or llms.txt for AI visibility
Server-log analyses, multi-persona LLM testing, GSC regex patterns, and topical-authority methodology developed across active client engagements (Acronis, MobiSystems, PhoneArena, Bondex, and others, 2025–2026).
- Scope: Applied across active client work, multi-LLM
- Informs: T2.11, T2.12, T2.14, T4.2, T4.3, T4.10, T4.11, T4.12 and the methodology threading the full checklist
Considered and rejected
- JSON-LD / schema for AI citations: Ahrefs causal study (5,885 pages, 5 AI systems, Aug 2025–Mar 2026) found no measurable lift, with a small significant decline on Google AI Overviews. Schema stays in this checklist only for classic Google rich results.
- llms.txt: Not honored by any major AI crawler. SE Ranking 300K-domain study and public Google guidance both confirm zero measurable effect on indexing or citations. Listed as an explicit anti-pattern (A1).
- AI-content detectors as quality signals: No causal evidence that LLM citation engines penalise AI-assisted writing per se. The signal that matters is whether the passage is quotable and grounded, not whether a human typed it.
Frequently asked questions
GEO stands for Generative Engine Optimisation — the practice of getting your content cited inside AI answers from ChatGPT, Claude, Perplexity, Gemini, Copilot, and Google's AI Overviews. The mechanics overlap with classical SEO (technical foundations, schema, content quality) but the targets are different: SEO ranks pages, GEO earns the quote inside an AI's synthesised answer.
Pick your industry from the picker (top right of the toolbar) and every check gets reweighted by what actually correlates with ChatGPT recommendations in your vertical. Items that track the industry's strongest signal get a Critical badge and float to the top of their tier. Items tracking a signal that's *negative* in your vertical (e.g. Wikidata in senior home-care, Reddit in pest control) get a Counter-productive badge and drop to the bottom — so you don't waste budget on tactics that hurt you. The data is from the OppAlerts LLM Ranking Factors study (Spearman correlation across 13 web signals × 144 industries × ChatGPT 5.4, March 2026). The reweight only changes priority, never hides the universal items — Tier I writing rules and Tier III technical hygiene apply to everyone.
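The reweighting rule described above can be sketched in a few lines. This is an illustration of the stated behaviour, not the picker's actual code: the "Standard" badge name, the check IDs, the correlation values, and the reading of "strongest signal" as the highest positive correlation are all assumptions.

```python
def badge_for(signal, correlations):
    """Map the signal a check tracks to a badge for the chosen industry."""
    if signal is None or signal not in correlations:
        return "Standard"  # universal item: priority unchanged
    rho = correlations[signal]
    if rho < 0:
        return "Counter-productive"
    if rho > 0 and rho == max(correlations.values()):
        return "Critical"
    return "Standard"


def reweight(checks, correlations):
    """Reorder (check_id, signal) pairs within one tier; nothing is dropped."""
    rank = {"Critical": 0, "Standard": 1, "Counter-productive": 2}
    # sorted() is stable, so checks with the same badge keep their original order.
    return sorted(checks, key=lambda check: rank[badge_for(check[1], correlations)])


# Hypothetical tier and placeholder correlations (not study data).
tier = [("check-a", None), ("check-b", "reddit"), ("check-c", "wikidata")]
example_industry = {"wikidata": 0.6, "reddit": -0.4}

for check_id, signal in reweight(tier, example_industry):
    print(check_id, badge_for(signal, example_industry))
```

The output list has the same length as the input, which matches the rule that reweighting changes priority and never hides an item.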
OppAlerts covers 144 industries — the most-asked-about commercial verticals across SaaS, eCommerce, services, brands, insurance, healthcare, hospitality, and trades. If your exact niche isn't there, pick the closest analogue (e.g. Amazon-seller services has no direct match, but eCommerce SaaS or DTC brand verticals share most of the citation dynamics). When no industry is selected, the universal default applies and every check has equal weight — that's the baseline we built first, and it stays valid.
No, and you shouldn't try in one quarter. Run the Tier 5 lookup (Industry calibration) first: it takes 10 minutes, tells you which signals matter most in your vertical, and reweights everything downstream. Tier 3 (Crawlability) is non-negotiable: if AI crawlers can't reach you or parse your HTML, nothing else matters. After that comes Tier 1 (the writing rules), then Tier 4 (so you can measure progress), then Tier 2 (the slowest but most compounding tier). The realistic order is Tier 5 lookup in week one, Tier 3 fixes in week one, Tier 1 across a quarter, Tier 4 by month two, Tier 2 over the year.
No account. Your checked items are stored in your browser's local storage on your device. Reload the page, close the tab, come back next week — it's still there. Switch to a different browser or device and it's gone, because it never leaves your machine. We never see your progress and don't track it.
It's a clean, branded PDF that captures your current progress — checkboxes filled in based on what you've completed. Useful for circulating inside a team, attaching to a quarterly review, printing and pinning above a desk, or handing to a contractor as a brief. The print view strips out the interactive UI and lays the content out for paper, with each tier starting on a new page.
Quarterly at minimum. During an active growth phase — building out AI visibility from zero, migrating a site, or recovering from a crawlability regression — monthly. Tier 3 items (Crawlability) need a monthly pass regardless, because robots.txt edits, Cloudflare setting changes, and new bot launches can break coverage without warning. Tier 2 items (Trust signals) compound over 6–12 months, so you're checking progress, not re-working them each quarter.
Some. Several Tier 3 and Tier 4 items are scriptable: FCP (T3.4) via the PageSpeed API, crawler access (T3.1, T3.2, T3.11) via curl + user-agent sweeps, server-log analysis (T4.2) via GoAccess or a Vector/Loki stack, GA4 referrer regex (T4.1, T4.11) via the GA4 API. Our own tool at haide.digital/tools/signal-check automates six of the signals here — crawler access, answer capsule, structured data, First Contentful Paint, technical integrity, and entity clarity. Tier 1 writing quality and Tier 2 trust signals require human judgement. Don't trust anyone who claims to fully automate either.
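As an example of the scriptable end, a first pass at the server-log item can be done before setting up a log stack. The sketch below counts hits per AI crawler by matching user-agent substrings. It assumes the common "combined" access-log format, where the user agent is the last quoted field, and the token list is limited to bots named on this page. Google-Extended and Applebot-Extended are left out because, to our understanding, they are robots.txt control tokens rather than user agents that appear in logs. The sample lines are invented.

```python
import re
from collections import Counter

# User-agent substrings to look for (a subset of the roster named on this page).
AI_BOT_TOKENS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "Perplexity-User"]

# In the combined log format the user agent is the last double-quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_ai_bot_hits(log_lines, tokens=AI_BOT_TOKENS):
    """Count requests per AI crawler token; lines without a UA field are skipped."""
    hits = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for token in tokens:
            if token.lower() in user_agent:
                hits[token] += 1
                break  # count each request once
    return hits


# Invented sample lines for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Apr/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Apr/2026:10:00:02 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [01/Apr/2026:10:00:03 +0000] "GET /blog HTTP/1.1" 200 4096 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

if __name__ == "__main__":
    for bot, count in count_ai_bot_hits(SAMPLE_LOG).most_common():
        print(bot, count)
```

User-agent strings can be spoofed, so treat these counts as a baseline and verify suspicious traffic against each provider's published IP ranges where available.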
This page updates quarterly, so the bot list (T3.11), referrer regex (T4.11), and platform-specific items (T4.7, T2.16) stay current. The core principles don't change with new platforms — entity signals, crawlability, freshness, and extractable passages transfer directly. A new LLM from a new provider in Q3 2026 will almost certainly need the same five things the April 2026 roster needs: robots.txt access, a grounding source to search (Brave, Bing, or a proprietary index), Article/Organization schema to parse, visible dates, and passages it can lift without surrounding context.
AI-search behaviour shifts every few months. We re-run the source studies (Indig, RESONEO, AirOps × Indig 2026 State of AI Search, Cloudflare, our own server-log analyses) and refresh items where the underlying data has changed. The page lists the last update date in the changelog at the bottom. If something feels stale or you've spotted a new pattern, message Adrian on LinkedIn — that's how most updates start.
Yes. The checklist is written for founders, product owners, marketing leads, and SEO specialists alike. Every technical term is introduced before it's used. You don't need prior knowledge of BLUF, entity echoing, schema markup, or fan-out queries — the items explain them in plain English. The checklist is most useful when you can hand individual items to the right person on your team (writer, developer, ops) without translation.
Changelog
What's new in this update
- May 2026 — Added Tier V (Industry calibration) — 12 checks drawn from the OppAlerts LLM Ranking Factors study (Spearman correlations between 13 web signals and ChatGPT recommendation patterns across 144 industries, March 2026). The dominant signal varies dramatically by vertical — Wikidata correlates +0.87 with citations in homeowners insurance and −0.80 in senior home care; Reddit is positive in oil-change chains, negative in pest control. Tier V adds the sign-check rules and the per-industry top-signal lookup that turn one-size-fits-all GEO advice into per-vertical prioritisation. Total checks: 55 → 67.
- April 2026 — Expanded from 37 to 55 checks. Every item now separates How to check (diagnostic) from How to fix (implementation). Added 18 new items drawn from the RESONEO AIO/AIM Inspector deep-dive (Feb 2026), the AirOps × Kevin Indig 2026 State of AI Search report, the full April 2026 AI crawler roster (13 bots, including Anthropic's three-bot framework), and GPT-5.4 site-query behaviours from the March 11 2026 model update. Added an explicit anti-pattern: don't create llms.txt.
- Coverage added — Extractable passages (T1.10), Google's Information Gain patent (T1.11), entity consistency across platforms (T2.11), competitor co-occurrence (T2.12), visible publication dates (T2.13), subreddit mapping (T2.14), GPT-5.4 site-query coverage (T2.15), Brave Search indexing (T2.16), the full 13-bot crawler roster (T3.11), crawl-delay review (T3.12), sitemap + Bing submission (T3.13), Person schema identity graph (T3.14), entity-rich anchors (T3.15), AI agent readiness (T3.16), monthly citation matrix (T4.9), hidden grounding URL detection (T4.10), referrer regex refresh (T4.11), brand mention velocity (T4.12).
Keep going
You have the checklist. Now ship one tier.
Tier 3 first — fix the technical foundation. Then run Signal Check on the three pages that matter most. Then read how Haide runs GEO end-to-end.
Run Signal Check on a single page
Six signals, thirty seconds, any URL. The page-level companion to the 67-point checklist.
How Haide runs GEO
The engineering discipline behind the checklist — our approach to building organic growth for SaaS and eCommerce.
Book a growth review
Thirty minutes with Haide on what it would look like to run this whole checklist on your stack with a team behind it.
Need an engineered growth plan?
Tech stack review, opportunity report, twelve-month roadmap. No lock-in, full knowledge transfer.
Reply within one business day.