Haide · free tool

GEO Checklist

67 checks for getting cited in ChatGPT, Claude, Perplexity, and Google AI Overviews. Five tiers in order of leverage. Pick your industry to reweight every check by what actually moves the needle in your vertical — drawn from the OppAlerts study of 144 industries. Each check has a diagnostic step and a fix, so a non-SEO can hand items to the right person on the team without translation. Free, no email, progress saves in your browser.

  • Kevin Indig: analysis of 1.2M ChatGPT answers
  • AirOps × Indig: 2026 State of AI Search
  • RESONEO: AIO/AIM Inspector deep-dive (Feb 2026)
  • OppAlerts: LLM Ranking Factors (Mar 2026)
  • Cloudflare AI Audit (July 2025)
  • Haide server-log analyses (2025–2026)

How AI reads your page

Tier 1 · 11 checks

The writing rules behind AI citations. What gets quoted, what gets skipped, and where on the page the quotable lines need to live. Drawn from an analysis of 1.2 million ChatGPT answers in 2026.

Third-party trust signals

Tier 2 · 16 checks

Outside voices count more than your own. Reviews, comparisons, YouTube, Reddit, the Knowledge Graph. Without these, an AI model has no record that you exist beyond your own marketing pages.

Crawlability and machine-readability

Tier 3 · 16 checks

Whether AI crawlers can reach the page and parse what they find. Cloudflare has blocked the major AI bots by default since July 2025. Schema declares who you are. Speed and rendering decide whether the page is even read.
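As a rough illustration of the crawlability check, the sketch below parses a robots.txt body with Python's standard `urllib.robotparser` and lists which AI crawler tokens are disallowed. The bot names come from this page; the sample robots.txt, the `example.com` URL, and the `blocked_bots` helper are assumptions made for the example. It only inspects robots.txt rules, so a block applied at the CDN level would not show up here.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens named on this page; adjust to the roster you care about.
AI_BOTS = [
    "GPTBot",
    "ChatGPT-User",
    "ClaudeBot",
    "Google-Extended",
    "Perplexity-User",
    "Applebot-Extended",
]

# Hypothetical robots.txt used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_bots(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI bot tokens that robots_txt disallows for the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


print(blocked_bots(SAMPLE_ROBOTS_TXT))
```

To run it against a live site, pass in the text of your own robots.txt and a representative page URL.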

Measurement and tracking

Tier 4 · 12 checks

Google Analytics hides AI traffic. Search Console has a hidden regex filter that surfaces conversational queries. Server logs see what analytics never will. Build a baseline, watch the trend, make decisions from real numbers.
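A minimal sketch of the server-log side of this tier, assuming access logs in the common "combined" format where the user agent is the last quoted field. The crawler names are taken from this page; the sample log lines, IP addresses, and the `count_ai_hits` helper are invented for illustration. Google-Extended and Applebot-Extended are left out on the assumption that they act as robots.txt control tokens rather than request user agents.

```python
import re
from collections import Counter

# Crawler names taken from the checklist; extend as the roster changes.
AI_AGENTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "Perplexity-User"]

# In the combined log format the user agent is the last double-quoted field.
QUOTED_FIELD = re.compile(r'"([^"]*)"')

# Made-up log lines for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Apr/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '203.0.113.6 - - [01/Apr/2026:10:00:05 +0000] "GET /blog HTTP/1.1" 200 7340 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '198.51.100.9 - - [01/Apr/2026:10:01:00 +0000] "GET /docs HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
    '192.0.2.44 - - [01/Apr/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 9000 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0 Safari/537.36"',
]


def count_ai_hits(log_lines):
    """Count requests per AI crawler by matching names in the user-agent field."""
    counts = Counter()
    for line in log_lines:
        fields = QUOTED_FIELD.findall(line)
        if not fields:
            continue  # not a combined-format line
        user_agent = fields[-1].lower()
        for name in AI_AGENTS:
            if name.lower() in user_agent:
                counts[name] += 1
                break
    return counts


print(count_ai_hits(SAMPLE_LOG))
```

Running the same count week over week gives the kind of baseline and trend this tier describes. Matching on user-agent strings alone can be spoofed, so treat the numbers as indicative.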

Industry calibration and evidence interpretation

Tier 5 · 12 checks

The dominant signal is different in every industry. Wikidata correlates +0.87 with citations in homeowners insurance and −0.80 in senior home care. Reddit is positive in oil-change chains, negative in pest control. Before applying Tiers 1–4 uniformly, look up your vertical and reweight. Drawn from the OppAlerts LLM Ranking Factors study: 13 signals × 144 industries × ChatGPT 5.4.
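To make the correlation figures concrete, here is a small sketch of how a Spearman rank correlation can be computed and turned into a sign check. It is not the OppAlerts methodology. The `sign_check` helper and its 0.3 threshold are assumptions chosen for illustration, and the only real figures used are the two Wikidata correlations quoted above.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def sign_check(rho, threshold=0.3):
    """Hypothetical rule: map a correlation to a weighting decision."""
    if rho >= threshold:
        return "prioritise"
    if rho <= -threshold:
        return "deprioritise"
    return "neutral"


# The two Wikidata correlations quoted in the text above.
print(sign_check(0.87))   # homeowners insurance
print(sign_check(-0.80))  # senior home care
```

The point of the sign check is that the same signal can call for opposite actions in two verticals, which is why the lookup comes before any uniform application of the other tiers.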

Don't do this

What to skip

Patterns widely promoted as GEO best practice that don't hold up under measurement. Ignore them — and flag anyone selling them as a service.

  • Don't create an llms.txt file

    No AI system currently uses it (April 2026)

    Why

    llms.txt is a proposed standard, similar in spirit to robots.txt, for telling AI crawlers what content to prioritise. Despite widespread advocacy, no major AI platform uses it as of April 2026. Google representatives have publicly said AI systems don't read it. SE Ranking's analysis of nearly 300,000 domains found no correlation between llms.txt presence and AI citation frequency: adoption sits at 10.13%, while the measured effect is zero. Of the top 50 AI-cited domains, only one implements it. It's a time sink that produces nothing, and it signals to technical readers that the team is chasing hype rather than measuring outcomes.

    How to check

    1. Check yourdomain.com/llms.txt. If it exists, decide whether to keep or delete it (it does no harm, but it doesn't help either).
    2. More importantly, if someone on your team proposes creating one, point them to this item.
    3. If an agency or consultant is selling llms.txt as a GEO service, treat that as a red flag about their methodology.

    Source: SE Ranking 300K-domain study + public Google guidance + RESONEO research
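    The first step above can be scripted. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, and it assumes the server answers HEAD requests and returns a real 404 for missing files, which not every site does.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def llms_txt_url(domain: str) -> str:
    """Build the llms.txt URL for a bare domain or a full site URL."""
    host = domain.strip().removeprefix("https://").removeprefix("http://").strip("/")
    return f"https://{host}/llms.txt"


def has_llms_txt(domain: str, timeout: float = 10.0) -> bool:
    """Return True if the site answers a HEAD request for /llms.txt with 200."""
    request = Request(llms_txt_url(domain), method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (HTTPError, URLError):
        # Missing file, refused HEAD request, or network failure.
        return False
```

    Calling `has_llms_txt("yourdomain.com")` makes a live request, so run it from a machine that can reach the site. A `False` result can also mean the request itself failed.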

Why this checklist exists

The opportunity review Haide runs before any GEO engagement.

Every quarter, three or four data sources move the picture of how AI search actually works: Kevin Indig's analysis of 1.2 million ChatGPT answers, RESONEO's reverse engineering of GPT-5 and Claude, Cloudflare's report on which crawlers reach which sites, and our own server-log work with clients across SaaS and eCommerce. Each source produces a few specific, actionable rules. None of them, on its own, is a checklist.

We assembled the 67 checks because that's what we needed internally — a single document that survives the next platform shift and tells a non-specialist exactly what to do, why, and in what order. Tier 5 (the 10-minute industry calibration) sets weighting, Tier 3 fixes the technical foundation, Tier 1 covers how AI reads the page, Tier 4 makes progress measurable, and Tier 2 carries the slowest but highest-leverage compounding work.

Made public for the same reason Signal Check is public: seeing the checklist applied to your own site is more useful than any pitch deck we could write. Use it on your domain. Use it on a competitor's. Print the PDF and pin it above the desk of whoever owns the website.

Sources & methodology

Every check traces back to a study, a patent, or a server log.

Each item in this checklist carries one of four evidence badges: ChatGPT-specific (findings from ChatGPT data only), Google AI Overviews (AI Overviews and AI Mode work), Multi-LLM (findings tested across multiple systems), and Universal (foundational best practices that apply regardless of which model crawls the page). Below are the studies, datasets, and field-testing programmes the 67 checks are drawn from. Hover any badge on a check to see the title; the source link also sits inside each item's expanded view.

  1. Kevin Indig: statistical analysis of 1.2 million ChatGPT answers identifying the writing patterns, opening structures, and on-page features that get quoted versus skipped.

    Scope: 1.2M ChatGPT answers, 2026
    Informs: Tier 1 openers, headings, lists, citation anchors (T1.1–T1.9, T2.1, T2.2)
  2. OppAlerts LLM Ranking Factors: Spearman correlation between 13 web signals (Wikidata, Reddit, Common Crawl, Backlink Authority, etc.) and ChatGPT 5.4 recommendation patterns across 144 industries. Source for the entire industry-calibration layer and the Wikidata/Reddit sign-check rules.

    Scope: 13 signals × 144 industries × ChatGPT 5.4, March 2026
    Informs: Tier 5 (T5.1–T5.12) and the per-industry picker that reweights every check
  3. RESONEO: reverse-engineering of Google AI Overviews and AI Mode citation behaviour, plus GPT-5.4 site: query analysis. Identifies the specific schema, heading, and freshness signals Google's AI surfaces respond to.

    Scope: Google AI Overviews + AI Mode + GPT-5.4, Feb–March 2026
    Informs: T1.10, T2.4 (Reddit Posts), T2.7 (Knowledge Graph), T2.13, T3.7, T3.10, T4.1
  4. AirOps × Kevin Indig, 2026 State of AI Search: cross-platform survey of AI-search citation patterns and traffic conversion rates. Used to calibrate measurement targets and the citation-source breakdown.

    Scope: Multi-LLM survey, 2026 (ChatGPT-weighted sample)
    Informs: T1.5 (citation source breakdown), T4.1 (AI traffic baseline)
  5. Cloudflare AI Audit: default-block policy for AI bots across the Cloudflare network, plus the published roster of 13 known AI crawlers (GPTBot, ClaudeBot, Google-Extended, Perplexity-User, Applebot-Extended, etc.).

    Scope: 13-bot crawler roster, July 2025 onward
    Informs: T3.1 (Cloudflare bot management), T3.2 (robots.txt), T3.11 (full bot roster)
  6. Official docs for ClaudeBot, Claude-User, anthropic-ai, GPTBot, ChatGPT-User, Google-Extended, Perplexity-User, Applebot-Extended, and the Apple/Amazon/ByteDance/Meta bots. Source of truth for which bots to allow versus block per use case.

    Scope: Multi-LLM crawler infrastructure, 2024–2026
    Informs: T3.3 (allow ClaudeBot for grounded answers), T3.4 (per-bot rules)
  7. Google's Information Gain patent: filed mechanism for ranking content based on its contribution of new information beyond what's already in the index. Underpins why pages that repeat existing content get suppressed in AI Overviews.

    Scope: Google ranking systems
    Informs: T1.11 (Information Gain: write what's not already on the web)
  8. The structured-data vocabulary and Google's rich-results documentation. Schema is recommended on this checklist only for classic Google rich results (FAQ, Article, BreadcrumbList, LocalBusiness), because Ahrefs' Aug 2025–Mar 2026 causal study (1,885 pages adding schema vs 4,000 controls) found no measurable lift in AI citations.

    Scope: Classic Google rich results, NOT for AI visibility
    Informs: T3.5 (FAQ schema), T3.6 (Article schema), T3.7 (LocalBusiness)
  9. Ahrefs: causal study comparing 1,885 pages that added JSON-LD vs 4,000 matched controls. Result: ChatGPT +2.2% (null), Google AI Mode +2.4% (null), Google AI Overviews −4.6% (significant decline). Five major AI systems extracted only visible HTML during retrieval. Schema is for classic search rich results, not AI citations.

    Scope: 5 AI systems, 5,885 pages, Aug 2025–Mar 2026
    Informs: Why this checklist does NOT recommend schema or llms.txt for AI visibility
  10. Haide: server-log analyses, multi-persona LLM testing, GSC regex patterns, and topical-authority methodology developed across active client engagements (Acronis, MobiSystems, PhoneArena, Bondex, and others, 2025–2026).

    Scope: Applied across active client work, multi-LLM
    Informs: T2.11, T2.12, T2.14, T4.2, T4.3, T4.10, T4.11, T4.12 and the methodology threading the full checklist

Considered and rejected

  • JSON-LD / schema for AI citations: Ahrefs causal study (5,885 pages, 5 AI systems, Aug 2025–Mar 2026) found no measurable lift, with a small but statistically significant decline on Google AI Overviews. Schema stays in this checklist only for classic Google rich results.
  • llms.txt: Not honored by any major AI crawler. SE Ranking 300K-domain study and public Google guidance both confirm zero measurable effect on indexing or citations. Listed as an explicit anti-pattern (A1).
  • AI-content detectors as quality signals: No causal evidence that LLM citation engines penalise AI-assisted writing per se. The signal that matters is whether the passage is quotable and grounded, not whether a human typed it.

Changelog

What's new in this update

  • May 2026: Added Tier 5 (Industry calibration), 12 checks drawn from the OppAlerts LLM Ranking Factors study (Spearman correlations between 13 web signals and ChatGPT recommendation patterns across 144 industries, March 2026). The dominant signal varies sharply by vertical: Wikidata correlates +0.87 with citations in homeowners insurance and −0.80 in senior home care; Reddit is positive in oil-change chains and negative in pest control. Tier 5 adds the sign-check rules and the per-industry top-signal lookup that turn one-size-fits-all GEO advice into per-vertical prioritisation. Total checks: 55 → 67.
  • April 2026: Expanded from 37 to 55 checks. Every item now separates How to check (diagnostic) from How to fix (implementation). Added 18 new items drawn from the RESONEO AIO/AIM Inspector deep-dive (Feb 2026), the AirOps × Kevin Indig 2026 State of AI Search report, the full April 2026 AI crawler roster (13 bots, including Anthropic's three-bot framework), and GPT-5.4 site-query behaviours from the March 11, 2026 model update. Added an explicit anti-pattern: don't create llms.txt.
  • Coverage added: Extractable passages (T1.10), Google's Information Gain patent (T1.11), entity consistency across platforms (T2.11), competitor co-occurrence (T2.12), visible publication dates (T2.13), subreddit mapping (T2.14), GPT-5.4 site-query coverage (T2.15), Brave Search indexing (T2.16), the full 13-bot crawler roster (T3.11), crawl-delay review (T3.12), sitemap + Bing submission (T3.13), Person schema identity graph (T3.14), entity-rich anchors (T3.15), AI agent readiness (T3.16), monthly citation matrix (T4.9), hidden grounding URL detection (T4.10), referrer regex refresh (T4.11), brand mention velocity (T4.12).
