What does GEO actually mean?

GEO stands for Generative Engine Optimisation — the practice of getting your content cited inside AI answers from ChatGPT, Claude, Perplexity, Gemini, Copilot, and Google's AI Overviews. The mechanics overlap with classical SEO (technical foundations, schema, content quality) but the targets are different: SEO ranks pages, GEO earns the quote inside an AI's synthesised answer.

What does the industry calibration do?

Pick your industry from the picker (top right of the toolbar) and every check gets reweighted by what actually correlates with ChatGPT recommendations in your vertical. Items that track the industry's strongest signal get a Critical badge and float to the top of their tier. Items tracking a signal that's *negative* in your vertical (e.g. Wikidata in senior home-care, Reddit in pest control) get a Counter-productive badge and drop to the bottom — so you don't waste budget on tactics that hurt you. The data is from the OppAlerts LLM Ranking Factors study (Spearman correlation across 13 web signals × 144 industries × ChatGPT 5.4, March 2026). The reweight only changes priority, never hides the universal items — Tier I writing rules and Tier III technical hygiene apply to everyone.
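The reordering described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Check` class, the badge strings, and the sample items are hypothetical, and the badges are assigned arbitrarily rather than taken from any real industry profile.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical badge ranks: a lower rank sorts earlier inside a tier.
BADGE_RANK = {"critical": 0, None: 1, "counter-productive": 2}

@dataclass
class Check:
    id: str
    tier: int
    badge: Optional[str] = None  # "critical", "counter-productive", or None

def reweight(checks):
    """Reorder checks inside each tier; no check is ever removed."""
    # sorted() is stable, so checks sharing a tier and badge keep
    # their original relative order.
    return sorted(checks, key=lambda c: (c.tier, BADGE_RANK[c.badge]))

# Illustrative items only; the badges here are made up.
checks = [
    Check("T2.4", 2, "counter-productive"),
    Check("T2.5", 2),
    Check("T2.7", 2, "critical"),
    Check("T1.1", 1),
]
ordered = [c.id for c in reweight(checks)]
print(ordered)
```

Sorting rather than filtering mirrors the rule that the reweight changes priority only: every check is still present in the output.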

What if my industry isn't on the list?

OppAlerts covers 144 industries — the most-asked-about commercial verticals across SaaS, eCommerce, services, brands, insurance, healthcare, hospitality, and trades. If your exact niche isn't there, pick the closest analogue (e.g. Amazon-seller services has no direct match, but eCommerce SaaS or DTC brand verticals share most of the citation dynamics). When no industry is selected, the universal default applies and every check has equal weight — that's the baseline we built first, and it stays valid.

Do I need to do all 70 things?

No, and you shouldn't try in one quarter. Run Tier 5 (Industry calibration) first to learn which signals matter most for your vertical: the lookup takes 10 minutes and reweights everything downstream. Tier 3 (Crawlability) is non-negotiable: if AI crawlers can't reach you or parse your HTML, nothing else matters. After Tier 3 comes Tier 1 (the writing rules), then Tier 4 (so you can measure progress), then Tier 2 (the slowest but most compounding tier). A realistic schedule is the Tier 5 lookup and the Tier 3 fixes in week one, Tier 4 by month two, Tier 1 across a quarter, and Tier 2 over the year.

How is my progress saved? Do I need an account?

No account. Your checked items are stored in your browser's local storage on your device. Reload the page, close the tab, come back next week — it's still there. Switch to a different browser or device and it's gone, because it never leaves your machine. We never see your progress and don't track it.

What's the print version for?

It's a clean, branded PDF that captures your current progress — checkboxes filled in based on what you've completed. Useful for circulating inside a team, attaching to a quarterly review, printing and pinning above a desk, or handing to a contractor as a brief. The print view strips out the interactive UI and lays the content out for paper, with each tier starting on a new page.

How often should I re-run this checklist?

Quarterly at minimum. During an active growth phase — building out AI visibility from zero, migrating a site, or recovering from a crawlability regression — monthly. Tier 3 items (Crawlability) need a monthly pass regardless, because robots.txt edits, Cloudflare setting changes, and new bot launches can break coverage without warning. Tier 2 items (Trust signals) compound over 6–12 months, so you're checking progress, not re-working them each quarter.

Can I automate any of these checks?

Some. Several Tier 3 and Tier 4 items are scriptable: FCP (T3.4) via the PageSpeed API, crawler access (T3.1, T3.2, T3.11) via curl + user-agent sweeps, server-log analysis (T4.2) via GoAccess or a Vector/Loki stack, GA4 referrer regex (T4.1, T4.11) via the GA4 API. Our own tool at haide.digital/tools/signal-check automates six of the signals here — crawler access, answer capsule, structured data, First Contentful Paint, technical integrity, and entity clarity. Tier 1 writing quality and Tier 2 trust signals require human judgement. Don't trust anyone who claims to fully automate either.
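As an illustration of the referrer-regex idea, here is a minimal Python sketch for prototyping a pattern before pasting it into an analytics tool. The host list is an assumption made for this example and is not the checklist's own T4.11 pattern; `is_ai_referrer` is a hypothetical helper name.

```python
import re
from urllib.parse import urlparse

# Assumed list of AI-assistant referrer hosts, for illustration only.
# Review it against your own referrer data before relying on it.
AI_REFERRER = re.compile(
    r"(^|\.)(chatgpt\.com|chat\.openai\.com|perplexity\.ai|claude\.ai|"
    r"gemini\.google\.com|copilot\.microsoft\.com)$",
    re.IGNORECASE,
)

def is_ai_referrer(referrer_url: str) -> bool:
    """Return True when the referrer's hostname matches the AI host pattern."""
    host = urlparse(referrer_url).hostname or ""
    return bool(AI_REFERRER.search(host))

print(is_ai_referrer("https://www.perplexity.ai/search?q=geo"))
print(is_ai_referrer("https://www.google.com/"))
```

Anchoring the pattern on a leading dot or the start of the hostname keeps look-alike domains (for example a host that merely ends in the same letters) from matching.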

What if a new AI platform launches after I finish the checklist?

This page updates quarterly, so the bot list (T3.11), referrer regex (T4.11), and platform-specific items (T4.7, T2.16) stay current. The core principles don't change with new platforms — entity signals, crawlability, freshness, and extractable passages transfer directly. A new LLM from a new provider in Q3 2026 will almost certainly need the same five things the April 2026 roster needs: robots.txt access, a grounding source to search (Brave, Bing, or a proprietary index), Article/Organization schema to parse, visible dates, and passages it can lift without surrounding context.

How often does this checklist need updating?

AI-search behaviour shifts every few months. We re-run the source studies (Indig, RESONEO, AirOps × Indig 2026 State of AI Search, Cloudflare, our own server-log analyses) and refresh items where the underlying data has changed. The page lists the last update date in the changelog at the bottom. If something feels stale or you've spotted a new pattern, message Adrian on LinkedIn — that's how most updates start.

I'm not in SEO. Is this for me?

Yes. The checklist is written for founders, product owners, marketing leads, and SEO specialists alike. Every technical term is introduced before it's used. You don't need prior knowledge of BLUF, entity echoing, schema markup, or fan-out queries — the items explain them in plain English. The checklist is most useful when you can hand individual items to the right person on your team (writer, developer, ops) without translation.

Haide · free tool

GEO Checklist

70 checks for getting cited in ChatGPT, Claude, Perplexity, and Google AI Overviews. Five tiers in order of leverage. Pick your industry to reweight every check by what actually moves the needle in your vertical - drawn from the OppAlerts study of 144 industries. Each check has a diagnostic step and a fix, so a non-SEO can hand items to the right person on the team without translation. Free, no email, progress saves in your browser.

Kevin Indig - analysis of 1.2M ChatGPT answers
AirOps × Indig - 2026 State of AI Search
RESONEO - AIO/AIM Inspector deep-dive (Feb 2026)
OppAlerts - LLM Ranking Factors (Mar 2026)
Cloudflare AI Audit (July 2025)
Haide server-log analyses (2025-2026)

How AI reads your page

Tier 1 · 13 checks

The writing rules behind AI citations. What gets quoted, what gets skipped, and where on the page the quotable lines need to live. Drawn from an analysis of 1.2 million ChatGPT answers in 2026.

Third-party trust signals

Tier 2 · 16 checks

Outside voices count more than your own. Reviews, comparisons, YouTube, Reddit, the Knowledge Graph. Without these, an AI model has no record that you exist beyond your own marketing pages.

Crawlability and machine-readability

Tier 3 · 17 checks

Whether AI crawlers can reach the page and parse what they find. Cloudflare has blocked the major bots by default since July 2025. Schema declares who you are. Speed and rendering decide whether the page is even read.
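One way to check the robots.txt half of crawler access offline is Python's standard `urllib.robotparser`. The sketch below is a hedged example: the robots.txt text is invented, the bot list is a small subset chosen for illustration, and `bot_access` is a hypothetical helper. It says nothing about blocks applied at the CDN or firewall level, which need a live request to detect.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt; replace with the text served at your own /robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# A few user agents named in this checklist; extend to the full roster as needed.
BOTS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "Google-Extended"]

def bot_access(robots_txt, url, bots=BOTS):
    """Map each bot name to True/False: may it fetch `url` under these rules?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}

report = bot_access(ROBOTS_TXT, "https://example.com/pricing")
for bot, allowed in report.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

With the sample rules, only GPTBot is reported as blocked, because the other bots fall through to the wildcard group.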

Measurement and tracking

Tier 4 · 12 checks

Google Analytics hides AI traffic. Search Console has a hidden regex filter that surfaces conversational queries. Server logs see what analytics never will. Build a baseline, watch the trend, make decisions from real numbers.
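For a first baseline from server logs, a short script can count requests per AI bot before a full log-analysis tool is set up. This is a rough sketch: the token list and the sample log lines are made up for illustration, and `count_ai_bot_hits` is a hypothetical helper. Matching on user-agent substrings also trusts the header as sent, so spoofed agents are counted too.

```python
from collections import Counter

# Substrings to look for in the User-Agent field. Assumed for illustration;
# confirm the exact tokens against each vendor's bot documentation.
AI_BOT_TOKENS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "ChatGPT-User"]

def count_ai_bot_hits(log_lines, tokens=AI_BOT_TOKENS):
    """Count access-log lines containing each bot token (case-insensitive)."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in tokens:
            if token.lower() in lowered:
                hits[token] += 1
                break  # attribute each request to one bot only
    return hits

# Made-up sample lines in combined log format.
sample = [
    '203.0.113.5 - - [01/Apr/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '203.0.113.9 - - [01/Apr/2026:10:00:05 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [01/Apr/2026:10:01:00 +0000] "GET /blog HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]
hits = count_ai_bot_hits(sample)
print(dict(hits))
```

Running the same count each month gives the trend line that analytics alone cannot, since most crawlers never execute the analytics script.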

Industry calibration and evidence interpretation

Tier 5 · 12 checks

The dominant signal is different in every industry. Wikidata correlates +0.87 with citations in homeowners insurance and −0.80 in senior home care. Reddit is positive in oil-change chains, negative in pest control. Before applying tiers I–IV uniformly, look up your vertical and reweight. Drawn from the OppAlerts LLM Ranking Factors study — 13 signals × 144 industries × ChatGPT 5.4.
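For readers unfamiliar with the statistic, Spearman correlation is the Pearson correlation computed on ranks, so it ranges from -1 to +1 and its sign tells you whether a signal rises or falls with recommendations. The sketch below is a plain-Python illustration with invented numbers; it is not the study's code or data.

```python
def ranks(values):
    """Average ranks (1-based): tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)  # undefined (division by zero) if either input is constant

# Invented numbers: a signal score per brand and how often each brand is recommended.
signal = [10, 20, 30, 40, 50]
recommendations = [1, 3, 4, 8, 9]
print(spearman(signal, recommendations))        # close to +1: signal rises with recommendations
print(spearman(signal, recommendations[::-1]))  # close to -1: signal falls as recommendations rise
```

A coefficient computed from only a handful of brands is noisy, which is why small samples deserve a sense-check before budget is committed.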

Don't do this

What to skip

Patterns widely promoted as GEO best practice that don't hold up under measurement. Ignore them — and flag anyone selling them as a service.

Don't treat llms.txt as a ranking lever
No measured AI-citation lift yet (April 2026)
Why
llms.txt is a proposed standard — similar to robots.txt in spirit — for stating, in a clean machine-readable form, what a site is and what content matters. It is cheap to publish and harmless. What it is not, today, is a citation mechanism: no major AI platform reads it as a ranking input as of April 2026, Google representatives have said AI systems don't use it, and SE Ranking's analysis of nearly 300,000 domains found no correlation between llms.txt presence and AI citation frequency. So publishing one as a low-cost forward bet on it becoming a standard is reasonable; budgeting real time against it for citations, or buying it as a guaranteed AI-ranking service, is the mistake. Spend the GEO effort on visible-HTML passage structure, entity coverage, and brand mentions instead.
How to check
Check yourdomain.com/llms.txt. Publishing one is fine as a low-cost, forward-looking artifact — just don't expect AI citations from it.
Keep the effort proportional: a maintained llms.txt is minutes of work; it should never displace the visible-HTML, entity, and brand-mention work that actually moves citations.
The red flag is anyone selling llms.txt as a guaranteed AI-ranking lever, or charging a premium for it as the centrepiece of a GEO program.
Source: SE Ranking 300K-domain study + public Google guidance + RESONEO research

Why this checklist exists

The opportunity review Haide runs before any GEO engagement.

Every quarter, three or four data sources move the picture of how AI search actually works: Kevin Indig's analysis of 1.2 million ChatGPT answers, RESONEO's reverse engineering of GPT-5 and Claude, Cloudflare's report on which crawlers reach which sites, and our own server-log work with clients across SaaS and eCommerce. Each source produces a few specific, actionable rules. None of them, on its own, is a checklist.

We assembled the 70 checks because that's what we needed internally — a single document that survives the next platform shift and tells a non-specialist exactly what to do, why, and in what order. Tier 5 (the 10-minute industry calibration) sets weighting, Tier 3 fixes the technical foundation, Tier 1 covers how AI reads the page, Tier 4 makes progress measurable, and Tier 2 carries the slowest but highest-leverage compounding work.

Made public for the same reason Signal Check is public: seeing the checklist applied to your own site is more useful than any pitch deck we could write. Use it on your domain. Use it on a competitor's. Print the PDF and pin it above the desk of whoever owns the website.

Run Signal Check on a single page · See how Haide runs GEO

Sources & methodology

Every check traces back to a study, a patent, or a server log.

Each item in this checklist carries an evidence badge: ChatGPT-specific for findings drawn from ChatGPT data only, Google AI Overviews for AI Overviews and AI Mode work, Multi-LLM for findings tested across multiple systems, and Universal for foundational best practices that apply regardless of which model crawls the page. Below are the studies, datasets, and field-testing programmes the 70 checks are drawn from. Hover any badge on a check to see the title; the source link sits inside each item's expanded view too.

Kevin Indig - Growth Memo 2026 · ChatGPT-specific
Statistical analysis of 1.2 million ChatGPT answers identifying the writing patterns, opening structures, and on-page features that get quoted versus skipped.
Scope: 1.2M ChatGPT answers, 2026
Informs: Tier I openers, headings, lists, citation anchors (T1.1–T1.9, T2.1, T2.2)

OppAlerts - LLM Ranking Factors · ChatGPT-specific
Spearman correlation between 13 web signals (Wikidata, Reddit, Common Crawl, Backlink Authority, etc.) and ChatGPT 5.4 recommendation patterns across 144 industries. Source for the entire industry-calibration layer and the Wikidata/Reddit sign-check rules.
Scope: 13 signals × 144 industries × ChatGPT 5.4, March 2026
Informs: Tier V (T5.1–T5.12) and the per-industry picker that reweights every check

Zyppy / Signal - AI Citation Ranking Factors (Cyrus Shepard) · Multi-LLM
Evidence-weighted meta-analysis of 54 published experiments, patents, and case studies, scoring 23 citation factors by repeatability, study scale, and official support. Ranks URL accessibility, search rank, fan-out, and preview control (nosnippet) as the strongest signals; finds llms.txt has no measurable effect.
Scope: 54 experiments across ChatGPT, Gemini, Perplexity, 2026
Informs: T1.12 (intent-format match), T1.13 (language match), T3.17 (preview control / nosnippet)

RESONEO - AIO/AIM Inspector deep-dive · Google AI Overviews
Reverse-engineering of Google AI Overviews and AI Mode citation behaviour, plus GPT-5.4 site: query analysis. Identifies the specific schema, heading, and freshness signals Google's AI surfaces respond to.
Scope: Google AI Overviews + AI Mode + GPT-5.4, Feb–March 2026
Informs: T1.10, T2.4 (Reddit Posts), T2.7 (Knowledge Graph), T2.13, T3.7, T3.10, T4.1

AirOps × Indig - 2026 State of AI Search · ChatGPT-specific
Cross-platform survey of AI-search citation patterns and traffic conversion rates. Used to calibrate measurement targets and the citation-source breakdown.
Scope: Multi-LLM survey, 2026, but with a ChatGPT-weighted sample
Informs: T1.5 (citation source breakdown), T4.1 (AI traffic baseline)

Cloudflare AI Audit (July 2025) · Multi-LLM
Default-block policy for AI bots across the Cloudflare network, plus the published roster of 13 known AI crawlers (GPTBot, ClaudeBot, Google-Extended, Perplexity-User, Applebot-Extended, etc.).
Scope: 13-bot crawler roster, July 2025 onward
Informs: T3.1 (Cloudflare bot management), T3.2 (robots.txt), T3.11 (full bot roster)

Anthropic / OpenAI / Google AI bot documentation · Multi-LLM
Official docs for ClaudeBot, Claude-User, anthropic-ai, GPTBot, ChatGPT-User, Google-Extended, Perplexity-User, Applebot-Extended, and the Apple/Amazon/ByteDance/Meta bots. Source of truth for which bots to allow versus block per use case.
Scope: Multi-LLM crawler infrastructure, 2024–2026
Informs: T3.3 (allow ClaudeBot for grounded answers), T3.4 (per-bot rules)

Google patent US12013887B2 - Information Gain · Google AI Overviews
Filed mechanism for ranking content based on contribution of new information beyond what's already in the index. Underpins why pages that repeat existing content get suppressed in AI Overviews.
Scope: Google ranking systems
Informs: T1.11 (Information Gain: write what's not already on the web)

Schema.org + Google structured data documentation · Universal
The structured-data vocabulary and Google's rich-results documentation. Schema is recommended on this checklist only for classic Google rich results (FAQ, Article, BreadcrumbList, LocalBusiness): Ahrefs' Aug 2025–Mar 2026 causal study (1,885 pages adding schema vs 4,000 controls) found no measurable effect on AI citations.
Scope: Classic Google rich results, NOT AI visibility
Informs: T3.5 (FAQ schema), T3.6 (Article schema), T3.7 (LocalBusiness)

Ahrefs - Schema correlation study (2025–2026) · Multi-LLM
Causal study comparing 1,885 pages that added JSON-LD vs 4,000 matched controls. Result: ChatGPT +2.2% (null), Google AI Mode +2.4% (null), Google AI Overviews −4.6% (significant decline). Five major AI systems extracted only visible HTML during retrieval. Schema is for classic search rich results, not AI citations.
Scope: 5 AI systems, 5,885 pages, Aug 2025–Mar 2026
Informs: Why this checklist does NOT recommend schema or llms.txt for AI visibility

Haide Digital - server-log analyses & field testing · Multi-LLM
Server-log analyses, multi-persona LLM testing, GSC regex patterns, and topical-authority methodology developed across active client engagements (Acronis, MobiSystems, PhoneArena, Bondex, and others, 2025–2026).
Scope: Applied across active client work, multi-LLM
Informs: T2.11, T2.12, T2.14, T4.2, T4.3, T4.10, T4.11, T4.12 and the methodology threading the full checklist

Considered and rejected

JSON-LD / schema for AI citations — Ahrefs causal study (5,885 pages, 5 AI systems, Aug 2025–Mar 2026) found no measurable lift, with a small significant decline on Google AI Overviews. Schema stays in this checklist only for classic Google rich results.
llms.txt — Not honored by any major AI crawler. SE Ranking 300K-domain study and public Google guidance both confirm zero measurable effect on indexing or citations. Listed as an explicit anti-pattern (A1).
AI-content detectors as quality signals — No causal evidence that LLM citation engines penalise AI-assisted writing per se. The signal that matters is whether the passage is quotable and grounded, not whether a human typed it.

Frequently asked questions

Changelog

What's new in this update

May 2026 (2) - Added 3 checks from the Zyppy/Signal “AI Citation Ranking Factors” study (Cyrus Shepard, an evidence-weighted review of 54 experiments): preview control / nosnippet (Tier III, ranked the #4 factor overall, now flagged critical), intent-format match (Tier I), and language match (Tier I). The same study confirmed our existing position that llms.txt has no measurable effect on AI citations. Total checks: 67 → 70.
May 2026 - Added Tier V (Industry calibration) - 12 checks drawn from the OppAlerts LLM Ranking Factors study (Spearman correlations between 13 web signals and ChatGPT recommendation patterns across 144 industries, March 2026). The dominant signal varies dramatically by vertical - Wikidata correlates +0.87 with citations in homeowners insurance and −0.80 in senior home care; Reddit is positive in oil-change chains, negative in pest control. Tier V adds the sign-check rules and the per-industry top-signal lookup that turn one-size-fits-all GEO advice into per-vertical prioritisation. Total checks: 55 → 67.
April 2026 - Expanded from 37 to 55 checks. Every item now separates How to check (diagnostic) from How to fix (implementation). Added 18 new items drawn from the RESONEO AIO/AIM Inspector deep-dive (Feb 2026), the AirOps × Kevin Indig 2026 State of AI Search report, the full April 2026 AI crawler roster (13 bots, including Anthropic's three-bot framework), and GPT-5.4 site-query behaviours from the March 11 2026 model update. Added an explicit anti-pattern: don't create llms.txt.
Coverage added - Extractable passages (T1.10), Google's Information Gain patent (T1.11), entity consistency across platforms (T2.11), competitor co-occurrence (T2.12), visible publication dates (T2.13), subreddit mapping (T2.14), GPT-5.4 site-query coverage (T2.15), Brave Search indexing (T2.16), the full 13-bot crawler roster (T3.11), crawl-delay review (T3.12), sitemap + Bing submission (T3.13), Person schema identity graph (T3.14), entity-rich anchors (T3.15), AI agent readiness (T3.16), monthly citation matrix (T4.9), hidden grounding URL detection (T4.10), referrer regex refresh (T4.11), brand mention velocity (T4.12).

Keep going

You have the checklist. Now ship one tier.

Tier 3 first — fix the technical foundation. Then run Signal Check on the three pages that matter most. Then read how Haide runs GEO end-to-end.

Need an engineered growth plan?

Tech stack review, opportunity report, twelve-month roadmap. No lock-in, full knowledge transfer.

Book a discovery call

Reply within one business day.

GEO Checklist

Open with a 40–60 word answer in the first 150 words

Lead with the answer, then the context (BLUF)

Spread the facts: 44 / 31 / 25 across the page

Phrase H2 headings as questions and echo the subject in the answer

Mention 15–20% specific names (entity density)

Pack the middle of every paragraph with information

Write in the voice of an analyst (subjectivity around 0.47)

Aim for a Flesch-Kincaid reading level around 16

End with facts, not opinions or CTAs

Structure content as extractable 2–3 sentence passages

Add genuine information gain — don't duplicate your own angles

Match the page format to the query's intent

Publish in the language of the query

Build a comparison page for every named competitor (X vs Y)

Show real prices on the page (no "Contact Sales")

Be present on at least three review platforms

Have a real, organic presence on Reddit

Publish at least 5–10 product videos on YouTube

Get a Wikipedia entry — or sourced mentions in related ones

Get into Google's Knowledge Graph (an MID identifier)

Earn citations from authoritative third-party publications

Use one consistent spelling of your brand name everywhere

Maintain a real founder LinkedIn profile

Align brand entity data across every platform

Build competitor co-occurrence in third-party content

Show visible publication + lastModified dates on every piece of content

Map Reddit activity to specific target subreddits

Cover GPT-5.4 site: queries — pricing, about, comparison, case studies, FAQ

Get indexed in Brave Search (the Claude proxy)

Check your Cloudflare AI bot settings (this is the trap)

Allow the major AI bots in robots.txt

Explicitly allow OAI-SearchBot and the other live-fetch bots

Get First Contentful Paint under 1 second

Render content on the server (or as static HTML)

Add Organization schema on your homepage

Add Product or Service schema on the relevant pages

Add FAQPage schema on real Q&A sections

Add Article and Author schema to every blog post

Update content regularly and signal it (lastModified)

Allow all 13 major AI crawlers (April 2026 roster)

Remove aggressive Crawl-delay directives for AI bots

Maintain an XML sitemap with cornerstone URLs + lastmod

Add Person schema with sameAs identity graph to every author page

Use entity-rich anchor text on internal links

Make the site AI-agent ready (ARIA labels, action microdata)

Don't suppress snippets (nosnippet, max-snippet:0, data-nosnippet)

Create an AI Search channel in GA4 (custom regex)

Run server-log analysis with GoAccess (or similar)

Use a regex filter in Google Search Console to surface AI queries

Test parametric visibility — 5 prompts with web search OFF

Test dynamic visibility — same 5 prompts with web search ON

Use the RESONEO Chrome extension to monitor query fan-out

Track your ranking on Brave Search (it's the proxy for Claude)

Track citations manually each quarter (10 queries, 4 platforms)

Run a structured citation matrix: 10 prompts × 4 platforms, monthly

Detect hidden grounding URLs with the AIO/AIM Inspector

Update the AI referrer regex quarterly

Track brand mention velocity as a trend line

Look up your industry's top signal before applying generic GEO advice

Check the Wikidata sign before investing in entity work

Check the Reddit sign before seeding subreddit presence

Identify your industry's topical-authority outlier and study them

Test the same query under 5 buyer-persona framings — citations shift by persona

Sense-check correlation strength — don't bet retainer budget on n < 15

Review your SE Outbound Links footprint — SERP features, knowledge panels, related searches

Track Best Search Engine Rank for money keywords, not average rank

Verify your domain is indexed in Common Crawl

Combine the OppAlerts external benchmark with your internal citation matrix

Cross-validate OppAlerts findings against Claude, Perplexity, and Gemini

Don't double-count Backlink Count, BL Authority, and PageRank as separate workstreams
