Haide Digital · 2026

Free tool · Print edition

GEO Checklist

37 checks for AI search visibility, in 4 tiers.

Tier 3 first — fix the technical foundation. Tier 1 next — write so AI can extract a quote. Tier 4 after that — make sure you can measure. Tier 2 last — the slowest tier, and the one with the longest payoff. Use the checkboxes as you ship; the printable mirrors your saved progress.

Generated

18 April 2026


Sources

Kevin Indig — analysis of 1.2M ChatGPT answers (Growth Memo, 2026)
RESONEO — GPT-5 and Claude reverse engineering (think.resoneo.com)
Cloudflare AI Audit (July 2025)
Adrian Nikolov — server-log analyses, Haide Digital (2025–2026)
I

How AI reads your page

Tier 1 · 9 checks

The writing rules behind AI citations. What gets quoted, what gets skipped, and where on the page the quotable lines need to live. Drawn from an analysis of 1.2 million ChatGPT answers in 2026.

T1.1

Open with a 40–60 word answer in the first 150 words

Strongest single predictor of being cited

Why

An AI model reads roughly the first third of a page before deciding whether to quote it. If the opening paragraph is a clear, factual answer to the page's main question — what we call the answer capsule — the model lifts it almost verbatim. If the opening is a brand intro or a vague hook, the model moves to the next site.

How

  1. Make the first paragraph a direct answer to the question the page is meant to address.
  2. Keep it between 40 and 60 words. Count them.
  3. No links inside this paragraph. AI models read inline links as the author hedging.
  4. No marketing adjectives — perfect, leading, world-class. Stick to facts and specifics.
  5. Template: "[Subject] is [definition]. [Key data point or specification]."

Source: Kevin Indig — Growth Memo 2026

T1.2

Lead with the answer, then the context (BLUF)

44.2% of citations come from the first third of the page

Why

BLUF stands for Bottom Line Up Front. AI models were trained on journalism and academic papers, where the most important fact sits at the top. The model scans for that frame quickly and reads the rest through it. Burying the answer under three paragraphs of setup means the model leaves before it gets there.

How

  1. Make the first sentence answer the question. Don't introduce a topic.
  2. Make the second sentence add a number, a date, or a specific.
  3. No narrative intros — "In today's fast-paced digital world…" is dead weight.
  4. Review your top 10 pages. Count how many open with a factual answer versus a context-setting paragraph.

Source: Kevin Indig — Growth Memo 2026

T1.3

Spread the facts: 44 / 31 / 25 across the page

After 90% of the page, citations drop to almost zero

Why

Citations aren't pulled evenly from across the page. 44% come from the first third, 31% from the middle, 25% from the final third — but only the part above your footer. After the 90% mark, almost nothing. Plan the page like a newspaper article: front-load, but keep facts in the closing third too.

How

  1. Don't save the best facts for the conclusion. Distribute them across the page in roughly that ratio.
  2. Make the conclusion quotable: end with facts, not "in summary, you should think about…"
  3. If you write a TL;DR, place it at the end of the first third — not after the conclusion.
  4. Footer content (CTAs, related articles, newsletter signups) is essentially invisible to AI. Don't put quotable facts there.

Source: Kevin Indig — Growth Memo 2026

T1.4

Phrase H2 headings as questions and echo the subject in the answer

78.4% of question-style citations come from headings

Why

AI treats an H2 like a user prompt and the paragraph below it like the answer. The first word of that paragraph should repeat the subject from the heading — the technique is called entity echoing. It tells the model the heading and the paragraph belong to the same question, which doubles the chance of a citation.

How

  1. Convert descriptive H2s into direct questions where the page genuinely answers them.
  2. Bad: H2 "The history of SEO" → P "It started in the nineties…"
  3. Good: H2 "When did SEO start?" → P "SEO started in…"
  4. The first word of the answer paragraph should be the subject from the question.
  5. Don't force this everywhere — only where there's a real Q-and-A pattern on the page.

Source: Kevin Indig — Growth Memo 2026

T1.5

Mention 15–20% specific names (entity density)

20.6% in cited text vs 5–8% in average copy

Why

Cited passages contain about three times more specific names — brands, tools, standards, versions, people — than average web copy. AI models are wary of hallucinating, so they treat concrete names as anchors that can be verified. Generic copy gives them nothing to anchor to.

How

  1. Name specific brands, tools, standards, protocols, versions. Don't be precious about it.
  2. Don't be afraid to namedrop competitors — it raises your own trustworthiness with the model, even when the names belong to someone else.
  3. Bad: "Top tools help businesses scale."
  4. Good: "Top tools include Salesforce, HubSpot, and Pipedrive."
  5. Take a paragraph and count its proper nouns. Below 5%? Add anchors.
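The proper-noun count from step 5 can be scripted. This is a rough sketch, not the study's method: it approximates entity density as the share of capitalized words that don't open a sentence, which will over- and under-count in edge cases (a real check would use an NER library such as spaCy).

```python
import re

def entity_density(text: str) -> float:
    """Rough proxy for entity density: the share of words that are
    capitalized mid-sentence (likely proper nouns). Heuristic only."""
    words = re.findall(r"[A-Za-z][\w'-]*", text)
    if not words:
        return 0.0
    # Words that start a sentence are capitalized anyway; exclude them.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    starters = {re.match(r"[A-Za-z][\w'-]*", s).group(0)
                for s in sentences if re.match(r"[A-Za-z]", s)}
    proper = [w for w in words if w[0].isupper() and w not in starters]
    return len(proper) / len(words)
```

Run it on a paragraph: the "Good" example above scores around 0.4, the "Bad" one scores 0.0 — well below the 15–20% target.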

Source: Kevin Indig — Growth Memo 2026

T1.6

Pack the middle of every paragraph with information

53% of citations come from the middle of a paragraph

Why

AI doesn't only read the first sentence of each paragraph. It looks for the sentence with the highest information density — the one packed with names, numbers, and specifics. Information gain is the term for it. Slow build-ups, suspense, and gradual reveals get read as a lack of confidence.

How

  1. Don't force the answer into the first sentence of every paragraph.
  2. Every sentence has to add something new. If a sentence can be removed without changing the meaning, it's filler — cut it.
  3. The middle sentence of a paragraph should be the one with the most names and numbers.
  4. Test: delete each sentence one by one. If the meaning survives, the sentence didn't earn its place.

Source: Kevin Indig — Growth Memo 2026

T1.7

Write in the voice of an analyst (subjectivity around 0.47)

Halfway between Wikipedia (0.1) and marketing hype (0.9)

Why

There's a 0-to-1 measure of how subjective a piece of writing is. Cited passages sit at 0.47 — almost exactly halfway. Wikipedia (0.1) is too dry to interpret. Marketing copy (0.9) is too biased to trust. The sweet spot is fact plus interpretation, the way a financial analyst writes: here's what's true, and here's what it means.

How

  1. Template: [verifiable fact]. [analytical implication].
  2. Example: "iPhone 15 has the A16 chip (fact), which makes it well suited to creators shooting in low light (analysis)."
  3. Don't sound like Wikipedia — too dry to be quoted as interpretation.
  4. Don't sound like product copy — "industry-leading", "cutting-edge" gets filtered out.

Source: Kevin Indig — Growth Memo 2026

T1.8

Aim for a Flesch-Kincaid reading level around 16

16 gets cited. 19 gets skipped.

Why

Flesch-Kincaid is a readability score — roughly the school grade needed to read the text. Cited content reads like The Economist (around 14–16), not an academic paper (18–22). This isn't about dumbing things down. It's about shorter sentences and simpler words wherever those don't sacrifice precision.

How

  1. Paste your top page into hemingwayapp.com or any Flesch-Kincaid checker.
  2. Above 18 — start cutting sentence length.
  3. Average sentence under 20 words. Ideal range: 12–18.
  4. Use complex words only when a simpler one loses meaning.
  5. Reference scores: The Economist ≈ 13–15. Harvard Business Review ≈ 14–16. Academic paper ≈ 18–22.
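You can also compute the grade directly. The formula below is the standard Flesch-Kincaid grade level; the syllable counter is a crude vowel-group heuristic, so treat the output as approximate. Hemingway or a dedicated library will differ by a point or two.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count vowel groups, drop a common silent final 'e'.
    word = word.lower()
    groups = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and groups > 1:
        groups -= 1
    return max(groups, 1)

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z][\w'-]*", text)
    if not sentences or not words:
        return 0.0
    total_syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * total_syllables / len(words) - 15.59)
```

Short sentences with short words score low; long, polysyllabic sentences push the grade past 18 fast.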

Source: Kevin Indig — Growth Memo 2026

T1.9

End with facts, not opinions or CTAs

24.7% of citations come from the last third

Why

The closing paragraph isn't a footer. It carries roughly a quarter of the page's citation potential. But most conclusions get used up on a CTA or a soft summary, which gives the AI nothing to quote. Keep the closing paragraph quotable — facts and synthesis — and put the CTA in its own block underneath.

How

  1. End the page with a synthesis of the key facts from the body.
  2. Avoid CTAs, soft summaries, or generic "you decide" statements in the closing paragraph.
  3. Template: "In summary, [key fact 1], [key fact 2], and [data point]."
  4. Place the CTA in a separate block below the conclusion, not inside it.

Source: Kevin Indig — Growth Memo 2026

II

Third-party trust signals

Tier 2 · 10 checks

Outside voices count more than your own. Reviews, comparisons, YouTube, Reddit, the Knowledge Graph. Without these, an AI model has no record that you exist beyond your own marketing pages.

T2.1

Build a comparison page for every named competitor (X vs Y)

GPT-5.4 thinking mode literally runs site:yourdomain.com [competitor]

Why

When ChatGPT is in thinking mode and a buyer asks about you, it runs a search like "site:yourdomain.com [competitor name]". If you don't have a page for that competitor, it moves on and quotes whoever does. One generic "alternatives" page won't catch this — you need a dedicated page per competitor.

How

  1. Type "[your product] vs" into Google and look at autocomplete.
  2. The first 5–10 suggestions are your real competitors based on actual searches.
  3. Create a separate page for each. Not a "Top alternatives" page — one page per matchup.
  4. URL pattern: /[product]-vs-[competitor]
  5. The page itself follows the Tier 1 rules: answer capsule, BLUF, comparison table.

Source: RESONEO + Adrian Nikolov field testing

T2.2

Show real prices on the page (no "Contact Sales")

35× higher chance of pricing-page citation

Why

GPT-5.4 thinking mode runs a site: query for pricing pages on most commercial questions. If your pricing page says "Contact Sales" or "Get a Quote", it gets skipped — the model quotes a competitor who shows numbers. Exact prices aren't required. Even ranges work.

How

  1. Exact pricing isn't required. Ranges work: "Starting at €99/month."
  2. Tiered pricing with brief feature lists works extremely well.
  3. For an enterprise tier with custom pricing — show it as the top tier with "Custom" and ranges for the others.
  4. Marketing trade-off: you give up some lead capture, you gain AI visibility on commercial searches.

Source: RESONEO — GPT-5.4 site: query analysis

T2.3

Be present on at least three review platforms

3× citation multiplier on purchase-intent queries

Why

When someone asks an AI for a recommendation, it favours third parties over your own site. Your About page is suspect by definition. A G2 or Trustpilot review from a real user is credible. Spread is more important than depth — five reviews on three platforms beats fifty on one.

How

  1. Aim for at least three platforms. Not one with 500 reviews.
  2. 5 on G2 + 5 on Capterra + 10 on Trustpilot beats 50 on a single site.
  3. Ask happy existing customers. A simple email with a direct link is enough.
  4. Reply to reviews — both positive and negative. Replies count as fresh signal.

Source: Indig + RESONEO citation source analysis

T2.4

Have a real, organic presence on Reddit

4× citation multiplier — the highest of any platform

Why

Reddit mentions carry the highest citation multiplier in the public data. AI models read Reddit threads as peer recommendations: someone asked a question, the community ranked the answers via upvotes, the model trusts the result. Bought mentions don't work — the spam patterns are easy for models to spot and discount.

How

  1. Don't buy mentions. Models will get more aggressive about spam patterns over time.
  2. Find 3–5 subreddits relevant to your space.
  3. Real participation only: customer support, sharing expertise, answering questions.
  4. Models read the threading and the upvote counts — top-voted comments in r/SaaS carry weight.

Source: Kevin Indig — citation source breakdown

T2.5

Publish at least 5–10 product videos on YouTube

#1 cited domain in Google's AI Overviews

Why

YouTube has overtaken Wikipedia and Reddit as the most-cited domain in AI Overviews — up 34% in six months. Google AI reads video transcripts, descriptions and titles. If you have expertise but it isn't on video, AI doesn't know about it. The production quality matters less than the clarity of the spoken words.

How

  1. Aim for at least 5–10 videos covering your product or service.
  2. Title each video with a specific question (entity echoing works on YouTube too).
  3. Put a full transcript or a detailed summary with proper names in the description.
  4. Add chapters and timestamps — AI Overviews quotes them directly.
  5. High production isn't required — auto-generated captions work as long as the audio is clear.

Source: RESONEO AI Overviews citation analysis 2026

T2.6

Get a Wikipedia entry — or sourced mentions in related ones

Parametric visibility: models learn from Wikipedia

Why

Models like ChatGPT and Claude were trained heavily on Wikipedia, news archives and academic papers. That training builds their parametric layer — what they "remember" without searching the web. If your brand isn't in Wikipedia or related sources, you exist only when the model searches in real time, not when it draws on memory.

How

  1. If you genuinely qualify for an entry under Wikipedia's notability rules, create one — by the rules.
  2. If you don't qualify, target sourced mentions in articles about your topic.
  3. A Wikidata entry is easier to start with — its Q-identifier feeds the Knowledge Graph.
  4. Don't write your own article — Wikipedia has strict conflict-of-interest rules. Earn journalist coverage that leads to natural mentions instead.

Source: Anthropic + OpenAI training data composition

T2.7

Get into Google's Knowledge Graph (an MID identifier)

Critical for Google AI Mode

Why

Google AI Mode draws from the real Knowledge Graph, where every entity has a Machine Identifier (MID). Without an MID, you don't exist in those answers. AI Overviews is more forgiving, but AI Mode isn't. The path in is consistent name and address signals across the web, plus a Wikidata entry that ties them together.

How

  1. Add Organization schema on your homepage (see Tier 3).
  2. Use a consistent name, address and phone number everywhere — Google Business Profile, social profiles, footer.
  3. Create a Wikidata entry with sameAs links to all your canonical profiles.
  4. Encourage branded searches in Google ("Haide Digital") — the volume signal helps trigger a Knowledge Panel.
  5. Aim for the panel: branded image search results, the brand name on social profiles, structured data that lines up.

Source: Google Search Liaison + RESONEO AI Mode analysis

T2.8

Earn citations from authoritative third-party publications

3× the weight of content on your own domain

Why

Industry publications, analyst reports, and professional association mentions carry disproportionate weight for authority. One Reuters article counts more than fifty blog posts on your own site. The leverage compounds because models index these sources permanently and re-encounter them across queries.

How

  1. Digital PR — real newsworthy stories, not press releases.
  2. 5–10 targeted placements a year is a realistic baseline for a small business.
  3. HARO, Qwoted, SourceBottle for expert-quote opportunities.
  4. Guest posts only on authority-rich domains, never content farms.
  5. Inclusion in analyst reports (Forrester, Gartner) — bigger lift, bigger payoff.

Source: Indig third-party authority study

T2.9

Use one consistent spelling of your brand name everywhere

Inconsistent naming fragments entity recognition

Why

When AI sees "Haide Digital", "haide.digital", "Haide", and "Хайде Диджитал", it treats them as different entities — or as one entity with low confidence. Pick one canonical form and use it everywhere. Confidence directly affects whether the model risks naming you in an answer.

How

  1. Pick ONE canonical spelling. Document it.
  2. Sweep every surface: Trustpilot, G2, LinkedIn, Twitter, podcast bios, conference materials, footer.
  3. For multilingual sites — decide which script is primary, which is secondary.
  4. Quarterly review — every new mention is a chance for a new variant to slip in.

Source: Adrian Nikolov — entity-consistency observations

T2.10

Maintain a real founder LinkedIn profile

E-E-A-T signal — who's behind the company

Why

AI models check who writes the content. An active founder LinkedIn with genuine expertise becomes a sourced E-E-A-T signal — Google's framework for Experience, Expertise, Authoritativeness, and Trust — that strengthens every claim made on the company domain. Empty or auto-posted profiles don't help.

How

  1. 500+ connections, real photo, headline that positions the expertise.
  2. Posts 2–3 times a week with substantive opinion (not engagement bait).
  3. About section with a specific expert background — years, companies, results.
  4. sameAs schema from your website to the LinkedIn profile (Person schema on the /about page).

Source: Google E-E-A-T documentation + author entity research

III

Crawlability and machine-readability

Tier 3 · 10 checks

Whether AI crawlers can reach the page and parse what they find. Cloudflare blocks the major bots by default since July 2025. Schema declares who you are. Speed and rendering decide whether the page is even read.

T3.1 · Critical

Check your Cloudflare AI bot settings (this is the trap)

Blocked by default since July 2025

Why

Cloudflare turned on AI-crawler blocking by default for all new domains in July 2025. About a fifth of the web sits behind Cloudflare. Thousands of sites are silently invisible to AI crawlers and don't realise it. This is the single most common reason a site has no AI visibility despite good content.

How

  1. Cloudflare Dashboard → Security → Bots → "AI Scrapers and Crawlers".
  2. If it says Block — make a deliberate choice: pure visibility (allow all) or selective control.
  3. For eCommerce and SaaS with a public catalog: allow all.
  4. For paid/membership/proprietary content: allow OAI-SearchBot, Claude-User and Perplexity-User for discovery; block GPTBot and ClaudeBot, which crawl for training.
  5. After the change — check server logs over the next 7 days for real crawler traffic.

Source: Cloudflare AI Audit launch — July 2025

T3.2

Allow the major AI bots in robots.txt

A blocked bot means zero visibility on its platform

Why

Perfect content is wasted if the crawler gets a 403 response. The list of AI user agents is long and growing, but there's a stable core. robots.txt is the first thing every crawler checks — get this wrong once and you're invisible until you fix it.

How

  1. Allow in robots.txt: GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, Claude-User, Claude-SearchBot, PerplexityBot, Perplexity-User, Google-Extended, Applebot-Extended, CCBot, Amazonbot, Bytespider.
  2. Default policy: User-agent: * | Allow: /
  3. Specific overrides only for genuine security needs — admin paths, API endpoints.
  4. Test it: curl -A "GPTBot" https://yourdomain.com/robots.txt — should return 200.
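A minimal robots.txt sketch following the steps above. The Disallow path is an illustrative placeholder, and the per-bot group lists only a few crawlers; extend it with the rest of the list as needed.

```
# Default policy: open to everyone, except genuinely private paths
# (/admin/ is an example, not a recommendation)
User-agent: *
Disallow: /admin/

# Explicitly allow the core AI crawlers site-wide
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /
```

Multiple User-agent lines above one rule set form a single group under the Robots Exclusion Protocol. After deploying, verify with the curl check from step 4.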

Source: OpenAI / Anthropic / Google AI bot documentation

T3.3

Always allow OAI-SearchBot

Different from GPTBot — this is live ChatGPT search

Why

GPTBot crawls for OpenAI's training. OAI-SearchBot is the live crawler that fetches pages when ChatGPT searches the web in real time. They're different bots. If you block GPTBot to opt out of training, you absolutely have to allow OAI-SearchBot, or you disappear from ChatGPT search entirely.

How

  1. Add explicitly to robots.txt: User-agent: OAI-SearchBot | Allow: /
  2. Same for: ChatGPT-User (when a user asks via ChatGPT), Perplexity-User, Claude-User.
  3. In Cloudflare bot management — if you're running selective rules, these belong in the "allow" group.

Source: OpenAI robots policy — chatgpt.com/bot

T3.4

Get First Contentful Paint under 1 second

3.2× more citations for fast sites

Why

First Contentful Paint (FCP) is the time it takes for the first piece of content to appear after a page request. Sites with FCP under 0.4s average 6.7 AI citations a month. Sites above 1.13s average 2.1. Speed humans don't notice (between 0.4 and 1.1 seconds) is exactly what AI crawlers measure.

How

  1. PageSpeed Insights → check FCP for mobile and desktop separately.
  2. Above 1s = a problem. Above 1.5s = critical.
  3. Quick wins: image optimisation (WebP/AVIF), inline critical CSS, preload fonts, edge caching.
  4. Server response under 200ms — TTFB (Time to First Byte) is the foundation for FCP.
  5. Re-test monthly. Performance regressions creep in quietly.

Source: Indig speed-citation correlation study

T3.5

Render content on the server (or as static HTML)

Most AI crawlers don't run JavaScript

Why

If your page is built entirely in the browser using JavaScript (a single-page app), AI crawlers see an empty shell. GPTBot, ClaudeBot and most others fetch the raw HTML and parse it — they don't run JavaScript. Server-side rendering (SSR) or static-site generation (SSG) puts the content directly into the HTML the bot sees.

How

  1. Test: View Source (not DevTools Inspector) — confirm the content is in the HTML.
  2. If the HTML is just an empty div with JS, you need to move to SSR or SSG.
  3. Next.js App Router with React Server Components, Astro, SvelteKit — all do SSR/SSG by default.
  4. If a React SPA is locked in, at least pre-render the critical routes.
  5. Hydration isn't the issue — what matters is that the initial HTML payload contains the content.
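The View Source test in step 1 can be scripted. This is a sketch using a naive regex strip, not a real HTML parser: it flags pages whose raw HTML carries almost no visible text, which is roughly what a non-JS crawler sees. The 200-character threshold is an arbitrary assumption.

```python
import re

def visible_text(html: str) -> str:
    """Strip <script>/<style> blocks and tags, leaving the text an
    HTML-only crawler would see. Crude regex parse, for spot checks only."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    html = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", html).strip()

def looks_like_spa_shell(html: str, min_chars: int = 200) -> bool:
    # If almost no text survives tag stripping, the page is probably
    # rendered client-side and invisible to non-JS crawlers.
    return len(visible_text(html)) < min_chars
```

Feed it the output of `curl -s https://yourdomain.com/` to test a live page as a crawler would fetch it.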

Source: OpenAI GPTBot specs + Common Crawl behaviour analysis

T3.6

Add Organization schema on your homepage

Foundational entity declaration

Why

Schema markup is structured JSON in the page's <head> that tells search engines and AI exactly what kind of entity you are. Without Organization schema, you leave it to the AI to guess. With it, you declare the answer in a machine-readable format the model trusts more than prose.

How

  1. JSON-LD inside the <head> of your homepage.
  2. Minimum fields: name, url, logo, sameAs (all social profiles), foundingDate, founder (Person schema), contactPoint.
  3. sameAs links to LinkedIn, Twitter, GitHub, Wikidata — this is what stitches you into the Knowledge Graph.
  4. Validate with Schema Markup Validator (validator.schema.org) and Google's Rich Results Test.
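A minimal sketch of the markup described above. Every value is an illustrative placeholder (names, URLs, dates, the Wikidata Q-id); swap in your own, then run the result through validator.schema.org.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://www.example.com",
  "logo": "https://www.example.com/logo.png",
  "foundingDate": "2019",
  "founder": { "@type": "Person", "name": "Jane Doe" },
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "customer support",
    "email": "hello@example.com"
  },
  "sameAs": [
    "https://www.linkedin.com/company/example-co",
    "https://twitter.com/example",
    "https://www.wikidata.org/wiki/Q0000000"
  ]
}
</script>
```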

Source: Schema.org + Google structured data documentation

T3.7

Add Product or Service schema on the relevant pages

73% higher chance of being selected by AI

Why

Content with the right schema markup is 73% more likely to be picked up in AI answers. Product pages without Product schema lose to competitors who add it — even when the competitor's content is weaker. The cost of adding schema is small; the cost of leaving it out is invisible until you measure it.

How

  1. Product schema on product pages with offers (price, currency, availability), aggregateRating, brand.
  2. Service schema on service pages with serviceType, provider, areaServed.
  3. Link Service schema back to your Organization entity.
  4. Don't fake ratings — Google and the models detect it and downrank.

Source: Indig schema correlation analysis

T3.8

Add FAQPage schema on real Q&A sections

Direct Q&A consumption by AI

Why

FAQPage schema delivers question-and-answer pairs in a format AI consumes directly. Particularly strong for Google AI Overviews and Perplexity. The catch: only use it on pages with real FAQ sections. Stuffing FAQ schema onto pages that don't have visible Q&A triggers manual penalties.

How

  1. FAQPage schema ONLY on pages with real, visible Q&A sections.
  2. 5–10 Q&A pairs per page is the sweet spot.
  3. Questions should be ones real users ask (check Google's "People Also Ask" for your topic).
  4. Answers follow the answer-capsule rules from Tier 1 — 40–60 words, direct, no inline links.
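A minimal FAQPage sketch with a single Q&A pair; extend mainEntity to the 5–10 pairs from step 2. The question and answer text are illustrative examples, and the answer text must match Q&A that is actually visible on the page.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "When did SEO start?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "SEO started in the mid-1990s, when site owners began optimising pages for early crawlers such as AltaVista and, later, Google."
      }
    }
  ]
}
</script>
```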

Source: Schema.org + Google AI Overviews behaviour

T3.9

Add Article and Author schema to every blog post

E-E-A-T signal at the content level

Why

AI models want to know who's writing. Article schema with a linked Person schema gives an explicit author entity, which the model cross-checks against third-party signals (LinkedIn, Wikipedia, conference speaker lists). An anonymous post with no author entity is treated with much lower confidence than a sourced one.

How

  1. Article (or BlogPosting) schema on every blog post.
  2. author = Person schema with a sameAs link to the author's LinkedIn profile.
  3. Both datePublished and dateModified — both matter (freshness + originality).
  4. publisher = a reference to your Organization schema.
  5. wordCount, headline, image — not required, but they add completeness.

Source: Schema.org Article spec + Google E-E-A-T guidelines

T3.10

Update content regularly and signal it (lastModified)

1.9× more likely to be cited if updated within 60 days

Why

Perplexity in particular is freshness-obsessed, but every system has a recency bias. A guide written in 2024 loses to a competitor who refreshed theirs in 2026. The freshness signal has to be real — models compare versions and detect fake date updates where nothing actually changed.

How

  1. Make real updates. Models detect when only the date changed.
  2. Quarterly review of your top 20 pages: data points, screenshots, statistics.
  3. Sitemap.xml lastmod has to reflect real updates.
  4. Set both datePublished and dateModified in Article schema — both fields matter.
  5. For evergreen content — add "Updated [month] [year]" inside the first 150 words.

Source: RESONEO Perplexity freshness analysis

IV

Measurement and tracking

Tier 4 · 8 checks

Google Analytics hides AI traffic. Search Console has a hidden regex filter that surfaces conversational queries. Server logs see what analytics never will. Build a baseline, watch the trend, make decisions from real numbers.

T4.1

Create an AI Search channel in GA4 (custom regex)

By default, AI traffic is invisible inside the Referral channel

Why

GA4 dumps all AI traffic into the generic Referral channel. You can't see a per-platform breakdown, you can't track growth, you can't measure ROI. The data shows AI traffic driving 12.1% of signups while making up only 0.5% of visit volume — without a custom channel, you'll never know that.

How

  1. GA4 → Admin → Channel Groups → new custom channel.
  2. Name: "AI Search"
  3. Conditions: Source matches regex
  4. Regex: chat\.openai\.com|chatgpt\.com|perplexity\.ai|claude\.ai|gemini\.google\.com|copilot\.microsoft\.com|bing\.com/chat|you\.com
  5. Apply to all reporting. After 24 hours you have a baseline.
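The regex from step 4, sanity-checked in Python before pasting it into GA4. GA4's "matches regex" condition is an unanchored RE2 search, and plain alternation like this behaves the same in both engines. One caveat: a bare pattern like you\.com also matches any longer hostname that happens to contain those characters.

```python
import re

# The AI Search source regex from step 4, verbatim.
AI_SOURCES = re.compile(
    r"chat\.openai\.com|chatgpt\.com|perplexity\.ai|claude\.ai"
    r"|gemini\.google\.com|copilot\.microsoft\.com|bing\.com/chat|you\.com"
)

def is_ai_referral(source: str) -> bool:
    # Mirrors GA4's unanchored "matches regex" condition.
    return AI_SOURCES.search(source) is not None
```

Run it against a sample of your recent referral sources to preview what the new channel will capture.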

Source: Ahrefs AI traffic conversion data + GA4 documentation

T4.2

Run server-log analysis with GoAccess (or similar)

GA4 can't see most AI bots

Why

GA4 uses JavaScript for tracking. Most AI bots don't execute JavaScript — and the ones that do, GA4 filters out as bots. The result: invisible. Server logs see every request, including the ones GA4 misses. GoAccess turns the raw log file into a readable report.

How

  1. Install GoAccess (free, open source): apt install goaccess.
  2. Pipe access.log: goaccess access.log -o report.html --log-format=COMBINED
  3. Or a quick CLI check: grep -E "GPTBot|ClaudeBot|PerplexityBot|OAI-SearchBot|Google-Extended" access.log | wc -l
  4. Weekly report — which AI bots arrive, how many pages they fetch, which URLs they hit most.
  5. For larger setups — Vector + Loki + Grafana, but GoAccess is enough for most sites.
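The grep one-liner in step 3 gives a single total; a few lines of Python break it out per bot. A sketch assuming combined-format access logs, using a naive substring match on each raw line.

```python
from collections import Counter

AI_BOTS = ("GPTBot", "OAI-SearchBot", "ClaudeBot",
           "PerplexityBot", "Google-Extended")

def ai_bot_hits(log_lines):
    """Count access-log lines per AI crawler. Naive substring match --
    good enough for a weekly spot check, not a GoAccess replacement."""
    counts = Counter()
    for line in log_lines:
        for bot in AI_BOTS:
            if bot in line:
                counts[bot] += 1
                break
    return counts
```

Usage: `with open("access.log") as f: print(ai_bot_hits(f))` — file objects iterate line by line, so large logs stream without loading into memory.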

Source: Adrian Nikolov server-log analysis methodology

T4.3

Use a regex filter in Google Search Console to surface AI queries

A hidden Search Console feature

Why

Search queries with 10+ words are almost certainly redirected from an AI chat — humans don't type sentences that long into Google. Search Console records them, but without a regex filter they're buried in the overall data. The filter takes 30 seconds to set up and gives you a permanent gold mine of intent data.

How

  1. Search Console → Performance → Search results.
  2. Click the Query filter → Custom (regex) → Matches regex.
  3. Regex: ^(?:\S+\s+){9,}\S+$
  4. This shows only queries with 10+ words.
  5. These queries reveal what AI systems interpret as user intent — pure gold for content gaps.
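The filter regex from step 3, verified in Python: nine or more repetitions of word-plus-whitespace followed by a final word means ten words minimum. Search Console uses RE2, and this pattern behaves identically in both engines.

```python
import re

# The Search Console filter from step 3: queries of 10+ words.
LONG_QUERY = re.compile(r"^(?:\S+\s+){9,}\S+$")

def is_conversational(query: str) -> bool:
    # match() anchors at the start; the trailing $ anchors the end.
    return LONG_QUERY.match(query) is not None
```

Useful for pre-filtering a GSC query export before digging into the long-tail intents by hand.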

Source: Adrian Nikolov GSC analysis pattern

T4.4

Test parametric visibility — 5 prompts with web search OFF

What the model knows from training data

Why

An AI model has two sources of knowledge: what it learned during training (parametric) and what it finds when it searches the web (dynamic). Parametric knowledge is frozen between training cycles. If your brand isn't in the parametric layer, you only show up when the model happens to search — which is uncertain and fragile.

How

  1. Use the Visibility Kit at haide.digital/tools/visibility-kit (5 ready-made prompts).
  2. Run them in: ChatGPT, Claude, Gemini, Perplexity, Copilot.
  3. Important: Web search OFF for the parametric test.
  4. Prompts: Awareness ("What is [brand]?"), Perception ("How is [brand] perceived?"), Competition ("[brand] vs alternatives"), Authority ("Is [brand] trustworthy?"), Recommendation ("Should I use [brand]?").
  5. Document the answers. Re-run quarterly.

Source: Haide Digital Parametric Visibility Kit

T4.5

Test dynamic visibility — same 5 prompts with web search ON

Strong on parametric does not guarantee strong on dynamic

Why

Dynamic visibility is whether the AI finds you when it searches in real time. It's a different strength from parametric visibility — you can be strong at one and weak at the other. Comparing both shows you exactly where the gap is: missing content (dynamic gap) or missing entity recognition (parametric gap).

How

  1. Same 5 prompts as T4.4, but with Web search ENABLED.
  2. ChatGPT: explicitly toggle "Search the web".
  3. Claude: web search activates automatically for some queries — confirm it's on.
  4. Compare to the parametric answers. The differences show whether the gap is a content gap or an entity gap.
  5. Document: brand mentioned (Y/N), context (positive/neutral/negative), citation source.

Source: Haide Digital Dynamic Visibility Kit

T4.6

Use the RESONEO Chrome extension to monitor query fan-out

See in real time what ChatGPT is searching for you

Why

When ChatGPT is in thinking mode, a single user prompt becomes 20+ parallel sub-queries — that's called query fan-out. The RESONEO Chrome extension exposes those sub-queries in real time. It's the closest thing available to seeing what an AI thinks the relevant aspects of your topic actually are.

How

  1. Install the RESONEO Chrome extension from the Chrome Web Store.
  2. Open ChatGPT (logged in), enable thinking mode.
  3. Run a branded query for yourself or a competitor.
  4. The extension shows every sub-query the model is firing.
  5. Analyse: which aspects does the model think are relevant? What's missing from your content?

Source: RESONEO — think.resoneo.com

T4.7

Track your ranking on Brave Search (it's the proxy for Claude)

86.7% overlap between Claude citations and Brave Search results

Why

Claude (made by Anthropic) uses Brave Search infrastructure for its web searches. 86.7% of Claude citations come from the top Brave Search results. By comparison, the ChatGPT/Bing overlap is only 26.7%. If you want to know whether Claude will cite you, check Brave first.

How

  1. Brave Search (search.brave.com) — anonymous, no personalisation bias.
  2. Test: your brand + 5–10 key commercial queries.
  3. Not in Brave's top 10 for branded queries → Claude probably isn't citing you.
  4. Brave runs its own index and crawler — it doesn't reuse Google.
  5. Optimising for Brave is a separate workstream, but the Claude overlap makes it worth doing.

Source: Anthropic Claude search infrastructure analysis

T4.8

Track citations manually each quarter (10 queries, 4 platforms)

Paid tools are expensive and imprecise

Why

Profound, Conductor and Semrush AI tracking cost €1,000+ a month and run on sample queries that often don't reflect real user behaviour. A manual baseline on 10 key queries gives you a more accurate picture for almost no cost. After four quarters you have a real trend line — something the paid tools struggle to deliver.

How

  1. Document 10 top queries for the business (mix of branded, non-branded and comparison).
  2. Spreadsheet: Query | Date | ChatGPT result | Claude result | Perplexity result | Gemini result.
  3. Quarterly review: what's mentioned, what isn't, sentiment, citation sources.
  4. Track the delta: which mentions are gained and which are lost quarter over quarter.
  5. After 4 quarters you have a real trend the paid tools can't match.

Source: Adrian Nikolov — Haide Digital methodology

Get Found. Stay Visible.

Haide Digital is an Organic Growth Engineering company. We build SEO, GEO, and AI-visibility systems for SaaS, B2B, and eCommerce brands. The live, interactive version of this checklist lives at haide.digital/tools/geo-checklist.