Research · 2026-05-14 · Updated 2026-07-22

LLM ranking factors: why ‘ranking’ is the wrong frame, and what correlates with AI citations.

There is no such thing as an LLM ranking factor. But there is a data-backed answer, and it starts with the largest public dataset on what correlates with AI recommendations: 100 industries, 1,100 buyer personas, 18 signals, 403,000 prompts across 12 models, read against four years of our own field work. AI systems are not ranking engines: source selection happens server-side and is not directly observable. The signals that do correlate swing hard by vertical, which is where generic advice goes wrong.

TL;DR

No single signal dominates. The strongest pooled correlation explains 11% of variance. The single best signal-and-vertical pairing in the entire dataset, Search Engine Appearances in auto insurance, reaches 39%, and just 8 of the 1,800 pairings clear 30%. Anyone selling a universal AI-visibility lever is selling theatre.
Citations come from two layers. Getting surfaced (the off-page signals below, calibrated by vertical) and being extractable once surfaced (on-page passage shape and preview control). This dataset measures the first. The second is where a page that ranks #1 still loses the citation.
SERP-derived signals correlate most consistently. Search Engine Appearances, Best Rank, and SE Outbound Links cluster in most verticals. Check the sample before leaning on any of them: the two rank signals sit on a median of 110 domains per industry against roughly 1,344 tracked, so they are measured on an already-filtered population. SE Outbound Links is the better-sampled of the group at a median of 498.
The pooled table understates everything. Averaging across 100 different industries dilutes every signal. Wikipedia Citations reads +0.055 pooled, the weakest of the eighteen, yet its median across industries is +0.315 and it is Dominant in 57 of them. Read your vertical, never the headline row.
Wikidata is category-dependent. +0.625 in cruises, +0.020 in renters insurance. Same work, thirty-fold difference in what it is worth.
One signal is a near-universal starting point. SE Outbound Links leads 45 of the 100 industries and is Dominant in 87. It counts how often the pages already ranking for your category link out to you, which is a different job from collecting backlinks. A universal starting point is not a universal lever: even here the correlation tops out at 31% of variance in one vertical and explains 11% pooled. It tells you where to begin, not what will work.
Homepage vocabulary is the second-strongest signal. ρ=+0.204 pooled, and Dominant in verticals like payroll (+0.408) and health insurance (+0.400). Not keyword density: what is measured is whether a homepage carries the vocabulary its industry and its buyers actually use, weighted by placement.

The dataset samples twelve models across five families: Claude, GPT, Gemini, DeepSeek, and GLM. It does not cover Perplexity or Google AI Overviews, whose behaviour we measure separately, and it scores recommendation rather than citation. Those are related but not identical outcomes.

The eighteen signals

OppAlerts correlated model recommendations against 18 web signals using Spearman rank correlation, across 15B+ crawled pages, 282B+ links, 26B+ Reddit comments, and 150,000 organic searches. The pooled baseline across all 100 industries looks like this.

Pooled across all 100 industries, July 2026 edition.
Signal	Spearman ρ	R²	n	Tier
SE Outbound Links	+0.331	11.0%	40,733	Dominant
Homepage Keywords	+0.204	4.2%	41,494	Strong
Domain PageRank	+0.192	3.7%	97,628	Confirmed
PageRank History	+0.183	3.3%	98,546	Confirmed
Search Engine Appearances	+0.165	2.7%	9,323	Confirmed
Common Crawl	+0.165	2.7%	62,178	Confirmed
Host Harmonic Centrality	+0.164	2.7%	97,280	Confirmed
Domain Backlinks	+0.160	2.6%	74,885	Confirmed
Host PageRank	+0.156	2.4%	97,280	Confirmed
Harmonic Centrality History	+0.153	2.4%	98,546	Confirmed
Domain Harmonic Centrality	+0.151	2.3%	97,628	Confirmed
Wikidata	+0.151	2.3%	12,300	Confirmed
Host Backlinks	+0.150	2.2%	65,691	Confirmed
Best Search Engine Rank	+0.148	2.2%	9,323	Confirmed
Reddit Comments	+0.148	2.2%	48,298	Confirmed
Reddit Posts	+0.128	1.6%	41,951	Confirmed
Avg Search Engine Rank	+0.077	0.6%	9,323	Emerging
Wikipedia Citations	+0.055	0.3%	24,499	Emerging

R² here is simply ρ², so each row is a single-signal rank correlation and the column cannot be summed into a total. The eight link-graph rows (the PageRank, Harmonic Centrality, and Backlinks variants) are heavily collinear: they describe largely the same underlying thing measured eight ways, not eight independent findings. Read the n column before trusting any row. Signals only present for large brands, like Wikidata at n=12,300 against Domain PageRank at n=97,628, are measured on an already-filtered population.
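To make the two columns concrete, here is a minimal sketch of a Spearman rank correlation and the ρ² conversion. The function names and the toy `signal` / `recommended` lists are hypothetical and are not OppAlerts data or code. Only the three pooled ρ values are taken from the table above.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5  # undefined if either input is constant


# Hypothetical toy data: one signal value and one recommendation count per domain.
signal = [12, 40, 7, 95, 33]
recommended = [3, 9, 1, 4, 8]
rho = spearman_rho(signal, recommended)

# The R² column is rho squared, shown for three pooled rows from the table.
pooled = {"SE Outbound Links": 0.331, "Homepage Keywords": 0.204, "Wikipedia Citations": 0.055}
r_squared_pct = {name: round(r * r * 100, 1) for name, r in pooled.items()}
print(rho, r_squared_pct)
```

Because each R² comes from one signal on its own, and several signals overlap, the percentages describe separate single-signal fits rather than shares of one total.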

Two things are missing from the test list: schema/JSON-LD and llms.txt. Neither was measured here, and not measuring something is not evidence against it, so this dataset says nothing either way. For schema, independent causal work settles it. Ahrefs ran 1,885 pages that added JSON-LD against 4,000 matched controls from Aug 2025 to Mar 2026: null effect on ChatGPT and AI Mode, and a small negative effect on Google AI Overviews. Five major LLMs ignored JSON-LD entirely during retrieval. Schema still earns its keep for classic Google rich results, but it does not move AI citations.

How to read that table

The pooled numbers are the most diluted version of the data

We pulled all 100 industry profiles and compared each signal's pooled figure against its median across the individual industries. Every signal scores higher per-industry than pooled, several of them dramatically. This is ordinary statistics: averaging one correlation across 100 heterogeneous categories washes it out. The consequence is that quoting the headline table, which is what a write-up usually does, understates every signal on the board.

Pooled figure against the median of the 100 individual industries, July 2026 edition. “Dominant” counts use the source's own tier label.
Signal	Pooled ρ	Median by industry	Dominant in
Wikipedia Citations	+0.055	+0.315	57 / 100
Search Engine Appearances	+0.165	+0.344	67 / 100
Reddit Comments	+0.148	+0.342	67 / 100
Common Crawl	+0.165	+0.353	76 / 100
Homepage Keywords	+0.204	+0.296	48 / 100
Wikidata	+0.151	+0.250	26 / 100
SE Outbound Links	+0.331	+0.413	87 / 100

Wikipedia Citations is the clearest case. Pooled, it is the weakest signal of the eighteen and easy to dismiss. Read per-industry, it is Dominant in 57 of 100 and tops three of them outright. Quote the pooled row and you write off a signal that is Dominant in more than half the board.
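The dilution effect can be reproduced on toy numbers. The sketch below uses two hypothetical industries (`industry_a`, `industry_b`) with invented values. Inside each one the signal orders domains perfectly, but the industries sit on different baselines, so the pooled correlation comes out lower than the per-industry median. It illustrates the statistical mechanism only, not the dataset.

```python
from statistics import median


def spearman_no_ties(xs, ys):
    """Spearman's rho via the d-squared formula; valid only when there are no ties."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        out = [0] * len(values)
        for rank, idx in enumerate(order, start=1):
            out[idx] = rank
        return out

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical toy industries: (signal value, recommendation count) per domain.
industries = {
    "industry_a": [(1, 30), (2, 40), (3, 50), (4, 60)],
    "industry_b": [(10, 35), (20, 45), (30, 55), (40, 65)],
}

per_industry = {
    name: spearman_no_ties([s for s, _ in rows], [r for _, r in rows])
    for name, rows in industries.items()
}
median_rho = median(per_industry.values())

# Pool every domain into one population and correlate again.
pooled_rows = [row for rows in industries.values() for row in rows]
pooled_rho = spearman_no_ties([s for s, _ in pooled_rows], [r for _, r in pooled_rows])

print(per_industry, median_rho, pooled_rho)
```

The signal scale differs by industry while the recommendation counts overlap, so the pooled ranking mixes two orderings that are each clean on their own.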

The second layer

Being quotable once you are surfaced

The eighteen signals above answer one question: will an AI engine find you in the index. They say nothing about the second question, which is where the “we rank but never get cited” cases we have worked on were decided. Once the engine has your page open, can it lift a clean answer out of it? A separate evidence-weighted meta-analysis (Zyppy / Signal, 54 experiments across ChatGPT, Gemini, and Perplexity) scores the on-page factors that decide it.

On-page factor	Weight	What it means
Preview Control (nosnippet)	9.2	A page can rank #1 and still be uncitable if a nosnippet or max-snippet directive tells engines not to quote it.
Query-Answer Match	9.2	Titles and subheads carry the literal query, not a creative variant.
Intent-Format Match	9.0	Listicle for "best," steps for "how-to," table for "vs," definition for "what is."
Answer Near the Top	8.8	Gemini caps how much text it reads per URL; an answer below that window is invisible.
AI-ready Structure	8.6	Clear headings, sections, and tables the model can extract without guessing.
Self-Contained Passages	8.0	Each load-bearing fact stated fully inside its own sentence or block.

Weights are evidence scores out of 10, ranking how repeatable and well-supported each factor is across the underlying studies. They are not promises.

The check we run first

Preview Control scores 9.2, and it is the one we most often find unchecked. A nosnippet or max-snippet:0 directive, often left in place by a plugin default or a legacy SEO config, lets a page rank first and still forbids any engine from quoting it. The page is crawlable, well written, and invisible to AI at the same time. It takes seconds to verify and minutes to fix, so it is the first thing we read on any domain where the rankings are strong but the citations are not.
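A minimal sketch of that check, using only the Python standard library. It looks for `nosnippet` or `max-snippet:0` in robots meta tags and in an `X-Robots-Tag` response header. The function and class names are hypothetical, the list of meta names is an assumption, and the sketch does not handle user-agent prefixes inside the header (such as `googlebot: nosnippet`) or the `data-nosnippet` attribute.

```python
from html.parser import HTMLParser
import re

# Meta tag names that are inspected; this list is an assumption.
ROBOTS_META_NAMES = {"robots", "googlebot"}


def split_directives(value):
    """Split 'noindex, max-snippet:0' into lowercase directive tokens."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and similar tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ROBOTS_META_NAMES:
            self.directives.extend(split_directives(attr.get("content", "")))


def snippet_blockers(html, headers=None):
    """Return the directives that forbid quoting: nosnippet or max-snippet:0."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = list(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            found.extend(split_directives(value))
    return [d for d in found if d == "nosnippet" or re.fullmatch(r"max-snippet:\s*0", d)]


page = '<html><head><meta name="robots" content="index, follow, max-snippet:0"></head></html>'
print(snippet_blockers(page))
print(snippet_blockers("<html></html>", {"X-Robots-Tag": "nosnippet"}))
```

An empty list means no blocking directive was found in the places this sketch looks. It does not prove the page is quotable.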

This layer also agrees on what does not work. Structured data lands near the bottom here (#20 of 23) and llms.txt ranks dead last (23rd, no measurable effect). The causal evidence above is harder still on schema, finding a null to slightly negative effect. Whichever dataset you trust, neither is a lever you can lean on for AI citations.

Four patterns to calibrate by

We grouped all 100 industries by which signal actually tops each one. Four configurations account for 92 of them, and each points at a different place to spend. The counts below are exact, not illustrative.

Pattern A

Citation-graph led · 45 of 100

SE Outbound Links tops the category. Airlines (+0.552), food delivery (+0.540), homeowners insurance (+0.528), streaming (+0.519), commercial banking (+0.491), luxury fashion (+0.486). The largest group by some distance.

Posture: earn placement on the pages that already rank for the category. Roundups, comparisons, buyer guides, trade press, the resource lists that sit on page one today. The work is digital PR and editorial placement aimed at a named set of URLs you can list before you start.

Pattern B

SERP-presence led · 24 of 100

Search Engine Appearances tops the category. Auto insurance (+0.626), credit cards (+0.569), cloud infrastructure (+0.556), HR payroll software (+0.544), life insurance (+0.543), homebuilders (+0.538). Regulated and high-consideration purchases cluster here.

Posture: classic organic depth. Ranking across more of the category's query space rather than higher for a few head terms. The association is consistent with models drawing on what the index already surfaces, though the dataset cannot establish that, and the signal sits on a thin sample.

Pattern C

Community and corpus led · 14 of 100

Reddit or Common Crawl tops the category. Auto OEM brands (Common Crawl +0.546), mattress stores (Reddit Posts +0.486), sneakers (+0.460), home centers (Reddit Comments +0.451), skincare (+0.444), haircare (+0.420). Considered consumer purchases where people research in public before buying.

Posture: presence in the communities where the category is actually discussed, and broad crawlable reach. This is the hardest group to fake and the one where a US-centric read misleads most, because Reddit is not where every market talks.

Pattern D

Entity led · 9 of 100

Wikidata or Wikipedia Citations tops the category. Cruises (+0.625), residential real estate brokerages (+0.580), self storage (+0.559), car rental (+0.530), fitness clubs (Wikipedia +0.391), cosmetic surgery clinics (+0.335). Categories where the buyer is choosing between named, recognisable operators.

Posture: structured entity disambiguation, notability, consistent identity across the reference layer. Worth leading with here and close to worthless in renters insurance (+0.020) or corporate tax advisory (+0.053), which is the whole argument for checking the vertical before writing the plan.

The remaining 8

Homepage vocabulary led

Data center colocation (+0.491), healthcare practice management software (+0.477), credit monitoring (+0.462), personal loan fintech (+0.460), baby care (+0.445). Technical and financial categories with precise, shared vocabulary that a buyer expects to see stated plainly on the homepage.

Posture: say what you do in the words the category uses, in the title, the description, and the body. The homepage is the load-bearing page here, not an afterthought.
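The grouping used in this section amounts to taking the top signal per industry and mapping it to a pattern. The sketch below shows that logic on partial profiles built only from figures quoted in this article. They are not full 18-signal rows, and it assumes "food delivery" here and "Food delivery platforms" in the Wikidata list are the same profile. The pattern labels and variable names are illustrative.

```python
from collections import Counter

# Signal -> pattern label, following the grouping in this section.
PATTERN_BY_SIGNAL = {
    "SE Outbound Links": "A: citation-graph led",
    "Search Engine Appearances": "B: SERP-presence led",
    "Reddit Posts": "C: community and corpus led",
    "Reddit Comments": "C: community and corpus led",
    "Common Crawl": "C: community and corpus led",
    "Wikidata": "D: entity led",
    "Wikipedia Citations": "D: entity led",
    "Homepage Keywords": "Homepage vocabulary led",
}

# Partial profiles: only the rho values quoted in this article.
profiles = {
    "cruises": {"Wikidata": 0.625},
    "food delivery": {"SE Outbound Links": 0.540, "Wikidata": 0.512},
    "auto insurance": {"Search Engine Appearances": 0.626},
    "mattress stores": {"Reddit Posts": 0.486},
    "data center colocation": {"Homepage Keywords": 0.491},
}


def top_signal(profile):
    """Name of the signal with the highest rho in one industry profile."""
    return max(profile, key=profile.get)


patterns = {name: PATTERN_BY_SIGNAL[top_signal(p)] for name, p in profiles.items()}
counts = Counter(patterns.values())

# Group sizes reported in this section; together they cover all 100 industries.
reported = {"A": 45, "B": 24, "C": 14, "D": 9, "Homepage vocabulary": 8}
total = sum(reported.values())
print(patterns, counts, total)
```

Food delivery shows why the lookup matters: its Wikidata figure is high, yet SE Outbound Links still tops the category, so it falls under Pattern A rather than Pattern D.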

Where generic advice does the most damage

Wikidata is the most variable signal in the dataset. It is a lead move in some categories and close to irrelevant in others, which is where generic AI-visibility recommendations fall apart. Same advice, thirty-fold difference in what it is worth.

Industry	Wikidata ρ
Cruises	+0.625
Residential real estate brokerages	+0.580
Self storage brands	+0.559
Car rental brands	+0.530
Food delivery platforms	+0.512
Median of all 100	+0.250
Med spas & aesthetic clinics	+0.078
Consumer banking	+0.077
E-signature & document workflow	+0.062
Corporate tax advisory	+0.053
Renters insurance	+0.020

Cruises sit at +0.625, where Wikidata is the single strongest signal in the category and entity work leads the plan. Renters insurance sits at +0.020, where the same work is a rounding error and the budget belongs elsewhere. Reading the pooled +0.151 and applying it everywhere gets both wrong. The median industry is +0.250, which describes almost none of them well.

How we use this

Six moves we run on every opportunity review where AI visibility is in scope.

01
Look up the vertical first.
Before any recommendation, we match the client's industry to the 100 OppAlerts profiles and pull the top three signals for that vertical. Skipping this step is the failure mode. Where no profile is a clean match, we say so rather than forcing the nearest one.
02
Lead with SERP work when in doubt.
SE Outbound Links is the strongest pooled signal at ρ=+0.331, and it is not a backlink count. It measures how often the pages that already rank for your category link out to you. When a new client has no clean vertical match, earning placement on those pages is the default opening move.
03
Check the vertical's entity weight, and the sample behind it.
Wikidata ranges from ρ=+0.625 in cruises to +0.020 in renters insurance, a thirty-fold spread that decides whether entity work is a lead move or an afterthought. We also read the n behind it, because entity signals are recorded for a small fraction of domains, a median of 196 in an industry tracking 1,344, and thin samples produce dramatic numbers that do not survive more data.
04
Treat Harmonic Centrality as its own signal.
Where you sit in the link graph carries information that raw link count misses, especially in trust-critical verticals like insurance, finance, and healthcare. The work is earning placement alongside category leaders.
05
Measure across every LLM that matters.
The dataset now spans twelve models, but it measures whether a domain gets recommended, not whether your specific pages get cited, and it does not cover Perplexity or AI Overviews. Our own multi-LLM citation test runs across ChatGPT, Claude, and Perplexity at engagement start and again every quarter, so the trajectory we report is the one the client is actually on.
06
Read the extractability layer in parallel.
Before any off-page work compounds, we confirm the pages can even be quoted: preview-control directives, answer position, and format-to-intent match. A domain that ranks well and cannot be quoted takes seconds to detect and minutes to fix, so it goes first.
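Move 03's caution about thin samples can be made concrete with a rough confidence interval. The sketch below uses the Fisher z-transform with a standard error of 1/sqrt(n - 3). That formula is derived for Pearson correlations and is only an approximation for Spearman's ρ, so treat it as an assumption. Pairing ρ=+0.625 with n=196 and n=1,344 is illustrative: the article does not state the sample size behind the cruises figure.

```python
import math


def spearman_ci(rho, n, z_crit=1.96):
    """Approximate 95% interval for rho via the Fisher z-transform.

    Assumption: standard error 1/sqrt(n - 3), the Pearson formula, used here
    as a rough stand-in for Spearman's rho.
    """
    z = math.atanh(rho)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# 196 is the median entity-signal sample and 1,344 the tracked-domain count
# quoted in move 03; rho is the cruises Wikidata figure.
intervals = {n: spearman_ci(0.625, n) for n in (196, 1344)}
for n, (low, high) in intervals.items():
    print(f"n={n}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

The same ρ carries a visibly wider interval on the smaller sample, which is why the n column gets read before the correlation does.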

What we won't sell you

Three things sold as AI-visibility levers that do not hold up as levers. None of them gets priced into a proposal on that basis.

Schema / JSON-LD as an AI-visibility lever

Ahrefs ran a causal study from Aug 2025 to Mar 2026: 1,885 pages adding JSON-LD against 4,000 matched controls. Null effect on ChatGPT, null on Google AI Mode, and a small negative effect on AI Overviews. Five major LLMs extracted only visible HTML during retrieval. Schema still earns its keep for classic Google rich results, but it does nothing for AI citations.

`llms.txt`

As a citation lever, it does not hold up. The Zyppy / Signal evidence-weighted meta-analysis ranks it last of 23 factors with no measurable effect on citations, and an independent teardown of ChatGPT's retrieval found it in none of the fetch pipelines. Publish one if you want a tidy, accurate index of your own site, which is why we publish ours. Just never buy it as a way to get cited, and never let anyone price it as one.

A single number that predicts your AI visibility

The strongest signal here explains 11% of the variance in whether a model recommends you. The other 89% is not in this dataset, or any other public one. Every vendor score that claims to tell you how visible you are, or where that score will be in six months, is inventing the confident part. We do not publish score forecasts, and we will not sell you one.

Want to see where your domain actually stands?

Every opportunity review includes a multi-LLM citation test across ChatGPT, Claude, and Perplexity, scored against the OppAlerts baseline for your industry. You see where you are getting cited, where competitors are getting cited in your place, and what to change first.


Sources

OppAlerts AI Search Visibility & Ranking Factors: 100 industries, 1,100 personas, 18 signals, 403,000 prompts across 12 models (July 2026 edition). All 100 industry profiles read for this analysis, not the headline table alone.
Zyppy / Signal: AI Citation Ranking Factors (Cyrus Shepard, 2026; 54 experiments, 23 evidence-weighted factors)
Ahrefs JSON-LD causal study (Aug 2025 to Mar 2026, 1,885 pages vs. 4,000 controls)
Suganthan Mohanadasan, “How ChatGPT Actually Picks Sources (I Read the Network Traffic, Not the Outputs)” (24 June 2026): ChatGPT retrieval reverse-engineered from raw network traffic, identifying four fetch pipelines. The author is explicit that the structural findings (pipelines, mechanisms) are the high-confidence half, while the frequency percentages are directional only. We use the structure, not the percentages. His follow-up of 14 July 2026 records the pipeline mix shifting between runs, a reminder to read the current edition of any source and treat a single snapshot as a snapshot.