Causal study · Primary source · 2026

Schema doesn't lift AI citations. Here's the experiment that proved it.

Ahrefs ran the only controlled study comparing pages that added JSON-LD schema against pages that didn't. Over seven months and three AI surfaces, the causal effect was flat or negative. This is what the methodology, numbers and caveats actually say.

Published May 16, 2026 · 14 min read

Sample

1,885

treated pages that added JSON-LD schema between August 2025 and March 2026

Ahrefs · diff-in-diff causal study

4,000

matched control pages with similar pre-treatment citation levels

Same study · matched controls

−4.6%

Google AI Overviews citations after adding schema (statistically significant)

Ahrefs · 2026

+2.4%

AI Mode citations — within the noise floor, not significant

Same study

Chapter 01

Why this study matters more than every other schema claim

For the last two years the SEO industry has been making two contradictory claims about schema and AI search. The first: "AI engines love structured data, add as much as you can." The second: "Schema doesn't matter for LLMs, content is all that matters." Both camps were arguing from intuition, not data.

In 2026 Ahrefs published the first causal study on the question. The methodology matters because the noise is enormous: pages that have schema also tend to have better SEO, more authority, faster sites, and cleaner content. Correlation is trivial. Causation is the only number worth quoting.

The team identified 1,885 pages that added JSON-LD schema between August 2025 and March 2026, and matched them against 4,000 control pages with similar pre-treatment citation rates. They then measured citation changes across Google AI Overviews, Google AI Mode, and ChatGPT using a difference-in-differences design and four separate statistical tests.
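To make the matching step concrete, here is a minimal sketch of pairing treated pages with controls that had similar pre-treatment citation rates. The page names, the rates and the nearest-neighbor rule with `k` controls per treated page are all hypothetical; the article does not describe the matching procedure Ahrefs actually used.

```python
def match_controls(treated, controls, k=2):
    """For each treated page, pick the k unused controls closest in pre-period citation rate."""
    available = dict(controls)  # copy so the caller's data is untouched
    matches = {}
    for page, rate in sorted(treated.items()):
        # Rank remaining controls by distance in pre-treatment citation rate.
        nearest = sorted(available, key=lambda c: (abs(available[c] - rate), c))[:k]
        matches[page] = nearest
        for c in nearest:
            del available[c]  # match without replacement
    return matches


# Hypothetical pre-treatment citation rates (share of tracked prompts citing the page).
treated = {"t1": 0.10, "t2": 0.30}
controls = {"c1": 0.09, "c2": 0.12, "c3": 0.29, "c4": 0.33, "c5": 0.60}

matches = match_controls(treated, controls)
print(matches)
```

The outlier control `c5` is never matched, which is the point of the step: comparisons are only made between pages that started from a similar baseline.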

Pre-treatment, the correlation looked like a slam dunk. Cited pages were nearly 3× more likely to have JSON-LD than uncited pages. If you stopped there, as most blog posts do, you would walk away convinced schema is a citation lever. The controlled experiment pointed the other way.

Layer	Value
Initial dataset analyzed for correlation	6,000,000 URLs
Pages that added JSON-LD (treated), across 7 months	1,885
Matched control pages, similar pre-treatment levels	4,000
Statistical tests applied	4 independent tests
AI surfaces measured	AI Overviews + AI Mode + ChatGPT

3×

the raw correlation everyone quotes — and that the controlled experiment showed was driven by confounders, not by schema itself

Ahrefs · pre-treatment analysis

Technically sophisticated sites have schema AND citations. The shared cause is technical SEO maturity. Stripping that confounder out is the only thing that gives you a causal answer.
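A toy illustration of how a shared cause can produce a strong raw correlation with no underlying effect. The page counts below are invented for the example and are not the study's data. They are constructed so that, inside each maturity group, schema and citation are independent.

```python
# Hypothetical page counts: (mature_site, has_schema, cited) -> number of pages.
# Within each maturity group, schema and citation are independent by construction.
counts = {
    (True, True, True): 400, (True, True, False): 400,
    (True, False, True): 100, (True, False, False): 100,
    (False, True, True): 45, (False, True, False): 855,
    (False, False, True): 405, (False, False, False): 7695,
}


def schema_share(cited, mature=None):
    """Share of pages with schema among cited (or uncited) pages, optionally within one maturity group."""
    rows = [(key, n) for key, n in counts.items()
            if key[2] == cited and (mature is None or key[0] == mature)]
    total = sum(n for _, n in rows)
    with_schema = sum(n for key, n in rows if key[1])
    return with_schema / total


# Ratio of "schema share among cited pages" to "schema share among uncited pages".
pooled_ratio = schema_share(True) / schema_share(False)
mature_ratio = schema_share(True, mature=True) / schema_share(False, mature=True)
immature_ratio = schema_share(True, mature=False) / schema_share(False, mature=False)

print(f"pooled: {pooled_ratio:.2f}x  mature: {mature_ratio:.2f}x  immature: {immature_ratio:.2f}x")
```

Pooled across all pages, cited pages are more than three times as likely to carry schema. Split by maturity, the ratio is 1 in both groups: the pooled gap comes entirely from mature sites having both more schema and more citations.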

Chapter 02

The numbers, surface by surface

The headline finding is that adding JSON-LD schema produced no major lift on any AI surface measured. On AI Mode the effect was +2.4%, within the noise floor and not statistically significant. On ChatGPT the effect was +2.2%, also not significant. On Google AI Overviews the effect was −4.6%, modest in absolute terms but statistically significant in the direction opposite to what the industry assumed.

Reading the −4.6% on AI Overviews requires care. It does not mean schema actively hurts you on average. It means that, among pages already being cited heavily, adding schema did not move them up and may have shifted some out of a particular surface. The study authors flag this as worth investigating, not as a finding to over-extrapolate.

The +2.4% / +2.2% on AI Mode and ChatGPT are easier to read. They are too small and too noisy to call a real effect. In practical terms, schema produced no measurable advantage on the two surfaces with the most aggressive citation behavior.

Surface	Effect of adding schema	Significance
Google AI Overviews	−4.6%	Significant (modest)
Google AI Mode	+2.4%	Not significant
ChatGPT Search	+2.2%	Not significant
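One way to read "not significant" in the table above: the uncertainty interval around the estimated effect includes zero. The sketch below uses a normal-approximation 95% interval on made-up per-page citation changes. It is an illustration of the idea only; the four tests Ahrefs ran are not specified in this article, and with samples this small the approximation is rough.

```python
from math import sqrt
from statistics import mean, variance


def diff_ci(treated_changes, control_changes, z=1.96):
    """Approximate 95% interval for the difference in mean change (normal approximation)."""
    diff = mean(treated_changes) - mean(control_changes)
    # Standard error of a difference between two independent sample means.
    se = sqrt(variance(treated_changes) / len(treated_changes)
              + variance(control_changes) / len(control_changes))
    return diff - z * se, diff + z * se


# Hypothetical change in citations per page (post minus pre) for each group.
treated_changes = [3, -2, 5, -4, 1, 0, 4, -3]
control_changes = [2, -3, 4, -4, 1, -1, 3, -2]

low, high = diff_ci(treated_changes, control_changes)
significant = not (low <= 0 <= high)
print(f"interval: ({low:.2f}, {high:.2f})  significant: {significant}")
```

Here the treated group's average change is slightly higher, but the interval spans zero, so the difference cannot be told apart from noise.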

0

AI surfaces where the causal effect of adding schema was both large and statistically significant in the positive direction

Same study · across 4 tests

Modest

the only significant signal was negative, on AI Overviews — and it was modest in absolute terms

Same study

Caveat

Where schema may still matter

The study's authors are explicit: these results apply only to pages already being cited heavily by AI. Schema may still help discoverability of pages that have never been cited — the experiment cannot answer that. It also pooled all JSON-LD types (Article, FAQ, Product, HowTo, Organization), so type-level effects are unknown.

Ahrefs · ahrefs.com/blog/schema-ai-citations

Chapter 03

What the methodology rules out

Diff-in-diff with matched controls is one of the cleaner research designs available outside of randomized trials. It works by comparing the change in citation rate for treated pages against the change for matched controls over the same period. If schema caused a lift, treated pages should accelerate relative to controls. They didn't.
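The core arithmetic of a difference-in-differences estimate fits in a few lines. This is a minimal sketch with hypothetical citation counts, not the study's data or its exact estimator (which may include adjustments not described here).

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical citations per page, before and after the treated pages added schema.
treated_pre, treated_post = [10, 12, 8], [12, 14, 10]
control_pre, control_post = [9, 11, 10], [11, 13, 12]

naive_change = mean(treated_post) - mean(treated_pre)
effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"naive before/after change: {naive_change}, diff-in-diff effect: {effect}")
```

In this toy data a naive before/after comparison would credit schema with +2 citations per page. Because the controls rose by the same amount over the same period, the diff-in-diff estimate is zero.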

The four independent statistical tests reduce the chance that a single quirky model produced the result. The 7-month window is long enough that any short-term volatility in AI engines should average out. The 6M URL screening reduces the risk of an unrepresentative sample.

What the design can't rule out: indirect effects (schema improving discoverability before the citation horizon), type-level differences (Product schema may behave differently than FAQ), and effects on uncited pages. These are real limitations, but they constrain the claim, not weaken it. The claim is narrow and well-supported: for already-cited pages, in the surfaces studied, schema is not a citation lever.

This is the strongest piece of evidence we have. Anything claiming the opposite has to either point to a better-designed experiment (there isn't one yet) or argue from confounded correlation (which the Ahrefs team explicitly tested and ruled out).

Question	What the study answers	What it doesn't
Does adding schema lift citations for already-cited pages?	No, on the surfaces measured	—
Does schema help uncited pages get discovered?	Out of scope	Open question
Do specific schema types behave differently?	Pooled together	Out of scope
Is the effect different for E-E-A-T-heavy verticals?	Not segmented	Open
What changes when AI engines update their retrievers?	Snapshot, 7 months	Need replication

4

independent statistical tests applied to the same data, all converging on the same flat effect

Methodology section

7 mo

window of measurement, long enough to average out short-term engine volatility

Aug 2025 – Mar 2026

Chapter 04

Three mistakes people are making with this study

Mistake — Reading −4.6% as "schema hurts you"

It doesn't. The effect is real but modest, and it's specific to AI Overviews on already-cited pages. Stripping schema from a non-FAQ site is reasonable; assuming schema is a citation penalty isn't.

Mistake — Generalizing to uncited pages

The study can't answer whether schema helps a page get discovered in the first place. Discovery and citation-ranking are different mechanisms. For new content, schema may still matter.

Mistake — Pooling all schema types into one rule

Ahrefs pooled Article + FAQ + Product + HowTo + Organization. A study isolating each type would likely find different effects. Treat the conclusion as "schema-in-aggregate," not as a verdict on Organization or Product.

Chapter 05

What we're doing with this evidence

Recalibrate GEO score weights

Our analyzers awarded heavy points for FAQPage and Article schema based on the old correlation. We're lowering those weights and reallocating to signals the Princeton GEO paper measured causally: inline citations, statistics, expert quotes.

Our roadmap

Stop telling users schema is "most cited by AI"

That claim has no primary source. The Ahrefs study is the closest thing to evidence and it points the other way. We're rewriting the analyzer copy to make claims we can defend.

Our roadmap

Add a discoverability-vs-citation split

The study's main caveat is that uncited pages may still benefit. We're separating GEO scoring into discoverability signals (where schema may help) and citation signals (where it doesn't, per the study).

Internal roadmap

Weight Princeton-validated tactics higher

Inline statistics with citations and authoritative quotations are the strongest content levers the Princeton GEO study measured. Its methods lift visibility by up to 40% overall, with quotations the single strongest. These deserve more weight than schema in any AI-search score.

arXiv:2311.09735

Detect schema-content mismatch

If schema is at best neutral, schema that lies about your page is strictly negative — wasted budget and a signal AI engines may discount. We're shipping a check for schema honesty, not schema coverage.

Our roadmap

Replicate before scaling conclusions

One study, even a good one, is not the final word. We're flagging the Ahrefs result as the current best evidence while watching for replications and type-level studies. If a follow-up flips the result, we flip our scoring.

Our policy
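As an illustration of the schema-honesty check described above, the sketch below flags FAQPage questions that are declared in JSON-LD but never appear in the page's visible text. The function name and the substring-matching rule are assumptions made for this example, not our shipped implementation. It also assumes a single top-level FAQPage object whose `mainEntity` is a list.

```python
import json
import re


def unsupported_faq_questions(json_ld, visible_text):
    """Return FAQPage questions declared in JSON-LD that do not appear in the visible text."""
    def normalize(s):
        # Collapse whitespace and ignore case so formatting differences don't cause false alarms.
        return re.sub(r"\s+", " ", s).strip().lower()

    data = json.loads(json_ld)
    if data.get("@type") != "FAQPage":
        return []
    page = normalize(visible_text)
    missing = []
    for item in data.get("mainEntity", []):
        question = item.get("name", "")
        if normalize(question) not in page:
            missing.append(question)
    return missing


# Hypothetical page: one declared question is on the page, one is not.
json_ld = json.dumps({
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "How long does shipping take?",
         "acceptedAnswer": {"@type": "Answer", "text": "Three to five days."}},
        {"@type": "Question", "name": "Do you offer refunds?",
         "acceptedAnswer": {"@type": "Answer", "text": "Yes, within 30 days."}},
    ],
})
visible_text = "FAQ\nHow long does   shipping take?\nThree to five days."

missing = unsupported_faq_questions(json_ld, visible_text)
print(missing)
```

A production check would also need to handle `@graph` wrappers, multiple JSON-LD blocks per page and other schema types; the sketch covers only the simplest case.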

Chapter 06

How to think about schema in 2026

The Ahrefs study doesn't say "don't use schema." It says "don't use schema as your AI citation strategy." That distinction matters because schema still has uses — rich results in Bing, type-specific features in Google (Product, LocalBusiness, Event), and discoverability for new content. The right frame is below.

Use schema for what it's documented to do

Product / LocalBusiness / Event / Recipe schema still trigger documented features in Google Search. Use those types where they honestly describe the page.

Don't use schema to chase AI citations

The best causal evidence we have says it doesn't move the needle for already-cited pages. Put the budget on inline citations, statistics and original analysis instead.

Audit honesty before coverage

Schema that misrepresents content is the failure mode platforms have been punishing since 2023. Honest, narrow schema beats broad, aspirational schema every time.

Primary sources

Industry analysis

Ahrefs: Does schema markup help with AI citations? (causal study, 2026) · ahrefs.com

Joost de Valk: The FAQ schema cycle (May 2026) · joost.blog

Official platform docs

Google Search Central: AI features and your website (Dec 2025) · developers.google.com

Google Search Central: FAQPage structured data (May 2026 deprecation) · developers.google.com

Studies

Aggarwal et al.: GEO: Generative Engine Optimization (Princeton / IIT Delhi, KDD 2024) · arxiv.org/abs/2311.09735

