Causal study · Primary source · 2026

Schema doesn't lift AI citations. Here's the experiment that proved it.

Ahrefs ran the only controlled study comparing pages that added JSON-LD schema against pages that didn't. Over seven months and three AI surfaces, the causal effect was flat or negative. This is what the methodology, numbers and caveats actually say.

Published May 16, 2026 · 14 min read
Sample
1,885: treated pages that added JSON-LD schema between August 2025 and March 2026 (Ahrefs · diff-in-diff causal study)
4,000: matched control pages with similar pre-treatment citation levels (same study · matched controls)
−4.6%: Google AI Overviews citations after adding schema, statistically significant (Ahrefs · 2026)
+2.4%: AI Mode citations, within the noise floor and not significant (same study)

Chapter 01

Why this study matters more than every other schema claim

For the last two years the SEO industry has been making two contradictory claims about schema and AI search. The first: "AI engines love structured data, add as much as you can." The second: "Schema doesn't matter for LLMs, content is all that matters." Both camps were arguing from intuition, not data.

In 2026 Ahrefs published the first causal study on the question. The methodology matters because the noise is enormous: pages that have schema also tend to have better SEO, more authority, faster sites, and cleaner content. Correlation is trivial. Causation is the only number worth quoting.

The team identified 1,885 pages that added JSON-LD schema between August 2025 and March 2026, and matched them against 4,000 control pages with similar pre-treatment citation rates. They then measured citation changes across Google AI Overviews, Google AI Mode, and ChatGPT using a difference-in-differences design and four separate statistical tests.
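To make the matching step concrete, here is a minimal sketch of nearest-neighbour matching on pre-treatment citation counts. It is an illustration only: the article does not describe the exact matching procedure Ahrefs used, and the function name `match_controls`, the page labels, the counts, and the choice of two controls per treated page are all assumptions.

```python
def match_controls(treated, candidates, k=2):
    """Greedy nearest-neighbour matching without replacement.

    treated, candidates: dict mapping a page label to its pre-treatment
    citation count (hypothetical data). Returns page label -> list of the
    k closest still-unused control labels.
    """
    available = dict(candidates)
    matches = {}
    for page, pre_count in sorted(treated.items()):
        # Rank unused controls by distance in pre-treatment citations;
        # the label is a tie-breaker so the result is deterministic.
        nearest = sorted(
            available, key=lambda c: (abs(available[c] - pre_count), c)
        )[:k]
        matches[page] = nearest
        for control in nearest:
            del available[control]  # each control is used at most once
    return matches


# Hypothetical pre-treatment citation counts.
treated = {"t1": 10, "t2": 50}
candidates = {"c1": 9, "c2": 12, "c3": 48, "c4": 55, "c5": 200}

matches = match_controls(treated, candidates)
print(matches)
```

The point of matching is that each treated page is compared with pages that looked similar before the change, so the outlier `c5` is never used as a baseline.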

Pre-treatment, the correlation looked like a slam dunk. Cited pages were nearly 3× more likely to have JSON-LD than uncited pages. If you stopped there — as most blog posts do — you would walk away convinced schema is a citation lever. The controlled experiment is what made the conclusion the opposite.

Layer | Value
Initial dataset analyzed for correlation | 6,000,000 URLs
Pages that added JSON-LD (treated), across 7 months | 1,885
Matched control pages, similar pre-treatment levels | 4,000
Statistical tests applied | 4 independent tests
AI surfaces measured | AI Overviews + AI Mode + ChatGPT
Nearly 3×: the raw correlation everyone quotes, which the controlled experiment showed was driven by confounders, not by schema itself
Ahrefs · pre-treatment analysis
Technically sophisticated sites have schema AND citations. The shared cause is technical SEO maturity. Stripping that confounder out is the only thing that gives you a causal answer.
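A toy simulation shows how a shared cause can manufacture this kind of correlation. Everything here is made up for illustration: the `mature` flag stands in for technical SEO maturity, and the probabilities are arbitrary values chosen so that the naive ratio lands near 3. By construction, schema has zero causal effect on citations in this model. It is not a reconstruction of the Ahrefs data.

```python
import random


def simulate(n=200_000, seed=0):
    """Toy model: maturity drives both schema adoption and citations."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        mature = rng.random() < 0.3
        has_schema = rng.random() < (0.9 if mature else 0.1)
        # Citation probability depends on maturity only, never on schema.
        cited = rng.random() < (0.40 if mature else 0.02)
        rows.append((mature, has_schema, cited))
    return rows


def share(rows, condition, outcome):
    """Fraction of rows satisfying `outcome` among those satisfying `condition`."""
    selected = [r for r in rows if condition(r)]
    return sum(outcome(r) for r in selected) / len(selected)


rows = simulate()

# Naive view: how much more common is schema among cited pages?
schema_if_cited = share(rows, lambda r: r[2], lambda r: r[1])
schema_if_uncited = share(rows, lambda r: not r[2], lambda r: r[1])
naive_ratio = schema_if_cited / schema_if_uncited


def schema_gap(mature):
    """Citation-rate gap (schema minus no schema) within one maturity group."""
    with_schema = share(rows, lambda r: r[0] == mature and r[1], lambda r: r[2])
    without = share(rows, lambda r: r[0] == mature and not r[1], lambda r: r[2])
    return with_schema - without


gap_mature = schema_gap(True)
gap_immature = schema_gap(False)

print(f"naive ratio: {naive_ratio:.2f}")
print(f"gap within mature sites: {gap_mature:+.3f}")
print(f"gap within immature sites: {gap_immature:+.3f}")
```

Cited pages come out roughly three times as likely to carry schema, yet within each maturity group the citation gap between schema and no-schema pages is close to zero. Controlling for the shared cause is what removes the apparent effect.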

Chapter 02

The numbers, surface by surface

The headline finding is that adding JSON-LD schema produced no major lift on any AI surface measured. On AI Mode the effect was +2.4%, within the noise floor and not statistically significant. On ChatGPT the effect was +2.2%, also not significant. On Google AI Overviews the effect was −4.6%, modest in absolute terms but statistically significant, and in the direction opposite to what the industry assumed.

Reading the −4.6% on AI Overviews requires care. It does not mean schema actively hurts you on average. It means that, among pages already being cited heavily, adding schema did not move them up and may have shifted some out of a particular surface. The study authors flag this as worth investigating, not as a finding to over-extrapolate.

The +2.4% / +2.2% on AI Mode and ChatGPT are easier to read. They are too small and too noisy to call a real effect. In practical terms, schema produced no measurable advantage on the two surfaces with the most aggressive citation behavior.
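What "within the noise floor" means can be sketched with a simple interval around an estimated effect. This uses a normal-approximation 95% interval on hypothetical per-page citation changes. It is not one of the four tests in the study, which this article does not specify, and with samples this small the approximation is crude.

```python
from math import sqrt
from statistics import mean, stdev


def effect_with_interval(treated_changes, control_changes, z=1.96):
    """Difference in mean change, with an approximate 95% interval."""
    effect = mean(treated_changes) - mean(control_changes)
    # Standard error of a difference between two independent means.
    se = sqrt(
        stdev(treated_changes) ** 2 / len(treated_changes)
        + stdev(control_changes) ** 2 / len(control_changes)
    )
    return effect, effect - z * se, effect + z * se


# Hypothetical change in citations per page (post minus pre).
treated_changes = [3, -2, 4, -1, 2, -3, 1, 0]
control_changes = [2, -3, 3, -2, 1, -2, 2, -1]

effect, low, high = effect_with_interval(treated_changes, control_changes)
print(f"effect: {effect:+.2f}, interval: [{low:.2f}, {high:.2f}]")
```

The point estimate is positive, but the interval comfortably spans zero, so the data cannot distinguish a small lift from no effect at all. That is the situation the +2.4% and +2.2% figures are described as being in.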

Surface | Effect of adding schema | Significance
Google AI Overviews | −4.6% | Significant (modest)
Google AI Mode | +2.4% | Not significant
ChatGPT Search | +2.2% | Not significant
0: AI surfaces where the causal effect of adding schema was both large and statistically significant in the positive direction (same study · across 4 tests)
Modest: the only significant signal was negative, on AI Overviews, and it was modest in absolute terms (same study)
Caveat
Where schema may still matter
The study's authors are explicit: these results apply only to pages already being cited heavily by AI. Schema may still help discoverability of pages that have never been cited — the experiment cannot answer that. It also pooled all JSON-LD types (Article, FAQ, Product, HowTo, Organization), so type-level effects are unknown.
Ahrefs · ahrefs.com/blog/schema-ai-citations

Chapter 03

What the methodology rules out

Diff-in-diff with matched controls is one of the cleaner research designs available outside of randomized trials. It works by comparing the change in citation rate for treated pages against the change for matched controls over the same period. If schema caused a lift, treated pages should accelerate relative to controls. They didn't.
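The arithmetic behind a difference-in-differences estimate is short enough to sketch directly. The citation counts below are hypothetical and the function name `diff_in_diff` is ours. The sketch shows only the core estimator, not the matching or the statistical tests layered on top of it in the study.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """(Change in treated group) minus (change in control group)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical citations per page, before and after the treatment date.
treated_pre = [10.0, 12.0, 8.0, 10.0]
treated_post = [11.0, 13.0, 9.0, 11.0]
control_pre = [10.0, 11.0, 9.0, 10.0]
control_post = [11.0, 12.0, 10.0, 11.0]

naive_change = mean(treated_post) - mean(treated_pre)
effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)

print(f"naive before/after change for treated pages: {naive_change:+.1f}")
print(f"difference-in-differences estimate: {effect:+.1f}")
```

A plain before/after comparison would credit schema with one extra citation per page. Because the controls rose by the same amount over the same period, the difference-in-differences estimate attributes none of that rise to schema.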

The four independent statistical tests reduce the chance that a single quirky model produced the result. The 7-month window is long enough that any short-term volatility in AI engines should average out. The 6M URL screening reduces the risk of an unrepresentative sample.

What the design can't rule out: indirect effects (schema improving discoverability before the citation horizon), type-level differences (Product schema may behave differently than FAQ), and effects on uncited pages. These are real limitations, but they narrow the claim rather than weaken it. The claim is narrow and well-supported: for already-cited pages, in the surfaces studied, schema is not a citation lever.

This is the strongest piece of evidence we have. Anything claiming the opposite has to either point to a better-designed experiment (there isn't one yet) or argue from confounded correlation (which the Ahrefs team explicitly tested and ruled out).

Question | What the study answers | What it doesn't
Does adding schema lift citations for already-cited pages? | No, on the surfaces measured |
Does schema help uncited pages get discovered? | Out of scope | Open question
Do specific schema types behave differently? | Pooled together | Out of scope
Is the effect different for E-E-A-T-heavy verticals? | Not segmented | Open
What changes when AI engines update their retrievers? | Snapshot, 7 months | Need replication
4: independent statistical tests applied to the same data, all converging on the same flat effect (methodology section)
7 mo: window of measurement, long enough to average out short-term engine volatility (Aug 2025 – Mar 2026)

Chapter 04

Three mistakes people are making with this study

Mistake — Reading −4.6% as "schema hurts you"
It doesn't. The effect is real but modest, and it's specific to AI Overviews on already-cited pages. Stripping schema from a non-FAQ site is reasonable; assuming schema is a citation penalty isn't.
Mistake — Generalizing to uncited pages
The study can't answer whether schema helps a page get discovered in the first place. Discovery and citation-ranking are different mechanisms. For new content, schema may still matter.
Mistake — Pooling all schema types into one rule
Ahrefs pooled Article + FAQ + Product + HowTo + Organization. A study isolating each type might find different effects; the pooled design cannot tell us either way. Treat the conclusion as "schema-in-aggregate," not as a verdict on Organization or Product.

Chapter 05

What we're doing with this evidence

01
Recalibrate GEO score weights
Our analyzers awarded heavy points for FAQPage and Article schema based on the old correlation. We're lowering those weights and reallocating to signals the Princeton GEO paper measured causally: inline citations, statistics, expert quotes.
Our roadmap
02
Stop telling users schema is "most cited by AI"
That claim has no primary source. The Ahrefs study is the closest thing to evidence and it points the other way. We're rewriting the analyzer copy to make claims we can defend.
Our roadmap
03
Add a discoverability-vs-citation split
The study's main caveat is that uncited pages may still benefit. We're separating GEO scoring into discoverability signals (where schema may help) and citation signals (where it doesn't, per the study).
Internal roadmap
04
Weight Princeton-validated tactics higher
Inline statistics with citations (+40% visibility) and authoritative quotations (+41% visibility) are the strongest causal levers in any peer-reviewed study to date. These deserve more weight than schema in any AI-search score.
arXiv:2311.09735
05
Detect schema-content mismatch
If schema is at best neutral, schema that lies about your page is strictly negative — wasted budget and a signal AI engines may discount. We're shipping a check for schema honesty, not schema coverage.
Our roadmap
06
Replicate before scaling conclusions
One study, even a good one, is not the final word. We're flagging the Ahrefs result as the current best evidence while watching for replications and type-level studies. If a follow-up flips the result, we flip our scoring.
Our policy
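The discoverability-versus-citation split in item 03 could be sketched as two separately normalized scores. This is an illustrative shape only, not our production scorer: the signal names, the weights, and the 0 to 1 signal scale are all placeholder assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str
    weight: float
    bucket: str  # "discoverability" or "citation"


# Placeholder weights: schema counts toward discoverability only, while
# the citation bucket holds the tactics named in the Princeton GEO paper.
SIGNALS = [
    Signal("valid_schema", 1.0, "discoverability"),
    Signal("inline_citations", 3.0, "citation"),
    Signal("statistics", 3.0, "citation"),
    Signal("expert_quotes", 3.0, "citation"),
]


def geo_scores(page_signals):
    """Return a 0-100 score per bucket from signal values in [0, 1]."""
    earned, possible = {}, {}
    for signal in SIGNALS:
        value = min(max(page_signals.get(signal.name, 0.0), 0.0), 1.0)
        earned[signal.bucket] = earned.get(signal.bucket, 0.0) + signal.weight * value
        possible[signal.bucket] = possible.get(signal.bucket, 0.0) + signal.weight
    return {b: round(100 * earned[b] / possible[b], 1) for b in earned}


# Hypothetical page: honest schema, some citations, good statistics, no quotes.
page = {"valid_schema": 1.0, "inline_citations": 0.5, "statistics": 1.0}
scores = geo_scores(page)
print(scores)
```

Keeping the buckets separate means a page cannot buy citation score with schema, which is the behaviour the study argues against rewarding.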

Chapter 06

How to think about schema in 2026

The Ahrefs study doesn't say "don't use schema." It says "don't use schema as your AI citation strategy." That distinction matters because schema still has uses — rich results in Bing, type-specific features in Google (Product, LocalBusiness, Event), and discoverability for new content. The right frame is below.

Use schema for what it's documented to do
Product / LocalBusiness / Event / Recipe schema still trigger documented features in Google Search. Use those types where they honestly describe the page.
Don't use schema to chase AI citations
The best causal evidence we have says it doesn't move the needle for already-cited pages. Put the budget on inline citations, statistics and original analysis instead.
Audit honesty before coverage
Schema that misrepresents content is the failure mode platforms have been punishing since 2023. Honest, narrow schema beats broad, aspirational schema every time.
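One way to approach an honesty audit is to check whether what the markup claims actually appears on the page. The sketch below is a deliberately naive version under stated assumptions: it handles a single top-level JSON-LD object (not arrays or `@graph`), checks only three text fields picked for illustration, and uses exact substring matching after normalizing case and whitespace. The function name `schema_mismatches` and the sample page are hypothetical.

```python
import json

# Fields compared against the visible text; an illustrative choice.
CHECKED_FIELDS = ("headline", "name", "description")


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(text.lower().split())


def schema_mismatches(json_ld, visible_text):
    """Return checked fields whose value does not appear in the page text."""
    data = json.loads(json_ld)
    if not isinstance(data, dict):
        return []  # arrays and other shapes are out of scope for this sketch
    page = normalize(visible_text)
    missing = []
    for field in CHECKED_FIELDS:
        value = data.get(field)
        if isinstance(value, str) and normalize(value) not in page:
            missing.append(field)
    return missing


# Hypothetical page whose markup describes a different product than the copy.
json_ld = (
    '{"@type": "Product", "name": "Trail Shoe X", '
    '"description": "Waterproof running shoe"}'
)
visible_text = "Trail Shoe X is a lightweight road shoe."

mismatches = schema_mismatches(json_ld, visible_text)
print(mismatches)
```

A real audit would need fuzzier matching and type-specific rules, but even this crude check flags markup that describes something the visible page never says.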
Primary sources
Industry analysis
Ahrefs. Does schema markup help with AI citations? (causal study, 2026). ahrefs.com
Joost de Valk. The FAQ schema cycle (May 2026). joost.blog
Official platform docs
Google Search Central. AI features and your website (Dec 2025). developers.google.com
Google Search Central. FAQPage structured data (May 2026 deprecation). developers.google.com
Studies
Aggarwal et al. GEO: Generative Engine Optimization (Princeton / IIT Delhi, KDD 2024). arxiv.org/abs/2311.09735
