Causal study · Primary source · 2026
Schema doesn't lift AI citations. Here's the experiment that proved it.
Ahrefs ran the only controlled study comparing pages that added JSON-LD schema against pages that didn't. Over seven months and three AI surfaces, the causal effect was flat or negative. This is what the methodology, numbers and caveats actually say.
Chapter 01
Why this study matters more than every other schema claim
For the last two years the SEO industry has been making two contradictory claims about schema and AI search. The first: "AI engines love structured data, add as much as you can." The second: "Schema doesn't matter for LLMs, content is all that matters." Both camps were arguing from intuition, not data.
In 2026 Ahrefs published the first causal study on the question. The methodology matters because the noise is enormous: pages that have schema also tend to have better SEO, more authority, faster sites, and cleaner content. Correlation is trivial. Causation is the only number worth quoting.
The team identified 1,885 pages that added JSON-LD schema between August 2025 and March 2026, and matched them against 4,000 control pages with similar pre-treatment citation rates. They then measured citation changes across Google AI Overviews, Google AI Mode, and ChatGPT using a difference-in-differences design and four separate statistical tests.
Pre-treatment, the correlation looked like a slam dunk. Cited pages were nearly 3× more likely to have JSON-LD than uncited pages. If you stopped there, as most blog posts do, you would walk away convinced schema is a citation lever. The controlled experiment reversed that conclusion.
| Layer | Value |
|---|---|
| Initial dataset analyzed for correlation | 6,000,000 URLs |
| Pages that added JSON-LD (treated, across 7 months) | 1,885 |
| Matched control pages (similar pre-treatment levels) | 4,000 |
| Statistical tests applied | 4 independent tests |
| AI surfaces measured | AI Overviews + AI Mode + ChatGPT |
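Matching treated pages to controls on pre-treatment citation rate can be sketched roughly as follows. This is a minimal illustration, not the Ahrefs procedure: the function name `match_controls`, the `max_gap` tolerance, the greedy 1:1 pairing, and the example URLs and rates are all assumptions made for the sketch (the study itself used 4,000 controls for 1,885 treated pages, so it was not 1:1).

```python
from typing import Dict, List, Tuple


def match_controls(
    treated: Dict[str, float],
    candidates: Dict[str, float],
    max_gap: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbour matching on pre-treatment citation rate.

    Both dicts map a URL to its (hypothetical) pre-treatment citation rate.
    A treated page is left unmatched if no remaining candidate is within max_gap.
    """
    available = dict(candidates)  # copy so the caller's dict is not mutated
    pairs: List[Tuple[str, str]] = []
    for url, rate in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining candidate by absolute difference in citation rate.
        best = min(available, key=lambda c: abs(available[c] - rate))
        if abs(available[best] - rate) <= max_gap:
            pairs.append((url, best))
            del available[best]  # each control is used at most once
    return pairs


# Hypothetical example data, not taken from the study.
treated = {"/a": 0.10, "/b": 0.42}
candidates = {"/x": 0.11, "/y": 0.40, "/z": 0.90}
pairs = match_controls(treated, candidates)
print(pairs)
```

Here `/a` pairs with `/x` and `/b` with `/y`, while `/z` goes unused because no treated page started from a comparable citation rate. That is the point of matching: the comparison only uses controls that looked like the treated pages before schema was added.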
Chapter 02
The numbers, surface by surface
The headline finding is that adding JSON-LD schema produced no major lift on any AI surface measured. On AI Mode the effect was +2.4%, within the noise floor and not statistically significant. On ChatGPT the effect was +2.2%, also not significant. On Google AI Overviews the effect was −4.6%, modest in absolute terms but statistically significant, and in the opposite direction to what the industry assumed.
Reading the −4.6% on AI Overviews requires care. It does not mean schema actively hurts you on average. It means that, among pages already being cited heavily, adding schema did not move them up and may have shifted some out of a particular surface. The study authors flag this as worth investigating, not as a finding to over-extrapolate.
The +2.4% / +2.2% on AI Mode and ChatGPT are easier to read. They are too small and too noisy to call a real effect. In practical terms, schema produced no measurable advantage on the two surfaces with the most aggressive citation behavior.
| Surface | Effect of adding schema | Significance |
|---|---|---|
| Google AI Overviews | −4.6% | Significant (modest) |
| Google AI Mode | +2.4% | Not significant |
| ChatGPT Search | +2.2% | Not significant |
Chapter 03
What the methodology rules out
Diff-in-diff with matched controls is one of the cleaner research designs available outside of randomized trials. It works by comparing the change in citation rate for treated pages against the change for matched controls over the same period. If schema caused a lift, treated pages should accelerate relative to controls. They didn't.
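The arithmetic behind a difference-in-differences estimate is small enough to sketch. The example below is illustrative only: the function name `did_estimate` and all citation counts are made up, and the real study layered matching and several statistical tests on top of this basic comparison.

```python
from statistics import mean
from typing import Sequence


def did_estimate(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Difference-in-differences: treated change minus control change."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical citation counts per page, before and after the treatment date.
treated_pre = [10.0, 12.0, 8.0]
treated_post = [13.0, 15.0, 11.0]
control_pre = [9.0, 11.0, 10.0]
control_post = [12.0, 14.0, 13.0]

# A naive before/after comparison looks only at the treated pages.
naive = mean(treated_post) - mean(treated_pre)
did = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(naive, did)
```

In this made-up data the treated pages gain 3 citations on average, which a before/after comparison would credit to schema. The controls gain 3 as well, so the difference-in-differences estimate is 0: the whole rise is a background trend that both groups share.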
The four independent statistical tests reduce the chance that a single quirky model produced the result. The 7-month window is long enough that any short-term volatility in AI engines should average out. The 6M URL screening reduces the risk of an unrepresentative sample.
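The source does not name the four tests, so the following is not a reconstruction of any of them. It is a generic permutation test, shown only as one common way to ask whether a gap between treated and control changes could plausibly be noise. The function name `permutation_p_value`, its parameters, and the sample data are assumptions for illustration.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(
    treated_changes: Sequence[float],
    control_changes: Sequence[float],
    n_permutations: int = 2000,
    seed: int = 0,
) -> float:
    """Two-sided permutation test on the difference in mean change."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    observed = mean(treated_changes) - mean(control_changes)
    pooled = list(treated_changes) + list(control_changes)
    n_treated = len(treated_changes)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the gap.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical per-page citation changes, not data from the study.
clear_gap = permutation_p_value(
    [7.0, 8.0, 9.0, 10.0, 11.0, 12.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
)
no_gap = permutation_p_value([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
print(clear_gap, no_gap)
```

When the two groups are fully separated, almost no random relabeling reproduces the observed gap and the p-value is small. When the groups are identical, every relabeling matches or exceeds a gap of zero and the p-value is 1. Running several tests with different assumptions, as the study did, guards against a result that depends on one model's quirks.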
What the design can't rule out: indirect effects (schema improving discoverability before the citation horizon), type-level differences (Product schema may behave differently than FAQ), and effects on uncited pages. These are real limitations, but they narrow the claim rather than weaken it. The claim is narrow and well-supported: for already-cited pages, in the surfaces studied, schema is not a citation lever.
This is the strongest piece of evidence we have. Anything claiming the opposite has to either point to a better-designed experiment (there isn't one yet) or argue from confounded correlation (which the Ahrefs team explicitly tested and ruled out).
| Question | What the study answers | What it doesn't answer |
|---|---|---|
| Does adding schema lift citations for already-cited pages? | No, on the surfaces measured | — |
| Does schema help uncited pages get discovered? | Out of scope | Open question |
| Do specific schema types behave differently? | Pooled together | Out of scope |
| Is the effect different for E-E-A-T-heavy verticals? | Not segmented | Open |
| What changes when AI engines update their retrievers? | Snapshot, 7 months | Need replication |
Chapter 04
Three mistakes people are making with this study
Chapter 05
What we're doing with this evidence
Chapter 06
How to think about schema in 2026
The Ahrefs study doesn't say "don't use schema." It says "don't use schema as your AI citation strategy." That distinction matters because schema still has uses — rich results in Bing, type-specific features in Google (Product, LocalBusiness, Event), and discoverability for new content. The right frame is below.