Schema.org JSON-LD in the top 10K: what AI engines find

What we measured

We fetched the rendered HTML homepage of each Tranco top-10K site, extracted every <script type="application/ld+json"> block, and validated against the Schema.org type catalog as of 2026-04-01. We logged: presence, count, top-level @type, depth (max nesting), validation errors, and use of @id for cross-graph linking.
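The per-page bookkeeping described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the crawler we ran: the names JsonLdExtractor, max_depth, has_id, and summarize are hypothetical, and the depth definition (counting nested objects and letting arrays pass through) is an assumption.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def max_depth(node):
    """Assumed definition: each nested object adds one level; arrays add none."""
    if isinstance(node, dict):
        return 1 + max((max_depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return max((max_depth(v) for v in node), default=0)
    return 0


def has_id(node):
    """True if any object in the tree carries an @id key."""
    if isinstance(node, dict):
        return "@id" in node or any(has_id(v) for v in node.values())
    if isinstance(node, list):
        return any(has_id(v) for v in node)
    return False


def summarize(html):
    """Returns presence, count, top-level @type values, depth, and @id use."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    types, depth, uses_id, parse_errors = [], 0, False, 0
    for raw in parser.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            parse_errors += 1
            continue
        # A block may hold a single object or an array of objects.
        for item in doc if isinstance(doc, list) else [doc]:
            if isinstance(item, dict):
                types.append(item.get("@type"))
        depth = max(depth, max_depth(doc))
        uses_id = uses_id or has_id(doc)
    return {
        "present": bool(parser.blocks),
        "count": len(parser.blocks),
        "top_level_types": types,
        "max_depth": depth,
        "uses_id": uses_id,
        "parse_errors": parse_errors,
    }
```

A block that wraps its entities in @graph has no top-level @type, so this sketch records None for it; a fuller version would descend into the graph.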

Adoption by schema type

Type	Adoption rate
Organization	61.2%
WebSite (with SearchAction)	28.4%
BreadcrumbList	9.7%
FAQPage	8.9%
Article	7.1%
Product	6.6%
SoftwareApplication	2.4%
HowTo	1.1%
Event	0.9%
Recipe	0.6%

The types that matter most for AI citation (FAQPage, Article, and HowTo) are each still adopted by under 10% of sites. This is the single biggest GEO arbitrage opportunity available today: ship FAQPage and Article schema on every long-form page and you are in the top decile.

The most common errors

11.4% of sites with JSON-LD have at least one validation error. The top three:

Missing @context (42% of error cases): usually a copy-paste mistake from a tutorial that omitted the wrapper.
Wrong @type (28% of error cases): casing typos such as Website for WebSite, or non-existent types like Company (the correct type is Organization).
Wrong Organization.url (19% of error cases): points to a CDN, a marketing redirect, or localhost. AI engines use this URL to canonicalize the entity, so a wrong value silently fragments the entity graph.

Google Rich Results Test catches the first two; only Crawlmind catches the third.
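The three error classes above can be approximated with a few lines of Python. This is a hedged sketch, not the rule pack from our audit: KNOWN_TYPES is a tiny stand-in for the full Schema.org catalog, check_block and the error labels are hypothetical names, and the Organization.url check (the URL's host must match the host the page was fetched from, ignoring a leading www.) is an assumed heuristic.

```python
from urllib.parse import urlparse

# Assumption: a small stand-in for the Schema.org type catalog.
# A real check would load the full list of types.
KNOWN_TYPES = {
    "Organization", "WebSite", "BreadcrumbList", "FAQPage", "Article",
    "Product", "SoftwareApplication", "HowTo", "Event", "Recipe",
}


def _bare(host):
    """Lowercases a host name and drops a leading 'www.'."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def check_block(doc, site_host):
    """Returns a list of error labels for one parsed top-level JSON-LD object."""
    errors = []

    # 1. Missing @context.
    if "@context" not in doc:
        errors.append("missing-context")

    # 2. Wrong @type: type names are case-sensitive, so "Website" fails.
    declared = doc.get("@type")
    types = declared if isinstance(declared, list) else [declared]
    if declared is None:
        errors.append("missing-type")
    else:
        for t in types:
            if t not in KNOWN_TYPES:
                errors.append(f"unknown-type:{t}")

    # 3. Wrong Organization.url (assumed heuristic: host must match the site).
    if "Organization" in types:
        url_host = urlparse(str(doc.get("url", ""))).hostname or ""
        if _bare(url_host) != _bare(site_host):
            errors.append("organization-url-mismatch")

    return errors
```

For example, check_block({"@type": "Company"}, "example.com") reports both a missing @context and an unknown type. The host comparison will flag legitimate cases too, such as a subsidiary whose Organization.url points at a parent domain, so treat its output as a prompt for review.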

Are AI engines actually reading it?

Yes. We can confirm two specific signals: (1) AI answer engines preferentially cite pages that emit FAQPage schema, sometimes quoting the acceptedAnswer.text verbatim; (2) sites with a correct Organization block + @id are 3.4× more likely to have a Knowledge Panel-style entity description appear in ChatGPT and Perplexity answers about the brand. Both signals are stronger than the analogous Google ranking signals.

What this means for you

If you ship one piece of schema on your homepage, ship Organization with a correct url, logo, and @id. If you ship two, add WebSite with a SearchAction. If you ship three, add FAQPage to your highest-traffic landing pages. Getting these right costs one careful afternoon; the long-tail benefit from AI-engine citations keeps compounding.
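As a starting point, the three blocks recommended above might be generated like this. It is a sketch under assumptions: the function names, the "/#organization" and "/#website" @id suffixes, and the example.com URLs in the usage note are illustrative choices, not requirements of Schema.org.

```python
import json


def organization(site_url, name, logo_url):
    """Organization block with url, logo, and a stable @id."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": site_url.rstrip("/") + "/#organization",  # assumed @id convention
        "name": name,
        "url": site_url,
        "logo": logo_url,
    }


def website(site_url, name, search_url_template):
    """WebSite block with a SearchAction, linked to the Organization by @id."""
    base = site_url.rstrip("/")
    return {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "@id": base + "/#website",
        "url": site_url,
        "name": name,
        "publisher": {"@id": base + "/#organization"},
        "potentialAction": {
            "@type": "SearchAction",
            "target": {"@type": "EntryPoint", "urlTemplate": search_url_template},
            "query-input": "required name=search_term_string",
        },
    }


def faq_page(questions):
    """FAQPage block from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in questions
        ],
    }


def to_script_tag(doc):
    """Serializes a block for embedding in HTML."""
    # Escaping "</" as "<\/" keeps the JSON valid while preventing an answer
    # that contains "</script>" from closing the tag early.
    payload = json.dumps(doc, ensure_ascii=False, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Usage would look like to_script_tag(organization("https://example.com/", "Example", "https://example.com/logo.png")). Pointing the WebSite block's publisher at the Organization's @id is what ties the two blocks into a single graph.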

Methodology

Tranco top-10K list as of 2026-04-15. Homepages fetched with User-Agent: CrawlmindResearchBot/1.0, headless Chromium for full rendering, 30s timeout. JSON-LD blocks parsed with the schema-dts type catalog; validation runs the same rule pack as our free audit. Pages requiring authentication or returning non-200 (5.2% of the list) are excluded from the denominator. Raw data: contact [email protected].
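To make the denominator concrete, the arithmetic can be sketched as below. It assumes the list holds exactly 10,000 entries and that the published percentages are shares of the post-exclusion denominator; the resulting counts are rounded estimates derived from those percentages, not figures taken from the raw data.

```python
LIST_SIZE = 10_000       # assumption: the top-10K list has exactly 10,000 entries
EXCLUDED_SHARE = 0.052   # from the methodology: auth-gated or non-200 homepages


def denominator(list_size=LIST_SIZE, excluded_share=EXCLUDED_SHARE):
    """Number of sites left after exclusions, rounded to a whole site."""
    return round(list_size * (1 - excluded_share))


def implied_site_count(adoption_rate, base):
    """Approximate number of sites behind a published adoption rate."""
    return round(adoption_rate * base)


base = denominator()                              # 9480 sites
organization_sites = implied_site_count(0.612, base)   # Organization, 61.2%
faqpage_sites = implied_site_count(0.089, base)        # FAQPage, 8.9%
```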
