The anatomy of an AI citation

Crawlmind Engineering · 5 min read

An AI citation is the moment a generative engine pulls a specific passage from your page into the answer it writes for a user, and credits your URL as the source. Understanding what gets pulled, and why, is the whole game in Generative Engine Optimization. A citation is not your whole page being read and summarized. It is one extractable chunk being lifted, attributed, and shown. The pages that win are the ones built so that a clean chunk is easy to find.

#A citation has three parts

Strip a citation down and it has three components: a query the engine is answering, a passage it lifts from your page, and an attribution link back to your URL. You only control the middle one. The passage is the unit that matters, so the practical question is never "is my page good?" It is "does my page contain a sentence or block the engine can lift verbatim and stand behind?"

That reframes the work. You are not writing to rank a page. You are seeding a page with self-contained, liftable passages, then making sure the engine can crawl and parse them.

#What actually gets quoted

The clearest picture of what engines lift comes from large citation audits. Wix's AI Search Lab analyzed 75,000 AI answers containing 1,056,727 citations across ChatGPT, Google AI Mode, and Perplexity. Three content types dominated: listicles at 21.9%, articles at 16.7%, and product pages at 13.7%, together more than half of all citations. The pattern held by intent. For commercial queries, listicles drew nearly 41% of citations, almost double their share elsewhere, because a "best X for Y" list is already shaped like the comparison a buyer asked for.

A separate audit by Omniscient Digital looked at 23,387 citations across 240 branded queries. It found that the social-proof bucket (reviews, listicles, forums, and case studies) took 57% of branded-query citations, while classic brand pages barely registered: About Us pages pulled 1.92% and FAQ pages 0.41%. The lesson is blunt. Engines quote pages that resolve a question, not pages that describe a company.

#The shape of a liftable passage

Content type tells you which pages get cited. The passage-level traits tell you which sentences do. The most useful field data here comes from an audit by analyst Adam Gnuse, reported in Search Engine Land, covering 15 domains and 7,500 ChatGPT referral sessions in September 2025.

The standout trait was what he called an answer capsule: a self-contained block of roughly 120 to 150 characters placed right after a title or H2, written to answer a question directly. 72.4% of cited blog posts contained one. Original or owned data was the second strongest trait, present in 52.2% of cited posts. One detail is easy to miss and worth acting on: more than nine in ten of those capsules, around 91%, contained no links at all. A liftable passage is clean. No inline links breaking it up, no hedging, no setup. Just the answer.

So the anatomy of a quotable passage looks like this. It sits high on the page, directly under a heading that matches the question. It is short enough to lift whole. It makes one specific claim. And where possible it carries a number or a named source, because a concrete claim is easier for an engine to stand behind than a vague one.
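As a rough illustration, those traits can be turned into a quick check on a single passage. The sketch below is Python and rests on several assumptions: the names `check_capsule` and `CapsuleReport` and the link pattern are hypothetical, and the 120 to 150 character range is simply the figure reported in the audit, not a rule any engine publishes.

```python
import re
from dataclasses import dataclass

# Assumed thresholds, taken from the audit figures quoted in the text.
CAPSULE_MIN_CHARS = 120
CAPSULE_MAX_CHARS = 150

# Rough link detector: Markdown links, HTML anchors, and bare URLs.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\([^)]*\)|<a\s|https?://", re.IGNORECASE)


@dataclass
class CapsuleReport:
    heading: str
    length: int
    in_length_range: bool  # short enough to lift whole
    link_free: bool        # no inline links breaking it up
    has_number: bool       # carries at least one digit


def check_capsule(heading: str, first_block: str) -> CapsuleReport:
    """Check the first block under a heading against the capsule traits."""
    text = " ".join(first_block.split())  # collapse runs of whitespace
    return CapsuleReport(
        heading=heading,
        length=len(text),
        in_length_range=CAPSULE_MIN_CHARS <= len(text) <= CAPSULE_MAX_CHARS,
        link_free=LINK_PATTERN.search(text) is None,
        has_number=re.search(r"\d", text) is not None,
    )


if __name__ == "__main__":
    report = check_capsule(
        "What is an AI citation?",
        "An AI citation is one passage lifted from your page and credited to your URL.",
    )
    print(report)
```

A digit is only a crude proxy for "a number or a named source," and a passage that fails the length check may still be liftable. Treat the report as a prompt for an editor, not a score.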

#Why concrete claims get lifted

This is not folklore. The original GEO research paper from Princeton, Georgia Tech, Allen AI, and IIT Delhi tested optimization methods across generative engines and found that adding citations, quotations, and statistics to a page could raise its visibility in generated answers by up to 40%. Keyword stuffing, the old SEO reflex, did not help. Making claims specific and sourced did.

The mechanism is intuitive once you picture the engine's job. It is assembling an answer and looking for sentences it can drop in with confidence. "Our platform is fast and reliable" is not liftable because it commits the engine to nothing. "Median crawl completes in under four seconds across the sites we audit" is liftable, because it is specific, attributable, and reads as a fact. Concrete claims survive the lift. Adjectives get summarized away.

#Each engine quotes differently

The three big engines do not pull from the same places, so a single page can get cited by one and ignored by another. In an independent study of 118,000 AI answers, only 11% of cited domains appeared across multiple platforms, which tells you how divergent their source pools are. Perplexity leans on discussion and forum content more than the others, with 17% of its citations coming from discussions, more than double the share for the other models. ChatGPT skews toward articles and encyclopedic references. Gemini and Google's AI surfaces pull a higher share from brand-owned, structured pages.
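If you log which domains each engine cites for your own query set, you can compute a comparable overlap figure. The sketch below is one plausible reading of "appeared across multiple platforms" (a domain cited by two or more engines). The study's exact method is not described here, so the function name `cross_platform_share` and the input shape are assumptions.

```python
from collections import Counter


def cross_platform_share(cited: dict[str, set[str]]) -> float:
    """Fraction of distinct cited domains that appear on two or more engines.

    `cited` maps an engine name to the set of domains it cited.
    """
    # Count how many engines cited each domain.
    engines_per_domain = Counter(
        domain for domains in cited.values() for domain in domains
    )
    if not engines_per_domain:
        return 0.0
    shared = sum(1 for n in engines_per_domain.values() if n >= 2)
    return shared / len(engines_per_domain)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "chatgpt": {"a.example", "b.example"},
        "perplexity": {"b.example", "c.example"},
        "gemini": {"d.example"},
    }
    print(cross_platform_share(sample))  # 1 shared domain out of 4 -> 0.25
```

A low share on your own data would point the same way as the study: each engine is drawing on a largely separate pool of sources.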

The takeaway is not to chase each engine separately. It is that breadth matters. A liftable claim that exists only on your own product page can reach Gemini and miss Perplexity entirely. The same claim, also present in a third-party review, a forum answer, and a comparison post, has three more doors into the engines that prefer those formats.

#How to use the anatomy

Turn it into an editing pass, not a rewrite. Take a page that should be getting cited and is not, and check four things. Does the first block under each heading answer the heading's question in one tight, link-free passage? Is every important claim specific enough to lift, with a number or named source where one honestly exists? Does the page resolve one question well, rather than ten questions vaguely? And does the claim live anywhere off your own domain, where the engines that distrust brand pages can still find it?
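The first of those checks is mechanical enough to script. The sketch below walks a page written with "#"-style headings, pulls the first paragraph under each one, and flags openings that are missing, contain a link, or run long. The names `first_blocks` and `audit` are hypothetical, and the 300-character ceiling is an arbitrary assumption, not a published threshold. The other three checks still need a human.

```python
import re

HEADING = re.compile(r"^#+\s*(.+)$")                  # "# Title" or "#Title"
LINK = re.compile(r"\[[^\]]*\]\([^)]*\)|https?://")   # Markdown links, bare URLs


def first_blocks(page: str) -> list[tuple[str, str]]:
    """Return (heading, first paragraph under it) pairs."""
    pairs: list[tuple[str, str]] = []
    heading = None
    block: list[str] = []
    closed = False  # True once the first paragraph under a heading has ended
    for line in page.splitlines():
        match = HEADING.match(line)
        if match:
            if heading is not None:
                pairs.append((heading, " ".join(block)))
            heading, block, closed = match.group(1).strip(), [], False
        elif heading is not None and not closed:
            if line.strip():
                block.append(line.strip())
            elif block:
                closed = True  # a blank line ends the first paragraph
    if heading is not None:
        pairs.append((heading, " ".join(block)))
    return pairs


def audit(page: str, max_chars: int = 300) -> list[str]:
    """List headings whose opening passage fails the first editing check."""
    problems = []
    for heading, block in first_blocks(page):
        if not block:
            problems.append(f"{heading}: no opening passage")
        elif LINK.search(block):
            problems.append(f"{heading}: opening passage contains a link")
        elif len(block) > max_chars:
            problems.append(f"{heading}: opening passage is over {max_chars} characters")
    return problems
```

Running `audit(page_text)` on a draft returns a short to-do list. An empty list means only that the openings are present, link-free, and short, not that they answer the heading's question.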

A citation is small. One passage, lifted and attributed. Build pages out of passages worth lifting, put them where the engine looks first, and keep them clean enough to quote without editing. That is the whole anatomy, and it is most of the work.
