How to measure GEO with citation share

Crawlmind Engineering · 5 min read

Citation share is the percentage of AI-generated answers on a defined set of queries that cite your site as a source. It is the closest thing generative engine optimization (GEO) has to a north-star metric, and unlike a keyword ranking it does not exist until you sample for it. There is no public scoreboard for who ChatGPT, Perplexity, Gemini, or Google AI Overviews quoted yesterday. You have to ask the questions yourself, record the answers, and count.

That sampling requirement is the whole reason GEO measurement feels harder than SEO measurement. A rank tracker can pull a deterministic position for a keyword. An AI answer is generated fresh each time, varies between runs, and changes as the model and its index update. So before you can report a number, you have to decide what you are counting and how many times you are going to count it.

#Two things people call "citation share"

The term gets used for two different measurements, and they answer different questions.

Entity share of voice counts how often your brand is named in the answer text, whether or not your page is linked. This tracks whether the model "knows" you belong in the conversation. If someone asks "best GEO tools" and the answer lists you alongside three competitors, that is one entity mention for each of the four brands.

Citation share counts how often your page appears as a linked or footnoted source. This tracks whether your specific content is doing the work of grounding the answer. The two diverge constantly. A model can recommend your brand from memory while citing a competitor's blog post as the source, or cite your documentation while never naming you in prose.

Pick one as your primary metric and be explicit about which. For most GEO programs the citation count is the more actionable signal, because a citation is a thing you can earn by changing a page. An entity mention often depends on brand recognition that content alone cannot move quickly.
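
To make the distinction concrete, here is a minimal sketch that scores one sampled answer both ways. The `Answer` record, the brand names, and the `.example` domains are all hypothetical, and the substring match on the brand is a deliberate simplification (a real check would need to handle aliases and partial-word matches).

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Answer:
    """One sampled AI answer: the prose plus the URLs it linked or footnoted."""
    text: str
    cited_urls: list[str] = field(default_factory=list)


def has_entity_mention(answer: Answer, brand: str) -> bool:
    # Entity share of voice: the brand is named somewhere in the answer text.
    return brand.lower() in answer.text.lower()


def has_citation(answer: Answer, domain: str) -> bool:
    # Citation share: at least one cited URL is on the domain or a subdomain.
    for url in answer.cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


# Hypothetical case: the brand is recommended from memory,
# but a competitor's blog post is the linked source.
answer = Answer(
    text="Popular GEO tools include Acme and Globex.",
    cited_urls=["https://blog.globex.example/geo-tools"],
)
mentioned = has_entity_mention(answer, "Acme")
cited = has_citation(answer, "acme.example")
print(mentioned, cited)  # True False
```

The same answer counts toward entity share of voice and not toward citation share, which is why the two series should never be merged.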

#The formula

Citation share is a ratio:

Citation share (%) = (answers that cite your site / total answers sampled) × 100

The denominator is your query set, not the whole internet. That makes the metric only as meaningful as the questions you choose. A useful query set is the list of questions a real buyer or user would type while evaluating your category: definitional questions, "best X for Y" questions, comparison questions, and the specific problems your product solves. Write them down once, version them, and reuse the exact same set every measurement cycle. The moment the query set drifts, your trend line stops meaning anything.
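
As a worked example of the ratio, the sketch below computes citation share over a small, versioned query set. The queries and the True/False flags are made-up illustration data, and the names `QUERY_SET_V1` and `citation_share` are assumptions for this example, not part of any tool.

```python
def citation_share(cited_flags: list[bool]) -> float:
    """Percentage of sampled answers that cite the site."""
    if not cited_flags:
        raise ValueError("no answers sampled")
    return 100 * sum(cited_flags) / len(cited_flags)


# Hypothetical versioned query set: reuse the exact same list every cycle.
QUERY_SET_V1 = [
    "what is generative engine optimization",
    "best GEO tools for B2B SaaS",
    "GEO vs SEO",
    "how to measure AI citations",
]

# One flag per sampled answer: True if that answer cited our site (made-up data).
samples = {
    QUERY_SET_V1[0]: [True, False, True],
    QUERY_SET_V1[1]: [False, False, False],
    QUERY_SET_V1[2]: [False, True, False],
    QUERY_SET_V1[3]: [False, False, False],
}

# The denominator is every answer sampled across the query set.
all_flags = [flag for query in QUERY_SET_V1 for flag in samples[query]]
share = citation_share(all_flags)
print(share)  # 25.0 (3 citing answers out of 12 sampled)
```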

A more sensitive variant weights each citation by where it appears. The Princeton-led research team that introduced the term GEO built its evaluation around position-weighted visibility rather than a raw count, on the logic that a source quoted at the top of an answer shapes the response more than one buried at the bottom. Their controlled experiments found that content optimizations could boost visibility by up to 40% in generative engine responses. That is the kind of swing you only see if your metric registers where a citation lands, not just whether it exists.
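
One simple way to approximate position weighting is sketched below. It assumes a 1/rank weight on the order in which sources appear in the answer. That weighting is an illustrative choice for this post and is not the exact metric used in the research described above.

```python
from urllib.parse import urlparse


def on_domain(url: str, domain: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)


def position_weighted_share(cited_urls: list[str], domain: str) -> float:
    """Fraction of an answer's total position weight held by `domain` (0 to 1).

    Assumption: the citation at rank r (1 = first cited) gets weight 1 / r.
    """
    weights = [1 / rank for rank in range(1, len(cited_urls) + 1)]
    total = sum(weights)
    if total == 0:
        return 0.0  # the answer cited nothing
    ours = sum(w for w, url in zip(weights, cited_urls) if on_domain(url, domain))
    return ours / total


# Hypothetical answers with three sources each; our page is first or last.
top = position_weighted_share(
    ["https://acme.example/guide", "https://a.example/x", "https://b.example/y"],
    "acme.example",
)
bottom = position_weighted_share(
    ["https://a.example/x", "https://b.example/y", "https://acme.example/guide"],
    "acme.example",
)
print(round(top, 3), round(bottom, 3))  # 0.545 0.182
```

A raw count would score both answers identically, since each contains exactly one citation to the site.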

#Sample enough times to beat the noise

A single query run is close to worthless as a measurement. Ask the same question twice and you can get different sources, a different brand list, even a different overall recommendation. Point-in-time AI visibility is genuinely unstable, so a number pulled from one run tells you what the model said once, not what it tends to say.

The fix is repetition. Run each query in your set multiple times, on a fixed cadence, and average the result. The practitioner consensus that has formed around this lands at roughly 30 samples per query as the point where the average stabilizes enough to trust. You do not need to hit that exact figure, but you do need to treat a citation-share number as a distribution, not a reading. In our audits the difference between a one-shot check and a repeated sample is the difference between an anecdote and a metric.
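
To see why repetition matters, the sketch below puts an interval around a citation-share estimate. It uses the Wilson score interval, which is our choice for illustration rather than an established GEO convention, and it assumes each run is an independent yes/no trial, which real engine runs only approximate.

```python
import math


def wilson_interval(cited: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion cited / n."""
    if n <= 0:
        raise ValueError("need at least one sample")
    p = cited / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# A one-shot check that happened to cite us, versus 9 citations in 30 runs.
one_shot = wilson_interval(1, 1)
repeated = wilson_interval(9, 30)
print([round(100 * x) for x in one_shot])
print([round(100 * x) for x in repeated])
```

With these made-up counts, the single run is consistent with anything from roughly 20% to 100%, while 30 runs narrow the estimate to roughly 17% to 48%. That is still wide, which is a reason to report the range alongside the average.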

This is also why the trend matters more than any single value. Absolute citation share on a Tuesday is noise. Citation share averaged across 30 runs, tracked week over week, is signal. Report the direction.
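
Reporting the direction can be as simple as fitting a line through the weekly averages. The sketch below uses an ordinary least-squares slope. The weekly values are made-up, and `weekly_slope` is a hypothetical helper name.

```python
def weekly_slope(shares: list[float]) -> float:
    """Least-squares slope of citation share, in percentage points per week."""
    n = len(shares)
    if n < 2:
        raise ValueError("need at least two weekly values")
    mean_x = (n - 1) / 2  # weeks are indexed 0, 1, 2, ...
    mean_y = sum(shares) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(shares))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical weekly averages, each taken over repeated runs of the same query set.
weekly_share = [18.0, 21.0, 19.5, 23.0, 24.5]
slope = weekly_slope(weekly_share)
print(round(slope, 2))  # 1.5 points per week, despite the dip in week 3
```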

#Track it across engines, separately

ChatGPT, Perplexity, Gemini, Copilot, and Google AI Overviews do not share an index or a citation behavior, so a blended cross-engine score hides more than it shows. You can be the dominant source in Perplexity and invisible in Google AI Overviews at the same time. Keep a separate citation-share series per engine. When one engine moves and the others do not, that usually points to a specific cause: a crawler that was blocked, a page that got reindexed, or a schema change one engine reads and another ignores.
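
The sketch below shows how a blended score can mask a per-engine gap. The records are made-up `(engine, cited)` pairs, and the engine labels are just strings chosen for this example.

```python
from collections import defaultdict

# Made-up sample log: (engine, did this answer cite our site?)
records = [
    ("perplexity", True), ("perplexity", True),
    ("perplexity", True), ("perplexity", False),
    ("chatgpt", True), ("chatgpt", False),
    ("chatgpt", False), ("chatgpt", False),
    ("google_ai_overviews", False), ("google_ai_overviews", False),
    ("google_ai_overviews", False), ("google_ai_overviews", False),
]


def share_by_engine(rows: list[tuple[str, bool]]) -> dict[str, float]:
    """Citation share (%) computed separately for each engine."""
    totals = defaultdict(lambda: [0, 0])  # engine -> [cited, sampled]
    for engine, cited in rows:
        totals[engine][0] += int(cited)
        totals[engine][1] += 1
    return {engine: 100 * c / n for engine, (c, n) in totals.items()}


by_engine = share_by_engine(records)
blended = 100 * sum(cited for _, cited in records) / len(records)
print(by_engine)
print(round(blended, 1))
```

Here the blended figure is about 33%, which describes none of the three engines: the per-engine values are 75%, 25%, and 0%.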

#Why the number is worth the trouble

Citation share is a leading indicator for traffic that converts unusually well. Ahrefs measured its own funnel over a 30-day window and found that AI search drove 0.5% of traffic but 12.1% of signups, a conversion rate the company put at roughly 23 times its organic baseline. That is one company's data, so treat the exact multiple as illustrative rather than universal. The pattern behind it is the durable point: people arriving from an AI answer have already had your relevance vouched for by the model, so they show up further down the funnel. Citation share is the metric that tells you whether you are earning that traffic before it shows up in your analytics.

#A measurement loop you can actually run

Put it together as a repeatable cycle. Define a stable query set that mirrors real buyer questions. Decide whether your primary metric is entity mentions or citations, and stick with it. Sample each query enough times that the average settles, on a fixed schedule. Keep the series separate per engine. Then watch the slope, not the point.
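
The cycle above can be sketched as a single function. Everything here is an assumption for illustration: `ask` stands in for whatever client you use to query an engine and extract its cited URLs, and `fake_ask` is a deterministic stub so the example runs without network access.

```python
from typing import Callable
from urllib.parse import urlparse


def on_domain(url: str, domain: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)


def run_cycle(
    ask: Callable[[str, str], list[str]],  # (engine, query) -> cited URLs
    engines: list[str],
    queries: list[str],
    domain: str,
    runs_per_query: int = 30,
) -> dict[str, float]:
    """One measurement cycle: citation share (%) per engine, kept separate."""
    if not queries or runs_per_query < 1:
        raise ValueError("need a non-empty query set and at least one run")
    shares = {}
    for engine in engines:
        cited = 0
        total = 0
        for query in queries:
            for _ in range(runs_per_query):
                urls = ask(engine, query)
                total += 1
                if any(on_domain(u, domain) for u in urls):
                    cited += 1
        shares[engine] = 100 * cited / total
    return shares


def fake_ask(engine: str, query: str) -> list[str]:
    # Stub: pretend only one engine ever cites our (hypothetical) domain.
    if engine == "perplexity":
        return ["https://acme.example/guide"]
    return ["https://globex.example/post"]


result = run_cycle(
    fake_ask, ["perplexity", "chatgpt"], ["GEO vs SEO"], "acme.example", runs_per_query=5
)
print(result)  # {'perplexity': 100.0, 'chatgpt': 0.0}
```

Storing each cycle's result with a date and the query-set version is what lets you compare slopes later.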

The teams that win at GEO are not the ones with the highest reading on any given day. They are the ones who measured consistently enough to know which content change moved the line. If you want the underlying mechanics of what gets quoted in the first place, our breakdown of what AI engines actually cite covers the page-level signals that turn into citation share.
