Document order beats heading level for AI

Crawlmind Engineering · 3 min read

Heading hierarchy is the order your <h1> through <h6> tags appear in the source, not a ranking of their levels. That distinction matters because crawlers and AI extractors build a page's outline by walking the headings in document order, top to bottom, exactly as they sit in the HTML. If your source order and your visual order disagree, the machine reads the source order, and your carefully designed page structure can come out scrambled.

This is one of those issues that looks fine to a human and breaks silently for a parser.

#What "document order" means

When a parser extracts headings, it does not sort them by level. It reads them in the sequence they appear in the DOM. So this HTML:

<h2>Pricing</h2>
<h1>Our Product</h1>
<h3>Enterprise</h3>

produces the outline H2 → H1 → H3, not the tidy H1 → H2 → H3 you might assume. A human skimming the rendered page, where CSS may have repositioned things, sees a sensible layout. The parser sees a page that opens at level 2, jumps up to level 1, then down to level 3: disordered.
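To make the source-order walk concrete, here is a minimal sketch using Python's standard-library `html.parser`. The `HeadingCollector` class and `extract_headings` helper are names invented for this illustration, not the code any particular crawler runs; the point is only that nothing in the walk sorts by level.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs in the order the tags appear in the source."""

    def __init__(self):
        super().__init__()
        self.headings = []   # finished headings, in document order
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def extract_headings(html):
    """Return the headings of an HTML string, top to bottom, unsorted."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


html = "<h2>Pricing</h2>\n<h1>Our Product</h1>\n<h3>Enterprise</h3>"
headings = extract_headings(html)
outline = " → ".join(f"H{level}" for level, _ in headings)
print(outline)  # H2 → H1 → H3
```

The collector never compares levels against each other; it appends each heading as it closes, which is why the scrambled source comes out scrambled.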

Browsers never implemented the old HTML5 document-outline algorithm, and it was dropped from the WHATWG spec around 2022, so a heading's level is now taken literally in the order it appears. The HTML spec is explicit that the outline follows source order; the MDN heading-elements reference spells out the accessibility and structure rules that depend on it. Screen readers walk the same source order, so this is an accessibility issue as much as an SEO one.

#The skipped-level trap

The most common real failure is a skipped level: jumping from an <h1> straight to an <h3> with no <h2> between them. A clean outline goes at most one level deeper at a time, though it can climb back up any number of levels when a section ends (an <h4> followed by an <h2> is fine). A downward jump of two or more tells a parser a section is missing, and it has to guess how to nest what follows.
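The rule is easy to express as a check over the list of heading levels in source order. The following is a small sketch; `find_skipped_levels` is a hypothetical helper name, and it flags only downward jumps of two or more levels, treating a return to a shallower level as the normal end of a section.

```python
def find_skipped_levels(levels):
    """Return (index, previous, current) for each heading that sits more
    than one level deeper than the heading immediately before it."""
    skips = []
    for i in range(1, len(levels)):
        prev, cur = levels[i - 1], levels[i]
        if cur - prev > 1:
            skips.append((i, prev, cur))
    return skips


print(find_skipped_levels([1, 3, 4, 2]))     # [(1, 1, 3)]
print(find_skipped_levels([1, 2, 3, 2, 3]))  # []
```

In the first list the <h1> to <h3> jump is reported, while the later <h4> to <h2> step is not, because moving back up simply closes the deeper sections.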

Why it happens is almost always styling. A designer wants a heading that looks smaller, picks <h3> for its default size instead of styling an <h2> with CSS, and the document structure now lies about the page's shape. The fix is to choose the heading level for its meaning and control size with CSS.

#Why AI extractors care

AI answer engines lean on the heading outline to decide what a page is about and which chunk answers a given question. A clean outline tells the engine "this H2 is a top-level section, these H3s are its subsections," which makes the page easy to segment and quote. A scrambled or skipped outline forces the engine to fall back on weaker signals, and pages that are harder to segment get cited less.
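One way to picture that segmentation is a stack of open sections, where each heading closes any section at its own level or deeper and then nests under whatever remains. This is an illustrative sketch only: `section_paths` is an invented name, and no specific answer engine is known to work exactly this way.

```python
def section_paths(headings):
    """Map each heading to the chain of headings it nests under.

    `headings` is a list of (level, text) pairs in document order.
    """
    stack = []   # (level, text) of the sections currently open
    paths = []
    for level, text in headings:
        # A new heading closes every open section at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, text))
        paths.append(" > ".join(title for _, title in stack))
    return paths


clean = [(1, "Our Product"), (2, "Pricing"), (3, "Enterprise"), (2, "Support")]
scrambled = [(2, "Pricing"), (1, "Our Product"), (3, "Enterprise")]

print(section_paths(clean))
# ['Our Product', 'Our Product > Pricing',
#  'Our Product > Pricing > Enterprise', 'Our Product > Support']
print(section_paths(scrambled))
# ['Pricing', 'Our Product', 'Our Product > Enterprise']
```

With the clean outline, "Enterprise" lands under "Pricing". With the scrambled one, "Pricing" is orphaned and "Enterprise" attaches directly to "Our Product", so a chunk about enterprise pricing loses the context that would have made it quotable.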

The same outline drives accessibility, and the numbers there are striking. In WebAIM's 2024 screen-reader survey, roughly 68% of respondents said headings are how they navigate a page first, ahead of every other method. HTML has had exactly six heading levels since its earliest specifications, and WCAG, the accessibility standard published as version 1.0 in 1999, rewritten as 2.0 in 2008, and updated in 2.1 (2018) and 2.2 (2023), has expected headings to convey a page's real structure throughout. So the cost of a broken outline is not abstract: it is paid by the roughly 2 in 3 screen-reader users who rely on it and by every engine that mirrors their top-to-bottom reading.

It is the same principle behind a definitional opening sentence: structure the page so a machine can lift the right piece without guessing.

#How to check your own pages

You do not need special tooling. Two passes catch most problems:

  1. Read the headings in source order. View source (not the rendered page) and list the <h1>-<h6> tags top to bottom. Confirm there is exactly one <h1>, and that no heading sits more than one level deeper than the heading before it.
  2. Ignore the CSS. A heading that looks small can still be an <h2> in the markup, and that is what counts. Judge the tag, not the font size.
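If you would rather script the two passes than read them by eye, a rough sketch might look like the following. `audit_headings` is a hypothetical helper, and the regular expression is a deliberate shortcut: it reads opening heading tags straight from the source, so CSS plays no part, but it would also match tags inside comments or scripts, where a real HTML parser would be safer.

```python
import re


def audit_headings(html):
    """Return a list of human-readable problems found in the heading outline."""
    # Levels of the opening <h1>-<h6> tags, in source order.
    levels = [int(m) for m in re.findall(r"<h([1-6])\b", html, flags=re.IGNORECASE)]

    problems = []
    h1_count = levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one <h1>, found {h1_count}")

    # Compare each heading with the one before it.
    for prev, cur in zip(levels, levels[1:]):
        if cur - prev > 1:
            problems.append(f"<h{prev}> is followed by <h{cur}>, skipping a level")
    return problems


page = "<h1>Our Product</h1><h3>Enterprise</h3><h2>Pricing</h2>"
for problem in audit_headings(page):
    print(problem)
# <h1> is followed by <h3>, skipping a level
```

An empty list means both checks passed for that page's source.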

If you build pages from components, watch for a component that hard-codes its own heading level: dropped into different contexts, the same component can create a skip on one page and not another.
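One common way around this is to let the caller pass the heading level in, rather than baking it into the component. The sketch below uses a made-up `render_card` function as a stand-in for whatever templating or component system you actually use; only the idea of a level parameter is the point.

```python
from html import escape


def render_card(title, body, heading_level=2):
    """Render a hypothetical card whose heading level is chosen by the caller."""
    if not 1 <= heading_level <= 6:
        raise ValueError("heading_level must be between 1 and 6")
    tag = f"h{heading_level}"
    return f"<section><{tag}>{escape(title)}</{tag}><p>{escape(body)}</p></section>"


# Directly under the page's <h1>, the card uses an <h2>.
print(render_card("Pricing", "Plans for every team."))
# <section><h2>Pricing</h2><p>Plans for every team.</p></section>

# Nested inside an <h2> section, the same card uses an <h3>.
print(render_card("Enterprise", "Custom contracts.", heading_level=3))
# <section><h3>Enterprise</h3><p>Custom contracts.</p></section>
```

The visual size can stay identical in both placements, since that belongs to the stylesheet and not to the tag.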

#The summary

Heading hierarchy is read in document order, not by level, so the sequence of your <h1>-<h6> tags in the source is what crawlers, AI extractors, and screen readers actually see. Keep exactly one <h1>, never jump more than one level at a time, and pick levels for meaning while styling size with CSS. A clean source-order outline is one of the cheapest ways to make a page easy for an answer engine to read and cite.
