
Why Google reports 404s for URLs you never created

Crawlmind Engineering · 3 min read

A /cdn-cgi/l/email-protection 404 in Search Console is a phantom error: it points to a URL your CDN generated, not one you ever published. If the Page indexing report (formerly Coverage) shows "Not found (404)" for a path under /cdn-cgi/ that you don't recognize, you didn't break anything. Cloudflare did something helpful that confused the crawler, and the fix takes one line.

This post explains where that URL comes from, why it 404s, and how to make it disappear from your reports.

#Where the URL comes from

Cloudflare has a feature called Email Address Obfuscation, on by default in its Scrape Shield settings. When your HTML contains a plain email address, like a mailto: link or text in your footer, Cloudflare rewrites it before the page reaches the browser. The visible email is replaced with an encoded token, and the link becomes something like:

<a href="/cdn-cgi/l/email-protection#a1b2c3d4e5">[email&#160;protected]</a>

When a human loads the page, Cloudflare's JavaScript decodes the token and restores the real address, so scrapers that don't run JS never see it. That's the point: it cuts the volume of address-harvest spam without you changing anything.
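To make the decoding step concrete, here is a minimal Python sketch of the scheme commonly described for these tokens: the first hex byte is an XOR key, and each following byte is one byte of the address XORed with that key. Treat this as an assumption drawn from public write-ups, not something this post or Cloudflare's documentation specifies; the function names are hypothetical. The token in the link above is a placeholder, so the sketch round-trips its own sample address instead of decoding it.

```python
def encode_token(email: str, key: int = 0x5A) -> str:
    """Build a hex token: one key byte, then each address byte XOR key.

    Assumed scheme for illustration only; `key` is an arbitrary byte.
    """
    payload = bytes(b ^ key for b in email.encode("utf-8"))
    return (bytes([key]) + payload).hex()


def decode_token(token: str) -> str:
    """Reverse encode_token: read the key byte, XOR it back out of the rest."""
    data = bytes.fromhex(token)
    key, payload = data[0], data[1:]
    return bytes(b ^ key for b in payload).decode("utf-8")


if __name__ == "__main__":
    token = encode_token("hello@example.com")
    print(token)                # opaque hex string, no '@' or readable text
    print(decode_token(token))  # the original address
```

The point of the sketch is that the address is recoverable only by running code, which is why a scraper that reads raw HTML without executing JavaScript comes away with nothing useful.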

#Why it 404s for crawlers

The catch is that /cdn-cgi/l/email-protection is not a real page on your site. The address is restored by Cloudflare's script running in the visitor's browser, not by fetching that URL. Googlebot discovers the link in your HTML, tries to fetch it like any other URL, and gets a 404 because the path has no document behind it. The obfuscated link usually lives in a site-wide element like the footer, so Google finds it on every page. Because the part after the # is never sent to the server, every copy collapses to the same URL, and Search Console logs a single 404 with your pages as referrers.
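A short sketch of why many obfuscated links produce one reported URL: the fragment (everything after the #) is dropped before a request is made, so links that differ only in their token resolve to the same fetch target. The page URLs and tokens below are made up for illustration, and `fetch_target` is a hypothetical helper, not part of any crawler.

```python
from urllib.parse import urldefrag, urljoin


def fetch_target(page_url: str, href: str) -> str:
    """Return the URL a client would actually request for a link on a page."""
    absolute = urljoin(page_url, href)      # resolve the root-relative href
    url, _fragment = urldefrag(absolute)    # fragments never reach the server
    return url


if __name__ == "__main__":
    links = [
        ("https://example.com/", "/cdn-cgi/l/email-protection#a1b2c3d4e5"),
        ("https://example.com/pricing", "/cdn-cgi/l/email-protection#0f1e2d3c4b"),
    ]
    targets = {fetch_target(page, href) for page, href in links}
    print(targets)  # a single URL, regardless of page or token
```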

It is harmless to users and to your rankings. But an unresolved 404 sitting in Search Console is noise, and noise hides real problems. It is worth clearing.

#The fix: one line in robots.txt

The clean fix is to tell crawlers not to fetch Cloudflare's internal paths at all. Add this to your robots.txt:

User-agent: *
Disallow: /cdn-cgi/

/cdn-cgi/ is Cloudflare's reserved namespace for its own endpoints (email protection, challenge pages, analytics beacons). Nothing under it is content you want indexed, so disallowing the whole prefix is safe. The directive follows the standard defined in RFC 9309, which every major crawler honors. Once Google recrawls robots.txt and sees the rule, it stops trying the email-protection URL and the 404 ages out of your report, typically within a couple of weeks.
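Before deploying, you can sanity-check the rule locally. The sketch below uses Python's standard `urllib.robotparser`; `example.com`, the sample paths, and the `is_allowed` helper are placeholders, and Python's parser only approximates how Googlebot evaluates rules.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /cdn-cgi/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    blocked = "https://example.com/cdn-cgi/l/email-protection"
    normal = "https://example.com/pricing"
    print(is_allowed(ROBOTS_TXT, "Googlebot", blocked))  # expected: False
    print(is_allowed(ROBOTS_TXT, "Googlebot", normal))   # expected: True
```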

If you maintain separate User-agent blocks per crawler, add the Disallow: /cdn-cgi/ line to each one, since crawlers obey only the most specific block that matches their name.
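The per-crawler caveat is easy to miss, so here is a hedged illustration using Python's standard `urllib.robotparser`. The `/drafts/` rule, the `SomeOtherBot` name, and `example.com` are invented for the example. A crawler with its own named group ignores the `*` group entirely, so the rule has to be repeated inside that group.

```python
from urllib.robotparser import RobotFileParser

URL = "https://example.com/cdn-cgi/l/email-protection"

# Googlebot has its own group, so the "*" group does not apply to it.
INCOMPLETE = """\
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow: /cdn-cgi/
"""

# Same file with the rule repeated in the Googlebot group.
FIXED = """\
User-agent: Googlebot
Disallow: /drafts/
Disallow: /cdn-cgi/

User-agent: *
Disallow: /cdn-cgi/
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate a robots.txt string for one crawler and one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_fetch(INCOMPLETE, "Googlebot", URL))     # still allowed
    print(can_fetch(INCOMPLETE, "SomeOtherBot", URL))  # blocked via "*"
    print(can_fetch(FIXED, "Googlebot", URL))          # now blocked
```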

#Two alternatives, and why robots.txt wins

You have two other options, both worse for most teams:

  1. Turn off Email Obfuscation in Cloudflare's dashboard. This removes the rewritten link, but it also exposes your real email addresses to the harvesters the feature was protecting you from.
  2. Replace plain-text emails with a contact form. A larger change that only helps if you were going to do it anyway.

Disallowing /cdn-cgi/ keeps the spam protection, needs no template changes, and is a single line. It is the right trade for almost everyone.

#One gotcha: the change won't show up immediately

robots.txt is itself usually cached at the CDN edge, often for hours. After you deploy the new rule, the public URL may keep serving the old version until the cache expires. Confirm the live file with a cache-busting request (append a throwaway query string) and, if your CDN supports it, purge robots.txt so the update is visible right away. Then use the Validate Fix button in Search Console to ask Google to recheck.
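One way to script that check, as a rough sketch: build a cache-busting URL, fetch it, and look for the rule. The `cb` parameter name and all function names are arbitrary choices for this example. Whether a query string actually bypasses the cache depends on how your CDN builds its cache key, so treat a passing check as a hint and not as proof.

```python
import time
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busting_url(url, token=None):
    """Append a throwaway query parameter so the edge is less likely to serve a cached copy."""
    token = token or str(int(time.time()))
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True) + [("cb", token)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def has_cdn_cgi_rule(robots_txt):
    """Return True if any line is 'Disallow: /cdn-cgi/' (does not check which group it is in)."""
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "disallow" and value.strip() == "/cdn-cgi/":
            return True
    return False


def fetch_live_robots(site):
    """Fetch robots.txt for a site such as 'https://example.com' with a cache-busting query."""
    url = cache_busting_url(site.rstrip("/") + "/robots.txt")
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder host; replace with your own site before running.
    print(has_cdn_cgi_rule(fetch_live_robots("https://example.com")))
```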

#The summary

A /cdn-cgi/l/email-protection 404 comes from Cloudflare's Email Obfuscation rewriting your on-page emails into links that only its JavaScript can decode; crawlers follow them and get a 404. Add Disallow: /cdn-cgi/ to robots.txt, purge the CDN cache so the rule goes live, and click Validate Fix. You keep the anti-spam protection and clear the phantom error in one line.
