robots.txt
Updated 2026-05-17
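robots.txt is a plain-text file served at the site root (for example, https://example.com/robots.txt) that tells web crawlers which paths they may or may not fetch. It is the canonical place to allow or disallow specific AI crawlers such as GPTBot, ClaudeBot, and PerplexityBot. Compliance is voluntary: the file is a request to crawlers, not access control. Compliant crawlers apply rules per User-agent group: a crawler uses the group that names it and falls back to the * group otherwise. Within a group, specificity matters more than order: under RFC 9309 the longest matching path rule wins and Allow wins a tie, though some older parsers apply the first matching rule instead.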
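To illustrate how per-User-agent groups and rule specificity interact, here is a minimal Python sketch of the selection logic as RFC 9309 describes it. It is a simplification, not a full parser: the function names pick_group and is_allowed and the sample rules are hypothetical, the rules are assumed to be already parsed into (directive, path) pairs, and wildcards (* and $) and percent-encoding are not handled.

```python
# Simplified sketch of RFC 9309 rule selection. Hypothetical helper names;
# no "*" / "$" wildcard support and no percent-encoding normalization.

def pick_group(groups, product_token):
    """Return the rules of the group naming this crawler, else the '*' group."""
    for token, rules in groups.items():
        if token != "*" and token.lower() == product_token.lower():
            return rules
    return groups.get("*", [])


def is_allowed(rules, path):
    """Longest matching path prefix wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for directive, prefix in rules:
        # An empty rule value matches nothing; skip rules that do not apply.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, allowed = len(prefix), is_allow
    return allowed


# Assumed sample rules, for illustration only.
groups = {
    "*": [("Disallow", "/private/")],
    "GPTBot": [("Disallow", "/"), ("Allow", "/public/")],
}

print(is_allowed(pick_group(groups, "GPTBot"), "/public/page"))   # True
print(is_allowed(pick_group(groups, "GPTBot"), "/other"))         # False
print(is_allowed(pick_group(groups, "SomeBot"), "/private/x"))    # False
print(is_allowed(pick_group(groups, "SomeBot"), "/blog"))         # True
```

In this sketch the GPTBot group blocks everything except /public/, because the longer Allow rule outranks the shorter Disallow rule regardless of the order in which they are listed.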
Minimal allow-all example
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
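One way to sanity-check a file like this before deploying it is Python's standard-library urllib.robotparser. The sketch below parses the allow-all example from an in-memory string rather than fetching it; the test URL is an assumption chosen for illustration, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# The allow-all example, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Any crawler falls into the "*" group, so every path should be fetchable.
allowed = parser.can_fetch("GPTBot", "https://example.com/any/page")
sitemaps = parser.site_maps()  # list of Sitemap URLs, or None if there are none

print(allowed)
print(sitemaps)
```

Note that urllib.robotparser applies the first matching rule within a group rather than the longest match, so its answers can differ from an RFC 9309 parser on files that mix overlapping Allow and Disallow rules.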
Block GPTBot specifically
User-agent: GPTBot
Disallow: /
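A quick check, again using the standard-library urllib.robotparser, can confirm that this group affects only GPTBot. The article URL is an assumption for illustration; because the file has no * group, crawlers that are not named are allowed by default.

```python
from urllib.robotparser import RobotFileParser

# The GPTBot-only block, one directive per line.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"  # assumed test URL
gptbot_ok = parser.can_fetch("GPTBot", url)
claudebot_ok = parser.can_fetch("ClaudeBot", url)

print(gptbot_ok)     # False: the GPTBot group disallows every path
print(claudebot_ok)  # True: no group names ClaudeBot and there is no "*" group
```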