ClaudeBot: how to allow Anthropic to crawl your site
Updated 2026-05-17 · by the Crawlmind team
ClaudeBot is Anthropic's primary web crawler. It fetches publicly available pages to gather data used for training future Claude models and to retrieve content when a Claude user explicitly browses a URL. Allow ClaudeBot in robots.txt if you want your content to be eligible for inclusion in Anthropic's training corpus and for citation in Claude.ai answers; block it if you don't.
Allow ClaudeBot
Add this block to your robots.txt:
User-agent: ClaudeBot
Allow: /
ClaudeBot will fetch pages reachable from your sitemap and inbound links. There is no separate "allow training but block citation" toggle — ClaudeBot covers both today.
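One way to sanity-check an allow rule before deploying it is to run the file through a robots.txt parser. The sketch below uses Python's standard-library `urllib.robotparser`. The wildcard `Disallow: /` group, the `example.com` URL, and the `SomeOtherBot` name are assumptions added for illustration, so that the ClaudeBot allow rule has something to override. Python's parser is only one implementation and is not the code ClaudeBot itself runs.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: the ClaudeBot group comes from this guide;
# the wildcard group is an assumption added so the allow rule matters.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this parser would let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the bare product token ("ClaudeBot"), not a full
    # "Mozilla/5.0 ..." header: the parser only looks at the text
    # before the first "/" in the string it is given.
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/post"  # hypothetical URL
    print("ClaudeBot:", is_allowed(ROBOTS_TXT, "ClaudeBot", page))
    print("SomeOtherBot:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

A local check like this only confirms how one parser reads the rules. It says nothing about whether the file is actually served at `/robots.txt` on your host.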
Block ClaudeBot
If you want to opt out:
User-agent: ClaudeBot
Disallow: /
Anthropic has also used the older anthropic-ai and Claude-Web user-agents, and some requests still identify with them. To be exhaustive, block all three by adding these two groups alongside the ClaudeBot block:
User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /
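To confirm that an opt-out really covers all three names, a small check along these lines may help. It is a sketch using Python's standard-library `urllib.robotparser`. The function name `blocked_tokens` and the `example.com` URL are hypothetical, and the robots.txt content is simply the three blocks from this section combined.

```python
from urllib.robotparser import RobotFileParser

# The three Anthropic user-agent tokens named in this guide.
ANTHROPIC_TOKENS = ("ClaudeBot", "anthropic-ai", "Claude-Web")

# The blocks from this section combined into one file.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /
"""


def blocked_tokens(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Map each Anthropic token to True if `url` is disallowed for it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: not parser.can_fetch(token, url) for token in ANTHROPIC_TOKENS}


if __name__ == "__main__":
    for token, blocked in blocked_tokens(ROBOTS_TXT).items():
        print(f"{token}: {'blocked' if blocked else 'NOT blocked'}")
```

Running the same function on a file that only names anthropic-ai would report ClaudeBot and Claude-Web as not blocked, which is the gap this section warns about.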
ClaudeBot vs anthropic-ai vs Claude-Web
Anthropic has shipped three crawler user-agents over the years:
| User-agent | Status | Used for |
|---|---|---|
| ClaudeBot | Current | Training + retrieval (the primary bot today) |
| anthropic-ai | Deprecated / legacy | Older training fetches; still seen in some logs |
| Claude-Web | Deprecated / legacy | Older user-triggered fetches |
If you have an existing User-agent: anthropic-ai block, keep it — but also add a User-agent: ClaudeBot block, because most current crawls identify as ClaudeBot, not anthropic-ai.
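To see which of the three names actually shows up on your own site, you can tally them in your access logs. The following is a minimal sketch that does a case-insensitive substring match on each log line. The function name and the sample log lines (including their user-agent strings and IP addresses) are made up for illustration. User-agent headers can be spoofed, so treat the counts as a rough signal only.

```python
from collections import Counter

# The three Anthropic user-agent tokens named in this guide.
TOKENS = ("ClaudeBot", "anthropic-ai", "Claude-Web")


def count_anthropic_hits(log_lines) -> Counter:
    """Count log lines that mention each token (case-insensitive)."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # attribute each line to at most one token
    return counts


if __name__ == "__main__":
    # Hypothetical log lines; real formats and user-agent strings vary.
    sample = [
        '203.0.113.7 - - [17/May/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
        '203.0.113.8 - - [17/May/2026:10:00:05 +0000] "GET /a HTTP/1.1" 200 128 "-" "anthropic-ai"',
        '198.51.100.2 - - [17/May/2026:10:00:09 +0000] "GET /b HTTP/1.1" 200 256 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    ]
    print(count_anthropic_hits(sample))
```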
Does ClaudeBot honor robots.txt?
Yes. Anthropic publicly documents that ClaudeBot honors robots.txt rules and the User-agent: ClaudeBot block in particular. The bot reads the file before each crawl session and respects both Allow and Disallow directives. If you see ClaudeBot fetching a path you blocked, file a report with Anthropic — it's an outlier worth investigating.
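Before filing a report, it helps to have concrete evidence. The sketch below cross-checks access-log requests against your robots.txt and lists the paths a ClaudeBot-identified request fetched despite a Disallow rule. It assumes the common "combined" log format. The regular expression, the function name, and the sample data are illustrative assumptions. It uses Python's `urllib.robotparser`, whose rule matching may differ in edge cases from the crawler's own parser.

```python
import re
from urllib.robotparser import RobotFileParser

# Assumes the "combined" access-log format: request line, status, size,
# referrer, then the user-agent as the final quoted field.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def find_violations(robots_txt: str, log_lines, token: str = "ClaudeBot") -> list:
    """Return paths fetched by `token` that robots.txt disallows for it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    violations = []
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None or token.lower() not in match["ua"].lower():
            continue
        path = match["path"]
        if path == "/robots.txt":
            continue  # crawlers must fetch this file to read the rules
        if not parser.can_fetch(token, path):
            violations.append(path)
    return violations


if __name__ == "__main__":
    robots = "User-agent: ClaudeBot\nDisallow: /private/\n"
    logs = [  # hypothetical log lines
        '203.0.113.7 - - [17/May/2026:10:00:00 +0000] "GET /private/report HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
        '203.0.113.7 - - [17/May/2026:10:00:02 +0000] "GET /blog/post HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    ]
    print(find_violations(robots, logs))
```

A hit here is a starting point, not proof: the request may come from a client spoofing the user-agent, or from a crawl that predates your rule change.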
Common mistakes
- Blocking only anthropic-ai and assuming Claude can't reach your site. The current bot identifies as ClaudeBot, so block that too.
- Allowing ClaudeBot but missing User-Agent: ChatGPT-User / OAI-SearchBot / PerplexityBot. Different vendors, different decisions.
- Order in robots.txt. Most parsers read the most-specific User-agent block first, but a few read top-to-bottom; put named-bot blocks above the wildcard.
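Two of the mistakes above can be caught mechanically. The following is a rough lint sketch, not a full robots.txt parser: it only reads `User-agent` lines in file order, flags a file that names anthropic-ai without ClaudeBot, and flags named bots that appear after the wildcard group. The function name `lint_robots` and the warning wording are hypothetical.

```python
def lint_robots(robots_txt: str) -> list[str]:
    """Return warnings for two common ClaudeBot-related robots.txt mistakes."""
    agents = []  # user-agent values in the order they appear in the file
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            agents.append(value.strip())

    lowered = [agent.lower() for agent in agents]
    warnings = []

    # Mistake 1: legacy name blocked, current name forgotten.
    if "anthropic-ai" in lowered and "claudebot" not in lowered:
        warnings.append("anthropic-ai is listed but ClaudeBot is not")

    # Mistake 3: named-bot groups placed below the wildcard group.
    if "*" in lowered:
        first_star = lowered.index("*")
        late = [agent for agent in agents[first_star + 1:] if agent != "*"]
        if late:
            warnings.append("named bots after the wildcard: " + ", ".join(late))

    return warnings


if __name__ == "__main__":
    example = "User-agent: *\nDisallow: /tmp/\n\nUser-agent: anthropic-ai\nDisallow: /\n"
    for warning in lint_robots(example):
        print("warning:", warning)
```

The second mistake in the list, forgetting other vendors' bots, is a policy decision rather than a syntax problem, so this sketch leaves it out.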
Check yours
Use the free AI crawler access checker — paste your URL, see exactly what ClaudeBot (and 11 other AI crawlers) see when they read your robots.txt.