Free tool
AI crawler access checker
See which AI bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, OAI-SearchBot, and more) your site allows or blocks. The checker fetches your live robots.txt and reports the status of each bot.
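A check of this kind can be approximated with Python's standard-library robots.txt parser. This is a minimal sketch, not the checker's actual implementation: the `AI_BOTS` list, the `check_ai_access` helper, and the `SAMPLE` rules are illustrative assumptions, and the sample is parsed from a string so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of AI user-agent tokens (assumption, not a complete list).
AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "Applebot-Extended", "CCBot",
]


def check_ai_access(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Return {bot: True if allowed to fetch `path`} for each token in AI_BOTS."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_BOTS}


# Hypothetical robots.txt used only for this example.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

report = check_ai_access(SAMPLE)
for bot, allowed in report.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

With the sample rules, GPTBot and CCBot are reported as blocked and the other tokens fall through to the wildcard group. To check a live site, `RobotFileParser.set_url(...)` followed by `read()` can replace the `parse` call. Python's parser matches user-agent names by substring, so its verdicts may differ slightly from how each crawler interprets the same file.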
Which AI bots matter?
GPTBot
OpenAI — Training data + ChatGPT
Among the most commonly blocked AI bots; allow it if you want your content in OpenAI's training sets.
OAI-SearchBot
OpenAI — ChatGPT search live fetch
Separate from GPTBot — controls inclusion in ChatGPT search results.
ChatGPT-User
OpenAI — User-triggered fetches
Fires when a ChatGPT user explicitly browses a URL.
ClaudeBot
Anthropic — Training data
Replaced the older anthropic-ai user-agent; Claude search and user-triggered fetches use the separate Claude-SearchBot and Claude-User agents.
Anthropic-AI
Anthropic — Legacy training crawler
Older name, still respected for backwards compatibility.
PerplexityBot
Perplexity — Live search index
Perplexity shows linked sources alongside its answers, making it one of the more citation-friendly AI engines.
Google-Extended
Google — Gemini training and grounding
A robots.txt token rather than a separate crawler; it controls use of your content for Gemini, not classic search indexing or AI Overviews, which follow Googlebot rules.
Applebot-Extended
Apple — Apple Intelligence training
Separate from Applebot, which still powers Siri and Spotlight search.
CCBot
Common Crawl — Open-data corpus
Feeds many smaller AI engines and academic models.
Bytespider
ByteDance — TikTok / Doubao training
Aggressive crawler; many sites block it by default.
Meta-ExternalAgent
Meta — Llama training
Newer crawler announced for Meta AI training.
Amazonbot
Amazon — Alexa + Rufus
Powers Alexa answers and Amazon Rufus.
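Because several of the tokens above are separate from their search counterparts, a site can opt out of training while staying visible in AI search. The sketch below illustrates that split under stated assumptions: the `POLICY` rules and the `allowed` helper are hypothetical, and the grouping of bots into "training" and "search" is one possible policy, not a recommendation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block training-oriented tokens, allow search-oriented ones.
POLICY = """\
User-agent: GPTBot
User-agent: Google-Extended
User-agent: Applebot-Extended
User-agent: CCBot
Disallow: /

User-agent: OAI-SearchBot
User-agent: PerplexityBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(POLICY.splitlines())


def allowed(bot: str, path: str = "/") -> bool:
    """True if `bot` may fetch `path` under POLICY."""
    return parser.can_fetch(bot, path)


# Training tokens are blocked; their search counterparts are unaffected.
for bot in ["GPTBot", "OAI-SearchBot", "Google-Extended", "Googlebot",
            "Applebot-Extended", "Applebot"]:
    print(f"{bot}: {'allowed' if allowed(bot) else 'blocked'}")
```

Here Googlebot and Applebot have no matching group, so the parser treats them as allowed, which mirrors the point that blocking the "-Extended" tokens leaves classic search crawling untouched.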
Want the full audit?
Crawlmind crawls every page on your site, validates schema, scores citation readiness, generates a draft llms.txt, and tracks which AI engines actually cite you.