Free tool

AI crawler access checker

See which AI bots your site allows or blocks, including GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, OAI-SearchBot and more. The checker pulls your live robots.txt and reports the status of each one.
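As a rough illustration of that kind of check, the sketch below parses a robots.txt file with Python's standard `urllib.robotparser` and reports whether each bot token from this page may fetch a given path. It is not this tool's actual implementation: the names `fetch_robots_txt`, `check_ai_bots` and `SAMPLE` are hypothetical, and the standard-library parser may interpret edge cases differently from the crawlers themselves.

```python
from typing import Dict
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the list on this page.
AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "Anthropic-AI",
    "PerplexityBot", "Google-Extended", "Applebot-Extended", "CCBot",
    "Bytespider", "Meta-ExternalAgent", "Amazonbot",
]


def fetch_robots_txt(site: str) -> str:
    """Download robots.txt from a site root such as 'https://example.com'."""
    request = Request(
        site.rstrip("/") + "/robots.txt",
        headers={"User-Agent": "robots-check-sketch"},  # hypothetical UA string
    )
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def check_ai_bots(robots_txt: str, path: str = "/") -> Dict[str, bool]:
    """Map each bot token to True (allowed) or False (blocked) for `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_BOTS}


# Assumed sample file: blocks GPTBot, allows everything else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for bot, allowed in check_ai_bots(SAMPLE).items():
        print(f"{bot:20} {'allowed' if allowed else 'blocked'}")
```

To run it against a live site, pass the result of `fetch_robots_txt("https://example.com")` to `check_ai_bots` in place of `SAMPLE`.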

Which AI bots matter?

GPTBot

OpenAI — Training data + ChatGPT

Among the most frequently blocked AI crawlers; allow it if you want your content included in OpenAI's training data.

OAI-SearchBot

OpenAI — ChatGPT search live fetch

Separate from GPTBot — controls inclusion in ChatGPT search results.

ChatGPT-User

OpenAI — User-triggered fetches

Fires when a ChatGPT user explicitly browses a URL.

ClaudeBot

Anthropic — Training + Claude search

Replaced the older anthropic-ai user-agent.

Anthropic-AI

Anthropic — Legacy training crawler

Older name, still respected for backwards compatibility.

PerplexityBot

Perplexity — Live search index

Perplexity is currently the most citation-friendly AI engine.

Google-Extended

Google — Gemini training + grounding

A robots.txt control token rather than a separate crawler; it governs AI use of your content, not classic search indexing, which Googlebot still controls.

Applebot-Extended

Apple — Apple Intelligence training

Separate from Applebot, which still controls Siri/Spotlight.

CCBot

Common Crawl — Open-data corpus

Feeds many smaller AI engines and academic models.

Bytespider

ByteDance — TikTok / Doubao training

Aggressive crawler; many sites choose to block it.

Meta-ExternalAgent

Meta — Llama training

Newer crawler announced for Meta AI training.

Amazonbot

Amazon — Alexa + Rufus

Powers Alexa answers and Amazon Rufus.
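Because each entry above is a separate user-agent token, a site can set a different rule for each one, for example blocking a training crawler while allowing a search crawler from the same vendor. The sketch below renders such a policy as robots.txt groups. The function name `build_robots_txt` and the sample policy are hypothetical, and the rules are site-wide only (no per-path rules).

```python
from typing import Dict


def build_robots_txt(policy: Dict[str, bool]) -> str:
    """Render one robots.txt group per bot token.

    True allows the whole site for that token; False blocks it.
    """
    groups = []
    for bot, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {bot}\n{rule}")
    return "\n\n".join(groups) + "\n"


# Hypothetical policy: opt out of training, stay visible in ChatGPT search.
POLICY = {
    "GPTBot": False,
    "OAI-SearchBot": True,
    "Google-Extended": False,
}

if __name__ == "__main__":
    print(build_robots_txt(POLICY), end="")
```

Tokens not named in the file fall back to whatever `User-agent: *` group the site already has, so the generated groups are meant to be merged into an existing robots.txt rather than replace it.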

Want the full audit?

Crawlmind crawls every page on your site, validates schema, scores citation readiness, generates a draft llms.txt, and tracks which AI engines actually cite you.