GPTBot: how to allow OpenAI to crawl your site

Updated 2026-05-17 · by the Crawlmind team

GPTBot is OpenAI's web crawler for training data. It fetches publicly available pages that may be used to train future GPT models. If you want your content to be eligible for inclusion in OpenAI's training corpus, allow GPTBot in your robots.txt. If you don't want that, block it. Visibility in ChatGPT search is a separate decision, controlled through the OAI-SearchBot user-agent.

Allow GPTBot

Add this block to your robots.txt (above the wildcard User-agent: * block):

User-agent: GPTBot
Allow: /

That's it. GPTBot will fetch pages it can reach via links from your sitemap and the open web.
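To sanity-check a rule like this before deploying it, you can parse the text locally. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the helper name `is_allowed` and the `example.com` URL are placeholders for illustration, and Python's parser is only an approximation of how GPTBot itself interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The allow block from this guide, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network call
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder; substitute a URL from your own site.
print(is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/blog/post"))  # True
```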

Block GPTBot

If you want to opt out of training data collection:

User-agent: GPTBot
Disallow: /

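Note that this only opts you out of crawling for training data. ChatGPT *search* uses a different user-agent (OAI-SearchBot), so blocking GPTBot alone does not remove you from ChatGPT search results. If you want to be invisible to both, block both.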
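The difference between blocking one bot and blocking both can be checked locally. This is a rough sketch with Python's standard-library `urllib.robotparser`; the helper `access_report` and the `example.com` URL are hypothetical, and the result reflects Python's parser rather than OpenAI's own.

```python
from urllib.robotparser import RobotFileParser

# Blocks training crawls only.
BLOCK_TRAINING_ONLY = """\
User-agent: GPTBot
Disallow: /
"""

# Blocks training crawls and ChatGPT search crawls.
BLOCK_BOTH = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /
"""

AGENTS = ["GPTBot", "OAI-SearchBot"]


def access_report(robots_txt, agents, url="https://example.com/"):
    """Map each user-agent to whether it may fetch `url` (placeholder URL)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


print(access_report(BLOCK_TRAINING_ONLY, AGENTS))
# {'GPTBot': False, 'OAI-SearchBot': True}
print(access_report(BLOCK_BOTH, AGENTS))
# {'GPTBot': False, 'OAI-SearchBot': False}
```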

GPTBot vs OAI-SearchBot vs ChatGPT-User

OpenAI publishes three distinct user-agents:

| User-agent | What it does | Block to opt out of |
| --- | --- | --- |
| GPTBot | Crawls public pages for training data | OpenAI training |
| OAI-SearchBot | Crawls + indexes pages for ChatGPT search | ChatGPT search results |
| ChatGPT-User | Fires when a user explicitly asks ChatGPT to fetch a URL | One-off browsing |

Most sites want to *allow* OAI-SearchBot (to be in ChatGPT search) while making an independent decision on GPTBot (training).
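Because each user-agent is an independent decision, it can help to keep the policy in one place and generate the robots.txt groups from it. The sketch below assumes a simple all-or-nothing policy per bot; the function name `render_robots_txt` and the example `POLICY` values are illustrative choices, not a recommendation for your site.

```python
def render_robots_txt(policy: dict[str, bool]) -> str:
    """Render one robots.txt group per user-agent.

    `policy` maps a user-agent token to True (allow everything)
    or False (disallow everything).
    """
    groups = []
    for agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}")
    # Groups are separated by a blank line, as in the examples in this guide.
    return "\n\n".join(groups) + "\n"


# Example policy (an assumption): stay in ChatGPT search and allow
# user-triggered fetches, but opt out of training.
POLICY = {
    "OAI-SearchBot": True,
    "ChatGPT-User": True,
    "GPTBot": False,
}

print(render_robots_txt(POLICY))
```

Append the rendered groups to your existing robots.txt rather than replacing it, so any rules for other crawlers stay in place.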

Common mistakes

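**Blocking GPTBot but expecting to be in ChatGPT search.** Different bots, different decisions: GPTBot controls training, OAI-SearchBot controls ChatGPT search.
**Allowing GPTBot but blocking everything in User-agent: * first.** Parsers that follow the Robots Exclusion Protocol (RFC 9309) apply the most specific matching group regardless of order, but some parsers read the file top to bottom. Put the GPTBot block above the wildcard to be safe.
**Forgetting to block specific paths.** Even if GPTBot is allowed, you can add Disallow: /admin/ to the GPTBot block to keep it out of areas you don't want crawled, such as admin or staging paths.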
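The last two mistakes can be tested together. The sketch below uses Python's standard-library `urllib.robotparser` on a hypothetical robots.txt that blocks every crawler by default but gives GPTBot its own group with one excluded path. One caveat: Python's parser applies the first matching rule within a group rather than the longest match, so this example omits `Allow: /` and relies on paths being allowed unless a rule disallows them. The URLs and the `SomeOtherBot` name are placeholders.

```python
from urllib.robotparser import RobotFileParser

# GPTBot gets its own group, placed above the wildcard group.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /admin/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# GPTBot matches its own group, so the wildcard block does not apply to it.
gptbot_blog = parser.can_fetch("GPTBot", "https://example.com/blog/")
# The path-specific rule still keeps GPTBot out of /admin/.
gptbot_admin = parser.can_fetch("GPTBot", "https://example.com/admin/users")
# A crawler without its own group falls back to the wildcard group.
other_blog = parser.can_fetch("SomeOtherBot", "https://example.com/blog/")

print(gptbot_blog, gptbot_admin, other_blog)  # True False False
```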

Check yours

Use the free AI crawler access checker: paste your URL and see exactly what GPTBot and 11 other AI crawlers see when they read your robots.txt.

See how your site scores

Run a free Crawlmind audit: get every page graded on the rules in this guide.
