CCBot
Updated 2026-05-17
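CCBot is the crawler for Common Crawl, a nonprofit that maintains an open, freely available archive of web crawl data. That archive feeds the training corpora of many AI models, including smaller AI engines and academic models. Blocking CCBot reduces (but doesn't eliminate) inclusion in derivative AI training datasets: pages captured before the block remain in existing snapshots, and other crawlers may still collect the same content.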
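As a rough illustration of what blocking looks like in practice, the sketch below checks a robots.txt policy against the `CCBot` user-agent token using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the `example.com` URL are assumptions made for the example, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow CCBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page"  # placeholder URL
    print("CCBot allowed:", is_allowed(ROBOTS_TXT, "CCBot", page))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

This only tests the policy as written. robots.txt is advisory, so it affects crawlers that choose to honor it and does nothing about copies already archived.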