Training crawlers
User agent + IP range
Common Crawl
CCBot
About this crawler
CCBot is tracked as a public-content crawler from Common Crawl.
Usually means: The crawler is collecting public web content that may be used for model training, datasets, or product improvement.
User agent
CCBot
Faurya matches this as a user-agent token, not a full browser string. Providers may add surrounding version or product text.
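As a rough illustration of token matching, the sketch below checks whether a request's user-agent string contains the CCBot token anywhere, so surrounding version or product text does not matter. The function name matches_token is hypothetical, and the case-insensitive comparison is an assumption rather than documented Faurya behavior.

```python
def matches_token(user_agent, token="CCBot"):
    """Return True if the user-agent string contains the crawler token.

    Hypothetical helper: the case-insensitive comparison is an assumption.
    """
    if not user_agent:
        # A missing or empty header can never match a token.
        return False
    return token.lower() in user_agent.lower()


# Illustrative strings only; real requests may carry different version text.
print(matches_token("CCBot/2.0 (https://commoncrawl.org/faq/)"))
print(matches_token("Mozilla/5.0 (X11; Linux x86_64)"))
```

A substring check like this is easy to spoof, which is why a token match on its own is not treated as verification.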
IP verification
Verified when user agent and IP range both match
Faurya first matches the crawler user-agent token, then compares the request IP with published ranges from the crawler operator when those ranges exist.
User agent
Request user agent contains CCBot.
Request IP
Server-side tracking forwards the crawler IP seen by your backend.
Range check
The IP is compared with published CIDR ranges or JSON range endpoints.
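The three steps above can be sketched as a single check, assuming the published ranges have already been fetched and parsed into a list of CIDR strings. The function names, the status strings, and the example ranges are all hypothetical; the ranges shown are reserved documentation blocks, not Common Crawl's actual addresses.

```python
import ipaddress

# Placeholder ranges reserved for documentation (RFC 5737 and RFC 3849).
# These are NOT Common Crawl's real ranges; load the operator's published list instead.
EXAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(request_ip, cidr_ranges):
    """Return True if request_ip falls inside any of the CIDR ranges."""
    try:
        ip = ipaddress.ip_address(request_ip)
    except ValueError:
        # Malformed addresses are treated as not matching.
        return False
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses of the same IP version.
        if ip.version == network.version and ip in network:
            return True
    return False


def verify_crawler(user_agent, request_ip, cidr_ranges, token="CCBot"):
    """Classify a request using hypothetical status names.

    Step 1: the user agent must contain the token (case-insensitive, an assumption).
    Step 2: request_ip is the address seen by the backend.
    Step 3: the address is compared with the published ranges, when any exist.
    """
    if token.lower() not in (user_agent or "").lower():
        return "no_match"
    if not cidr_ranges:
        # No published ranges, so only the user agent can be checked.
        return "user_agent_only"
    if ip_in_ranges(request_ip, cidr_ranges):
        return "verified"
    return "unverified"


print(verify_crawler("CCBot/2.0", "192.0.2.10", EXAMPLE_RANGES))
print(verify_crawler("CCBot/2.0", "198.51.100.7", EXAMPLE_RANGES))
```

Behind a proxy or load balancer, the address passed as request_ip should be the original client address as your backend resolves it, since the range check is only meaningful against the crawler's own IP.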