Enter any domain to instantly see whether its robots.txt allows or blocks the major AI crawlers — GPTBot, ClaudeBot, Google-Extended, PerplexityBot, CCBot and 15+ more.
We fetch the domain's public robots.txt and evaluate the rules for each known AI crawler. Nothing is stored.
The checker fetches the public /robots.txt of the domain you enter and resolves the directives for each known AI user-agent against the homepage path (/). It covers training crawlers, answer-engine crawlers, training opt-out tokens, and user-triggered fetchers:
Training crawlers: GPTBot, ClaudeBot, CCBot, Bytespider, Meta-ExternalAgent, cohere-ai
Answer-engine crawlers: OAI-SearchBot, PerplexityBot, YouBot, Googlebot, Amazonbot
Training opt-out tokens: Google-Extended, Applebot-Extended
User-triggered fetchers: ChatGPT-User, Claude-Web, Perplexity-User
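The fetch-and-evaluate step described above can be approximated with Python's standard library. This is only a minimal sketch, not the checker's actual implementation: the helper names fetch_robots_txt and check_ai_agents, the shortened AI_AGENTS list, and the SAMPLE file are all hypothetical, and treating a 4xx response as "no restrictions" is an assumption. Note also that urllib.robotparser applies the first matching rule in file order, whereas RFC 9309 specifies longest-match precedence, so results on complex files may differ from a stricter parser.

```python
from urllib.error import HTTPError
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# Illustrative subset of the user-agent tokens listed above.
AI_AGENTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "PerplexityBot"]


def fetch_robots_txt(domain: str, timeout: float = 10.0) -> str:
    """Download https://<domain>/robots.txt (hypothetical helper)."""
    try:
        with urlopen(f"https://{domain}/robots.txt", timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        if 400 <= err.code < 500:
            return ""  # assumption: a missing file means no restrictions
        raise


def check_ai_agents(robots_txt: str, agents=AI_AGENTS, path: str = "/") -> dict:
    """Return {agent: "allowed" | "blocked"} for the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        agent: "allowed" if parser.can_fetch(agent, path) else "blocked"
        for agent in agents
    }


# Made-up robots.txt used so the example runs without network access.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Disallow: /private/
"""

result = check_ai_agents(SAMPLE)
for agent, verdict in result.items():
    print(f"{agent}: {verdict}")
```

With this sample, GPTBot and Google-Extended are reported as blocked, while the other agents fall through to the wildcard group, which only disallows /private/ and so leaves the homepage allowed. To check a live site, pass the output of fetch_robots_txt("example.com") to check_ai_agents instead of SAMPLE.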
Important: a "blocked" result means the site's robots.txt asks that bot not to crawl — it doesn't guarantee the bot obeys. robots.txt is a signal; real enforcement lives at the WAF/CDN. Read why in robots.txt vs llms.txt vs ai.txt.