Free tool

robots.txt AI Crawler Checker

Enter any domain and see which AI crawlers may access it — and whether a careless robots.txt rule is silently keeping it out of ChatGPT, Perplexity and Claude answers.

Search bots vs. training bots

The single most common GEO (generative engine optimization) mistake we see is a blanket "block all AI bots" rule, added in 2023 to keep content out of training data, that today also blocks the search crawlers that would cite the site in AI answers. Blocking GPTBot is a policy choice. Blocking OAI-SearchBot or PerplexityBot is self-inflicted invisibility.

AI crawler FAQ

Should I block AI crawlers in robots.txt?

Distinguish by purpose. Search bots (OAI-SearchBot, PerplexityBot, Claude-SearchBot) fetch pages to cite them in AI answers; blocking those makes you invisible exactly where buyers increasingly ask. Training bots (GPTBot, CCBot, Bytespider) gather content used for model training; blocking them is a legitimate choice that doesn't hurt visibility in AI answers.
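As a rough illustration of that split, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content, the example.com URL and the helper names crawler_access and blocked_search_bots are assumptions made up for this example, not the checker's actual implementation. Note too that the standard-library parser matches user agents by simple substring, so real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Bot names and their grouping come from the text above.
SEARCH_BOTS = ["OAI-SearchBot", "PerplexityBot", "Claude-SearchBot"]
TRAINING_BOTS = ["GPTBot", "CCBot", "Bytespider"]

# Hypothetical policy: opt out of training, stay reachable for AI search.
SELECTIVE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: CCBot
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical blanket rule that shuts out every crawler, search bots included.
BLANKET_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_access(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Return, per bot, whether the given robots.txt text lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in SEARCH_BOTS + TRAINING_BOTS}


def blocked_search_bots(robots_txt: str) -> list[str]:
    """List the search bots that the robots.txt text keeps out."""
    access = crawler_access(robots_txt)
    return [bot for bot in SEARCH_BOTS if not access[bot]]


if __name__ == "__main__":
    print("selective:", blocked_search_bots(SELECTIVE_ROBOTS_TXT))
    print("blanket:  ", blocked_search_bots(BLANKET_ROBOTS_TXT))
```

With the selective policy no search bot should be reported as blocked, while the blanket rule should report all three.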

What's the difference between GPTBot and OAI-SearchBot?

Both belong to OpenAI. GPTBot collects content for model training. OAI-SearchBot powers ChatGPT's web search, so whether it can fetch your pages determines whether they can appear as cited sources in ChatGPT answers. Many sites block GPTBot for policy reasons but should keep OAI-SearchBot allowed.

Does blocking an AI bot in robots.txt actually work?

Major vendors (OpenAI, Anthropic, Google, and Perplexity for its crawler) document robots.txt compliance, and several publish IP ranges for verification. Some bots, notably Bytespider, are widely reported to ignore robots.txt; blocking those requires firewall rules instead.
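Where a vendor does publish IP ranges, a request claiming to be its bot can be checked against them. The sketch below uses Python's ipaddress module. The bot name ExampleBot, the PUBLISHED_RANGES table and the is_verified_crawler helper are hypothetical, and the CIDR blocks are reserved documentation ranges rather than any vendor's real addresses. In practice you would load the ranges from the vendor's own published list.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks (RFC 5737, RFC 3849).
# These are NOT real crawler ranges; replace them with the vendor's published list.
PUBLISHED_RANGES = {
    "ExampleBot": ["192.0.2.0/24", "2001:db8::/32"],
}


def is_verified_crawler(bot: str, remote_ip: str) -> bool:
    """Return True if `remote_ip` falls inside a range published for `bot`."""
    ip = ipaddress.ip_address(remote_ip)
    networks = (ipaddress.ip_network(cidr) for cidr in PUBLISHED_RANGES.get(bot, []))
    # Membership is False when the address and network differ in IP version.
    return any(ip in network for network in networks)


if __name__ == "__main__":
    print(is_verified_crawler("ExampleBot", "192.0.2.17"))
    print(is_verified_crawler("ExampleBot", "198.51.100.5"))
```

A request whose user agent names a known bot but whose address falls outside the published ranges is likely spoofed, and it can be handled at the firewall rather than through robots.txt.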

This check is one of 40+ rules in HejGeo's full site audit — combined with tracking of what AI assistants actually say about you. Run a free audit →