Free Tool · LLM Visibility

Instantly check whether Google's bots and AI crawlers can access your website.

Paste any URL. The tool fetches your robots.txt and the page itself, then applies the same RFC 9309 matching rules that GPTBot, ClaudeBot, PerplexityBot, Googlebot, and other compliant crawlers follow. When a bot is blocked, you see the exact rule and line number that blocked it.

10 requests per minute per IP · Always fresh, no caching · Fail-closed on 5xx robots.txt

Enter a domain or a URL with a path, e.g. avinashvagh.com or avinashvagh.com/blog.

How the check works

  1. Fetch /robots.txt from the URL’s origin and parse it per RFC 9309 (groups, longest-match, Allow wins ties).
  2. For each bot in the catalog, pick the most specific User-agent group, then run the URL path against its Allow / Disallow rules.
  3. If the bot is allowed, fetch the page once and inspect the X-Robots-Tag response header and <meta name="robots"> tag for noindex directives.
  4. Group results into AI Training, AI Search, Search Engines, and Social Preview so you can act on what matters for visibility.
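
Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration of RFC 9309 longest-match evaluation, not the tool's actual code. The names pattern_to_regex, pick_group, and is_allowed are hypothetical, and the "exact token, else *" group lookup is an assumption about what "most specific" means here.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern: '*' matches any run of
    characters, and a trailing '$' anchors the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def pick_group(groups: dict[str, list], bot_token: str) -> list:
    """groups maps lowercase user-agent tokens to rule lists.
    Assumption: use the bot's own group if present, else the '*' group."""
    return groups.get(bot_token.lower(), groups.get("*", []))

def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """rules holds (directive, pattern) pairs from the chosen group."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:  # an empty Allow/Disallow value matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            # The longest pattern wins; on equal length, Allow wins the tie.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

groups = {
    "*": [("Disallow", "/blog/")],
    "gptbot": [("Disallow", "/blog/"), ("Allow", "/blog/public/")],
}
print(is_allowed(pick_group(groups, "GPTBot"), "/blog/public/post"))  # True
print(is_allowed(pick_group(groups, "GPTBot"), "/blog/draft"))        # False
```

Pattern length stands in for specificity, so /blog/public/ outranks /blog/ no matter which rule comes first in the file.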

What this report tells you

  • Whether GPTBot, ClaudeBot, and PerplexityBot can train on or cite your content.
  • Whether you’ve accidentally blocked Googlebot, Bingbot, or Applebot with a wildcard rule.
  • Whether a hidden X-Robots-Tag noindex header is silently keeping the page out of indexes.
  • The exact line in robots.txt that triggered each block — so you can fix it in seconds.

Bots we check

Eight categories covering search engines, AI training and search bots, social previews, SEO crawlers, scrapers, cloud-provider AI grounding, Google’s specialized bots, and archive/research agents.

Search Engines (13)

Traditional search engine crawlers. Blocking these removes you from organic search.

  • Googlebot
  • Bingbot
  • adidxbot
  • BingPreview
  • DuckDuckBot
  • YandexBot
  • Baiduspider
  • Applebot
  • Slurp
  • PetalBot
  • SeznamBot
  • MojeekBot
  • Amzn-SearchBot

AI Bots (21)

AI training and live-retrieval bots. Allow the retrieval bots if you want your content cited in AI answers; block the training bots to keep your content out of model training.

  • GPTBot
  • OAI-SearchBot
  • ChatGPT-User
  • ClaudeBot
  • Claude-User
  • Claude-SearchBot
  • anthropic-ai
  • PerplexityBot
  • Perplexity-User
  • Google-Extended
  • Applebot-Extended
  • Meta-ExternalAgent
  • cohere-ai
  • MistralAI-User
  • YouBot
  • DuckAssistBot
  • AI2Bot
  • DeepSeekBot
  • PanguBot
  • Grok
  • img2dataset

Social Bots (8)

Generate link previews on social networks and chat apps when your URL is shared.

  • facebookexternalhit
  • FacebookBot
  • Meta-ExternalFetcher
  • Twitterbot
  • LinkedInBot
  • Slackbot
  • Pinterestbot
  • Quora-Bot

SEO Tools (12)

Third-party SEO crawlers and audit tools used by competitors and agencies.

  • AhrefsBot
  • SemrushBot
  • MJ12bot
  • DotBot
  • DataForSeoBot
  • BLEXBot
  • SearchmetricsBot
  • AwarioRssBot
  • AwarioSmartBot
  • Screaming Frog SEO Spider
  • Chrome-Lighthouse
  • Google Page Speed Insights

Scrapers (12)

Aggregators, datasets, and general-purpose scraping frameworks.

  • CCBot
  • Bytespider
  • Diffbot
  • FriendlyCrawler
  • ImagesiftBot
  • Scrapy
  • news-please
  • omgilibot
  • magpie-crawler
  • Firecrawl
  • Crawl4AI
  • NovaAct

Cloud Services (4)

Cloud-native AI grounding and security crawlers from major cloud providers.

  • Amazonbot
  • Amzn-User
  • Google-CloudVertexBot
  • AliyunSecBot

Google Bots (9)

Google's specialized crawlers: images, video, news, inspection, and AI Overviews.

  • Googlebot-Image
  • Googlebot-Video
  • Googlebot-News
  • Storebot-Google
  • Google-InspectionTool
  • GoogleOther
  • GoogleOther-Image
  • GoogleOther-Video
  • Gemini-Deep-Research

Other Agents (6)

Archives, plagiarism checkers, and niche crawlers that don't fit the other buckets.

  • archive.org_bot
  • ia_archiver
  • TurnitinBot
  • Brandwatch
  • Meltwater
  • ISSCyberRiskCrawler

FAQ

How can I tell if GPTBot or ClaudeBot is blocked on my site?

Paste any URL above. The tool fetches your origin’s /robots.txt, parses it per RFC 9309, and runs the URL path against the rules for each bot’s product token. If a Disallow rule wins the match, the report shows the exact line and the User-agent group it belongs to, so you know which rule to edit.
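
One way to report the exact line is to keep line numbers while parsing. The sketch below is illustrative only: parse_groups and winning_rule are hypothetical names, and matching is simplified to plain path prefixes (no * or $ wildcards).

```python
def parse_groups(text: str) -> dict[str, list[tuple[int, str, str]]]:
    """Map each lowercase user-agent token to (line_no, directive, value) rules."""
    groups: dict[str, list[tuple[int, str, str]]] = {}
    current: list[str] = []   # tokens of the group being read
    reading_agents = False    # True while consecutive User-agent lines continue
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if not reading_agents:
                current = []  # a rule line closed the previous group
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
            reading_agents = True
        elif key in ("allow", "disallow"):
            reading_agents = False
            for token in current:
                groups[token].append((line_no, key, value))
    return groups

def winning_rule(rules, path):
    """Return the (line_no, directive, value) that decides the path, or None."""
    best = None
    for rule in rules:
        _, directive, value = rule
        if value and path.startswith(value):
            longer = best is None or len(value) > len(best[2])
            tie_allow = best is not None and len(value) == len(best[2]) and directive == "allow"
            if longer or tie_allow:
                best = rule
    return best

robots = "User-agent: *\nDisallow: /private/\n\nUser-agent: GPTBot\nDisallow: /\nAllow: /blog/\n"
print(winning_rule(parse_groups(robots)["gptbot"], "/about"))  # (5, 'disallow', '/')
```

Here line 5 is the rule to edit: it sits in the GPTBot group and blocks every path that the Allow on line 6 does not cover.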

What’s the difference between Google-Extended and Googlebot?

Googlebot crawls for Google Search results. Google-Extended is a separate robots.txt product token, not a crawler with its own user-agent string; it controls whether your content is used for Gemini training and grounded AI answers. You can allow one and block the other, and this tool shows both verdicts side by side.
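
As a small illustration, the snippet below feeds a two-group robots.txt to Python's standard urllib.robotparser and checks both tokens. The robots.txt content and the example.com URL are made up for the example, and urllib.robotparser does not implement every RFC 9309 rule, so treat it as a rough check rather than a reference implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: stay in Google Search, opt out of Gemini usage.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

verdicts = {
    bot: parser.can_fetch(bot, "https://example.com/blog")
    for bot in ("Googlebot", "Google-Extended")
}
print(verdicts)  # {'Googlebot': True, 'Google-Extended': False}
```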

Does the tool also check meta robots and X-Robots-Tag?

Yes. Once a bot passes the robots.txt check, the tool fetches the page and inspects the X-Robots-Tag response header (including per-bot scoped directives like googlebot: noindex) and the page’s <meta name="robots"> tag. If a bot is allowed by robots.txt but blocked by a header or meta tag, you’ll see that flagged.
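
A rough sketch of that second check, using only the Python standard library. The function and class names are hypothetical, the handling of "none" as equivalent to noindex follows Google's documented directives, and the scope detection is a simplification rather than the tool's real parser.

```python
from html.parser import HTMLParser

NOINDEX = {"noindex", "none"}  # "none" is shorthand for noindex, nofollow
# Directives that use their own "name: value" form and are not bot scopes.
VALUED_DIRECTIVES = {"unavailable_after", "max-snippet",
                     "max-image-preview", "max-video-preview"}

def header_blocks_indexing(header_values: list[str], bot: str) -> bool:
    """True if any X-Robots-Tag value applies noindex to this bot."""
    for value in header_values:
        scope, sep, rest = value.partition(":")
        token = scope.strip().lower()
        # A single token before the first colon is treated as a bot scope.
        scoped = (bool(sep) and "," not in token and " " not in token
                  and token not in VALUED_DIRECTIVES)
        if scoped:
            if token != bot.lower():
                continue  # directive is scoped to a different bot
            value = rest
        directives = {d.strip().lower() for d in value.split(",")}
        if directives & NOINDEX:
            return True
    return False

class MetaRobots(HTMLParser):
    """Collects directives from every <meta name="robots"> tag."""
    def __init__(self):
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives |= {d.strip().lower() for d in content.split(",")}

def meta_blocks_indexing(html: str) -> bool:
    parser = MetaRobots()
    parser.feed(html)
    return bool(parser.directives & NOINDEX)

print(header_blocks_indexing(["googlebot: noindex"], "Googlebot"))  # True
print(header_blocks_indexing(["googlebot: noindex"], "GPTBot"))     # False
```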

Why does the report say a bot is blocked when my robots.txt looks fine?

Three common causes: a wildcard Disallow under User-agent: * that catches more URLs than intended, a more-specific group that overrides your default Allow, or a 5xx error on /robots.txt itself. RFC 9309 tells crawlers to treat an unreachable robots.txt as a complete disallow, and this tool fails closed the same way. The report surfaces the exact rule and line number so you can pinpoint the cause.
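
The fail-closed behavior can be pictured as a small status-code mapping. This is a sketch under assumptions: robots_policy is a hypothetical name, redirects are assumed to be followed by the HTTP client beforehand, and the 4xx branch follows RFC 9309 rather than anything this page states about the tool.

```python
def robots_policy(status: int) -> str:
    """Map the HTTP status of /robots.txt to a crawl policy."""
    if 200 <= status < 300:
        return "parse"         # evaluate the rules in the response body
    if 400 <= status < 500:
        return "allow_all"     # RFC 9309: file unavailable, crawling may proceed
    if 500 <= status < 600:
        return "disallow_all"  # RFC 9309: file unreachable, assume complete disallow
    return "disallow_all"      # assumption: fail closed on anything unexpected

print(robots_policy(503))  # disallow_all
```

A robots.txt that looks fine in your editor can still produce a site-wide block if the server answers the crawler with a 5xx.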