Free Tool · LLM Visibility

Instantly check if Google bots and AI crawlers can access your website.

Paste any URL. The tool fetches your robots.txt and the page itself, then runs the exact decision path that GPTBot, ClaudeBot, PerplexityBot, Googlebot, and others would follow. When a bot is blocked, you see the exact rule and line number that did it.

10 requests per minute per IP · Always fresh, no caching · Fail-closed on 5xx robots.txt

Enter a domain or path — e.g. avinashvagh.com or avinashvagh.com/blog.

How the check works

Fetch /robots.txt from the URL’s origin and parse it per RFC 9309 (groups, longest-match, Allow wins ties).
For each bot in the catalog, pick the most specific User-agent group, then run the URL path against its Allow / Disallow rules.
If the bot is allowed, fetch the page once and inspect the X-Robots-Tag response header and <meta name="robots"> tag for noindex directives.
Group results into AI Training, AI Search, Search Engines, and Social Preview so you can act on what matters for visibility.
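
The first two steps above can be sketched in Python. This is a minimal illustration of RFC 9309 matching, not the tool's actual code: the names Group, parse_robots, rules_for, pattern_matches, and check are hypothetical, and group selection here is an exact, case-insensitive product-token match with a fallback to the * group.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Group:
    agents: list = field(default_factory=list)  # lower-cased product tokens
    rules: list = field(default_factory=list)   # (kind, pattern, line_no)


def parse_robots(text):
    """Split robots.txt into groups, remembering each rule's line number."""
    groups, current, in_agent_run = [], None, False
    for line_no, raw in enumerate(text.splitlines(), start=1):
        key, sep, value = raw.split("#", 1)[0].partition(":")
        key, value = key.strip().lower(), value.strip()
        if not sep:
            continue
        if key == "user-agent":
            if not in_agent_run:  # consecutive User-agent lines share a group
                current = Group()
                groups.append(current)
            current.agents.append(value.lower())
            in_agent_run = True
        elif key in ("allow", "disallow") and current is not None:
            in_agent_run = False
            if value:  # an empty pattern matches nothing
                current.rules.append((key, value, line_no))
    return groups


def rules_for(groups, bot):
    """Rules from groups naming the bot; the * groups only if none name it."""
    token = bot.lower()
    if any(token in g.agents for g in groups):
        return [r for g in groups if token in g.agents for r in g.rules]
    return [r for g in groups if "*" in g.agents for r in g.rules]


def pattern_matches(pattern, path):
    """Prefix match supporting the * wildcard and the $ end anchor."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.match(regex + ("$" if anchored else ""), path) is not None


def check(groups, bot, path):
    """Return (allowed, line_no). Longest pattern wins; Allow wins ties."""
    best = None  # (pattern length, is_allow, line_no)
    for kind, pattern, line_no in rules_for(groups, bot):
        if pattern_matches(pattern, path):
            candidate = (len(pattern), kind == "allow", line_no)
            if best is None or candidate[:2] > best[:2]:
                best = candidate
    return (True, None) if best is None else (best[1], best[2])


ROBOTS = """\
User-agent: *
Disallow: /private/
Allow: /private/press/

User-agent: GPTBot
Disallow: /
"""

groups = parse_robots(ROBOTS)
print(check(groups, "GPTBot", "/blog"))                     # (False, 6)
print(check(groups, "ClaudeBot", "/private/press/launch"))  # (True, 3)
print(check(groups, "ClaudeBot", "/private/notes"))         # (False, 2)
```

Returning the line number alongside the verdict is what lets a report point at the rule that caused a block.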

What this report tells you

Whether GPTBot, ClaudeBot, and PerplexityBot can train on or cite your content.
Whether you’ve accidentally blocked Googlebot, Bingbot, or Applebot with a wildcard rule.
Whether a hidden X-Robots-Tag noindex header is silently keeping the page out of indexes.
The exact line in robots.txt that triggered each block — so you can fix it in seconds.

Bots we check

Eight categories covering search engines, AI training and search bots, social previews, SEO crawlers, scrapers, cloud-provider AI grounding, Google’s specialized bots, and archive/research agents.

Search Engines (13)

Traditional search engine crawlers. Blocking these removes you from organic search.

Googlebot
Bingbot
adidxbot
BingPreview
DuckDuckBot
YandexBot
Baiduspider
Applebot
Slurp
PetalBot
SeznamBot
MojeekBot
Amzn-SearchBot

AI Bots (21)

AI training and live-retrieval bots. Allow these to be cited in AI answers; block them to keep your content out of model training.

GPTBot
OAI-SearchBot
ChatGPT-User
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
PerplexityBot
Perplexity-User
Google-Extended
Applebot-Extended
Meta-ExternalAgent
cohere-ai
MistralAI-User
YouBot
DuckAssistBot
AI2Bot
DeepSeekBot
PanguBot
Grok
img2dataset

Social Bots (8)

Generate link previews on social networks and chat apps when your URL is shared.

facebookexternalhit
FacebookBot
Meta-ExternalFetcher
Twitterbot
LinkedInBot
Slackbot
Pinterestbot
Quora-Bot

SEO Tools (12)

Third-party SEO crawlers and audit tools used by competitors and agencies.

AhrefsBot
SemrushBot
MJ12bot
DotBot
DataForSeoBot
BLEXBot
SearchmetricsBot
AwarioRssBot
AwarioSmartBot
Screaming Frog SEO Spider
Chrome-Lighthouse
Google Page Speed Insights

Scrapers (12)

Aggregators, datasets, and general-purpose scraping frameworks.

CCBot
Bytespider
Diffbot
FriendlyCrawler
ImagesiftBot
Scrapy
news-please
omgilibot
magpie-crawler
Firecrawl
Crawl4AI
NovaAct

Cloud Services (4)

Cloud-native AI grounding and security crawlers from major cloud providers.

Amazonbot
Amzn-User
Google-CloudVertexBot
AliyunSecBot

Google Bots (9)

Google's specialized crawlers: images, video, news, inspection, and AI Overviews.

Googlebot-Image
Googlebot-Video
Googlebot-News
Storebot-Google
Google-InspectionTool
GoogleOther
GoogleOther-Image
GoogleOther-Video
Gemini-Deep-Research

Other Agents (6)

Archives, plagiarism checkers, and niche crawlers that don't fit the other buckets.

archive.org_bot
ia_archiver
TurnitinBot
Brandwatch
Meltwater
ISSCyberRiskCrawler

FAQ

How can I tell if GPTBot or ClaudeBot is blocked on my site?

Paste any URL above. The tool fetches your origin’s /robots.txt, parses it per RFC 9309, and runs the URL path against each bot’s product token. If a Disallow rule matches, the report shows the exact line and the User-agent group it belongs to, so you know which rule to edit.

What’s the difference between Google-Extended and Googlebot?

Googlebot crawls for Google Search results. Google-Extended is a robots.txt product token rather than a separate crawler: it opts your content in or out of Gemini training and grounded AI answers. You can allow one and block the other, and this tool shows both verdicts side by side.
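
As an illustration only, a robots.txt along these lines keeps a site in Google Search while opting out of Google-Extended. The blanket / paths are an assumption for the example; adjust them to the sections you actually want covered.

```text
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /
```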

Does the tool also check meta robots and X-Robots-Tag?

Yes. After the robots.txt check, the tool fetches the page and inspects the X-Robots-Tag response header (including per-bot scoped directives like googlebot: noindex) and the page’s <meta name="robots"> tag. If a bot is allowed by robots.txt but blocked from indexing by a header or meta tag, the report flags it.
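
A rough Python sketch of that second check follows. It is not the tool's implementation: header_blocks, meta_blocks, and RobotsMeta are hypothetical names, and the sketch assumes that only noindex and none count as blocking directives.

```python
from html.parser import HTMLParser

# Directives that carry their own "name: value" colon, so they are not bot scopes.
VALUED_DIRECTIVES = {"unavailable_after", "max-snippet",
                     "max-image-preview", "max-video-preview"}
BLOCKING = {"noindex", "none"}


def header_blocks(header_values, bot):
    """True if any X-Robots-Tag value applying to `bot` carries a noindex."""
    token = bot.lower()
    for value in header_values:
        scope, sep, rest = value.partition(":")
        name = scope.strip().lower()
        scoped = bool(sep) and "," not in name and name not in VALUED_DIRECTIVES
        if scoped:
            if name != token:
                continue  # directive is addressed to a different bot
            value = rest
        directives = {d.strip().lower() for d in value.split(",")}
        if directives & BLOCKING:
            return True
    return False


class RobotsMeta(HTMLParser):
    """Collects directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives |= {d.strip().lower() for d in content.split(",")}


def meta_blocks(html):
    """True if the page's robots meta tag contains a blocking directive."""
    parser = RobotsMeta()
    parser.feed(html)
    return bool(parser.directives & BLOCKING)


print(header_blocks(["googlebot: noindex, nofollow"], "Googlebot"))  # True
print(header_blocks(["googlebot: noindex, nofollow"], "GPTBot"))     # False
print(meta_blocks('<meta name="robots" content="noindex, follow">'))  # True
```

A scoped value such as googlebot: noindex only affects the named bot, which is why the same page can be indexable for one crawler and not for another.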

Why does the report say a bot is blocked when my robots.txt looks fine?

Three common causes: a wildcard Disallow under User-agent: * that catches more URLs than intended, a more-specific group that overrides your default Allow, or a 5xx error on /robots.txt itself. RFC 9309 tells crawlers to assume a complete disallow when robots.txt is unreachable because of a server error, and this tool fails closed in the same way. The report surfaces the exact rule and line number so you can pinpoint the cause.
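
The 5xx case can be sketched as a small status-to-policy mapping. The function name robots_policy and its return labels are hypothetical, and the 4xx branch follows RFC 9309 rather than anything this page states about the tool.

```python
def robots_policy(status: int) -> str:
    """Map the HTTP status of the /robots.txt fetch to a crawl policy."""
    if 200 <= status < 300:
        return "parse"          # read the rules and evaluate them
    if 400 <= status < 500:
        return "allow_all"      # RFC 9309: an unavailable file imposes no rules
    # 5xx means unreachable, so fail closed. Any other status (for example an
    # unfollowed redirect) is also treated conservatively in this sketch.
    return "disallow_all"


for status in (200, 404, 503):
    print(status, robots_policy(status))
```

Under this mapping, a robots.txt that looks fine in your editor can still block every bot if the server answers 503 at the moment of the check.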