Paste a robots.txt, list one or more URLs, and check which paths each user-agent is allowed to crawl. Uses Google’s Robots Exclusion Protocol rules: longest matching pattern wins, with Allow beating Disallow on ties. Sitemap and non-group lines are surfaced separately.
The * group is the fallback when no group names the user-agent.
Patterns support the wildcards * (any sequence) and $ (end-of-path anchor).
On ties, Allow beats Disallow (Google’s convention).
An empty Disallow: means “allow everything,” per the original 1994 spec.
Sitemap: and Host: directives are not part of crawl matching but are listed separately for visibility.
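
The matching rules above can be illustrated with a minimal Python sketch. It is not this tool's actual implementation: the function names `pattern_to_regex` and `is_allowed`, and the representation of a group as a list of (directive, pattern) pairs, are assumptions made for illustration. The sketch also assumes that "longest" means the character length of the pattern, and that the rules passed in already belong to the group selected for the user-agent.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex (hypothetical helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # A trailing '$' anchors the pattern to the end of the path.
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Decide whether `path` may be crawled under one group's rules.

    `rules` is a list of (directive, pattern) pairs, for example
    ("Disallow", "/private/"). This layout is an assumption of the sketch.
    """
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty Disallow: restricts nothing
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path) is None:
            continue
        allow = directive.lower() == "allow"
        longer = len(pattern) > best_len
        tie_won_by_allow = len(pattern) == best_len and allow and not best_allow
        if longer or tie_won_by_allow:
            best_len, best_allow = len(pattern), allow
    return best_allow


example_rules = [
    ("Disallow", "/private/"),
    ("Allow", "/private/public/"),
    ("Disallow", "/*.pdf$"),
    ("Disallow", "/docs"),
    ("Allow", "/docs"),
]
```

With `example_rules`, `/private/x` is blocked, `/private/public/a` is allowed because the Allow pattern is longer, `/a/b.pdf` is blocked while `/a/b.pdf?x=1` is allowed because of the `$` anchor, and `/docs/x` is allowed because Allow wins the tie.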