Robots.txt Generator
Toggle the rules you want, get a valid robots.txt with line-by-line explanations. Includes presets for blocking AI training crawlers (GPTBot, ClaudeBot, Perplexity, Google-Extended) and SEO scrapers.
Googlebot ignores Crawl-delay. Bing/Yandex respect it.
User-agent: *
Allow: /
Disallow: /admin
Disallow: /api
Disallow: /preview

Sitemap: https://example.com/sitemap.xml
Line-by-line explanation
User-agent: *
    Applies the rules below to every crawler that doesn't have its own group.
Allow: /
    Allows crawling of the whole site by default; the Disallow lines below carve out the exceptions.
Disallow: /admin
    Blocks every path that starts with /admin, including everything beneath it.
Disallow: /api
    Blocks every path that starts with /api, including everything beneath it.
Disallow: /preview
    Blocks every path that starts with /preview, including everything beneath it.
Sitemap: https://example.com/sitemap.xml
    Tells crawlers where to find the XML sitemap. Always use an absolute URL.
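For readers who want to produce the same file from code, here is a minimal sketch of how such a generator could work. The RuleGroup type and the renderRobotsTxt function are hypothetical names invented for this illustration; this is not the generator's actual implementation.

```typescript
// Hypothetical shape for one robots.txt group (not from any library).
interface RuleGroup {
  userAgent: string;
  allow?: string[];
  disallow?: string[];
}

// Renders groups in order, separates them with a blank line,
// and appends the optional Sitemap line at the end.
function renderRobotsTxt(groups: RuleGroup[], sitemap?: string): string {
  const lines: string[] = [];
  for (const group of groups) {
    lines.push(`User-agent: ${group.userAgent}`);
    for (const path of group.allow ?? []) lines.push(`Allow: ${path}`);
    for (const path of group.disallow ?? []) lines.push(`Disallow: ${path}`);
    lines.push(""); // blank line closes the group
  }
  if (sitemap) lines.push(`Sitemap: ${sitemap}`);
  return lines.join("\n").trimEnd() + "\n";
}

// Reproduces the example file shown above.
const robotsTxt = renderRobotsTxt(
  [{ userAgent: "*", allow: ["/"], disallow: ["/admin", "/api", "/preview"] }],
  "https://example.com/sitemap.xml",
);
console.log(robotsTxt);
```

Keeping the rules as data and rendering them in one place makes it easy to add or remove a group without hand-editing the file.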
Frequently asked questions
Where do I put the robots.txt file?
At the root of your domain: yourdomain.com/robots.txt. Subfolder paths (like /blog/robots.txt) are ignored by crawlers. In Next.js App Router, you can also generate it programmatically via app/robots.ts.
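As a rough sketch of the programmatic route, an app/robots.ts file for the example rules might look like the following. To keep the snippet self-contained, it declares a local RobotsConfig type as a stand-in; in a real Next.js project you would normally import MetadataRoute from "next" and use MetadataRoute.Robots as the return type instead. The paths and sitemap URL are the example values from this page.

```typescript
// Local stand-in for the object shape Next.js reads from app/robots.ts.
// In a real project, prefer:
//   import type { MetadataRoute } from "next";
// and return MetadataRoute.Robots.
type RobotsConfig = {
  rules: {
    userAgent: string;
    allow?: string | string[];
    disallow?: string | string[];
  }[];
  sitemap?: string;
};

// Saved as app/robots.ts, the default export is used to serve /robots.txt.
export default function robots(): RobotsConfig {
  return {
    rules: [
      { userAgent: "*", allow: "/", disallow: ["/admin", "/api", "/preview"] },
    ],
    sitemap: "https://example.com/sitemap.xml",
  };
}
```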
Will blocking AI crawlers stop my content from showing up in ChatGPT?
Only partly. Blocking stops future crawls, but content that was already collected stays in models trained on it. The 'AI Search' crawlers (OAI-SearchBot, ChatGPT-User) are separate from training crawlers (GPTBot). Blocking GPTBot keeps you out of future training; blocking OAI-SearchBot keeps you out of ChatGPT's live citations.
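The training/search split can be sketched as two separate lists that are blocked independently. The function and constant names below are made up for this illustration, and the user-agent tokens are the ones named on this page; check each vendor's documentation for the current list before relying on them.

```typescript
// Tokens as named on this page; vendors may add or rename crawlers.
const TRAINING_CRAWLERS = ["GPTBot", "ClaudeBot", "Google-Extended"];
const SEARCH_CRAWLERS = ["OAI-SearchBot", "ChatGPT-User"];

// Emits one "Disallow: /" group per selected crawler.
function aiCrawlerRules(opts: { blockTraining: boolean; blockSearch: boolean }): string {
  const agents = [
    ...(opts.blockTraining ? TRAINING_CRAWLERS : []),
    ...(opts.blockSearch ? SEARCH_CRAWLERS : []),
  ];
  if (agents.length === 0) return "";
  return agents.map((agent) => `User-agent: ${agent}\nDisallow: /`).join("\n\n") + "\n";
}

// Opt out of training but stay eligible for live citations.
const trainingOnly = aiCrawlerRules({ blockTraining: true, blockSearch: false });
console.log(trainingOnly);
```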
Should I block SEO scrapers like Ahrefs and Semrush?
Usually no. Blocking them keeps your site out of their datasets, which also makes the backlink and audit reports you run on your own site in those tools less complete. The exception is if your content is high-value and you don't want competitors mining your structure.
What's the difference between Disallow in robots.txt and noindex?
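robots.txt tells crawlers not to fetch the page. noindex (in a robots meta tag or an X-Robots-Tag HTTP header) tells crawlers they can fetch it but shouldn't show it in search results. The two don't work together on the same URL: a crawler blocked by robots.txt never fetches the page, so it never sees the noindex, and the URL can still be indexed from links elsewhere. For pages you want crawled but not indexed (like paginated archives), use only noindex. For sensitive pages, use noindex without a Disallow, and protect anything truly private with authentication rather than robots.txt.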
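One way to picture the distinction is to sort pages by whether they should be crawled and whether they should be indexed. The Page type and planDirectives function are hypothetical helpers for this sketch; the meta tag and header strings are the standard noindex forms.

```typescript
// The two standard ways to express noindex on a page that crawlers may fetch.
const NOINDEX_META_TAG = '<meta name="robots" content="noindex">';
const NOINDEX_HEADER = "X-Robots-Tag: noindex";

// Hypothetical description of what you want for each path.
type Page = { path: string; crawl: boolean; index: boolean };

// Pages that must not be fetched go to robots.txt; pages that may be
// fetched but must stay out of results get noindex instead.
function planDirectives(pages: Page[]): { disallow: string[]; noindex: string[] } {
  return {
    disallow: pages.filter((p) => !p.crawl).map((p) => `Disallow: ${p.path}`),
    noindex: pages.filter((p) => p.crawl && !p.index).map((p) => p.path),
  };
}

const plan = planDirectives([
  { path: "/api", crawl: false, index: false },
  { path: "/blog/page/2", crawl: true, index: false },
  { path: "/blog", crawl: true, index: true },
]);
console.log(plan.disallow); // lines for robots.txt
console.log(plan.noindex, NOINDEX_META_TAG, NOINDEX_HEADER); // paths that need noindex
```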
Does Googlebot respect Crawl-delay?
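No. Googlebot ignores Crawl-delay entirely. Bing, Yandex, and Yahoo respect it. Google Search Console's crawl rate setting was retired in January 2024; to slow Googlebot temporarily, Google's documentation recommends returning 429, 500, or 503 responses until the load eases.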
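Because Crawl-delay only matters to some crawlers, one option is to emit it in dedicated groups for those crawlers alone. The sketch below assumes the tokens bingbot, Yandex, and Slurp for the three engines named above, and crawlDelayGroups is a hypothetical helper. Note that a crawler with its own group stops reading the "User-agent: *" group, so any Disallow rules have to be repeated inside it.

```typescript
// Assumed user-agent tokens for the engines the answer above lists as
// honoring Crawl-delay. Googlebot is left out on purpose.
const HONORS_CRAWL_DELAY = ["bingbot", "Yandex", "Slurp"];

// Builds one group per crawler, repeating the shared Disallow rules
// because a named group replaces the "*" group for that crawler.
function crawlDelayGroups(seconds: number, disallow: string[] = []): string {
  if (!Number.isFinite(seconds) || seconds <= 0) {
    throw new RangeError("Crawl-delay must be a positive number of seconds");
  }
  return (
    HONORS_CRAWL_DELAY.map((agent) =>
      [
        `User-agent: ${agent}`,
        ...disallow.map((path) => `Disallow: ${path}`),
        `Crawl-delay: ${seconds}`,
      ].join("\n"),
    ).join("\n\n") + "\n"
  );
}

const delayed = crawlDelayGroups(10, ["/admin"]);
console.log(delayed);
```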