Build a valid robots.txt file in seconds. Add rules for any user-agent, block directories, set crawl delays, and add your sitemap URL — with live syntax-highlighted preview and validation.
robots.txt is a plain text file placed at the root of your website that tells search engine crawlers and other bots which pages or sections they are allowed or not allowed to crawl. It follows the Robots Exclusion Standard — a protocol all major search engines respect by convention.
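A minimal robots.txt covering the features mentioned above might look like this (the directory and sitemap URL are illustrative):

```
# Rules for all crawlers
User-agent: *
# Block everything under /private/
Disallow: /private/
# Ask bots to wait 10 seconds between requests (not all bots honor this)
Crawl-delay: 10

# Tell crawlers where the sitemap lives
Sitemap: https://yourdomain.com/sitemap.xml
```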
When a bot visits your site, it first checks yourdomain.com/robots.txt before crawling anything else. If the file contains rules that apply to that bot, it will follow them. If no file exists, bots treat the entire site as open for crawling.
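That lookup-then-obey behavior can be sketched with Python's standard-library `urllib.robotparser`, which implements the Robots Exclusion Standard. The file content and bot name here are hypothetical; in a real crawler you would fetch `yourdomain.com/robots.txt` first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. parse() accepts an iterable of lines,
# so no network fetch is needed for this sketch.
rules = """
User-agent: *
Disallow: /admin/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# A well-behaved bot consults the parsed rules before each fetch.
print(parser.can_fetch("MyBot", "https://example.com/admin/login"))  # False
print(parser.can_fetch("MyBot", "https://example.com/about"))        # True
```

If the fetch for robots.txt returns 404, `RobotFileParser` treats everything as allowed, matching the "no file means open for crawling" behavior described above.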
A User-agent line names which crawler a group of rules applies to — either a specific bot such as Googlebot, or * for all bots. A Disallow line then blocks a path: Disallow: /admin/ blocks everything under /admin/.

Keep in mind that Disallow stops Google from crawling a page, but Google can still index a URL it has never visited if it finds links to it on other pages. To prevent a page from appearing in search results entirely, use a noindex meta tag or X-Robots-Tag HTTP header on the page itself. robots.txt and noindex serve different purposes and are often used together.

AI crawlers that respect robots.txt can be opted out the same way: adding Disallow: / for their user-agents will stop them from training on your content. However, blocking AI crawlers has no impact on your Google rankings and doesn't prevent use of content that was already crawled before you added the rules. Many publishers choose to block them as a matter of principle or licensing.

You can also combine a User-agent: * section with broad rules and separate sections for specific bots like Googlebot or Bingbot with different rules. When a bot reads the file, it follows the most specific section that applies to it, and falls back to the * rules only if no specific section matches. Most bots stop at the first matching group and ignore other groups.
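Putting those pieces together, a file that combines a broad default with bot-specific groups might look like this (GPTBot is OpenAI's crawler token; the paths are illustrative):

```
# Default rules for all bots without a more specific group
User-agent: *
Disallow: /admin/

# Googlebot matches this group and ignores the * group entirely
User-agent: Googlebot
Disallow: /staging/

# Block an AI training crawler from the whole site
User-agent: GPTBot
Disallow: /
```

Here Googlebot may crawl /admin/ (its own group doesn't block it), while every other bot is kept out of it, and GPTBot is excluded from everything.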