robots.txt: https://yourdomain.com/robots.txt
sitemap.xml: https://yourdomain.com/sitemap.xml
Then submit the sitemap via Google Search Console and Bing Webmaster Tools.
Common configurations for both files.
Minimal file — all crawlers allowed, sitemap linked.
User-agent: *
Disallow:
Sitemap: https://example.com/sitemap.xml
Allow all, but block admin and private areas.
User-agent: *
Disallow: /admin/
Disallow: /login/
Disallow: /private/
Disallow: /?s=
Allow: /
Sitemap: https://example.com/sitemap.xml
Typical WordPress configuration with WP-Admin protected and AJAX allowed.
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /?s=
Disallow: /?p=
Sitemap: https://example.com/sitemap_index.xml
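The `Allow: /wp-admin/admin-ajax.php` line can only take effect if a crawler resolves conflicts by the most specific (longest) matching path, which is the precedence described in RFC 9309. The following is a minimal Python sketch of that rule, assuming plain prefix matching with no `*` or `$` wildcards and leaving out the two query-string rules. `WP_RULES` and `is_allowed` are hypothetical names used only for illustration.

```python
# Path rules from the WordPress example above, as (directive, path) pairs.
# The query-string rules (/?s= and /?p=) are left out of this sketch.
WP_RULES = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
    ("Disallow", "/wp-includes/"),
    ("Disallow", "/wp-content/plugins/"),
    ("Disallow", "/wp-content/cache/"),
]


def is_allowed(rules, path):
    """Longest matching rule wins; Allow wins a tie. Prefix matching only."""
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for directive, pattern in rules:
        if not pattern or not path.startswith(pattern):
            continue  # an empty pattern matches nothing
        is_allow = directive.lower() == "allow"
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len = len(pattern)
            allowed = is_allow
    return allowed


print(is_allowed(WP_RULES, "/wp-admin/admin-ajax.php"))  # True
print(is_allowed(WP_RULES, "/wp-admin/options.php"))     # False
print(is_allowed(WP_RULES, "/blog/hello-world/"))        # True
```

Because the longest match decides, the order of the `Allow` and `Disallow` lines does not matter in this model. Parsers that apply the first matching rule in file order can give a different answer.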
Allow search engines, block AI training crawlers and aggressive scrapers.
User-agent: *
Disallow:
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: Bytespider
Disallow: /
Sitemap: https://example.com/sitemap.xml
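One way to sanity-check a per-bot configuration like this before uploading it is Python's standard `urllib.robotparser`. The sketch below uses a shortened copy of the example (two of the blocked bots) and a hypothetical `/blog/` URL. It only checks the logic of the file; whether a real crawler honors it is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the example above; blank lines between groups are optional.
ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# /blog/ is a hypothetical page used only for this check.
for agent in ("Googlebot", "GPTBot", "CCBot"):
    verdict = parser.can_fetch(agent, "https://example.com/blog/")
    print(f"{agent}: {'allowed' if verdict else 'blocked'}")
```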
Home, About, Services, Contact.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/</loc>
<changefreq>monthly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://example.com/about/</loc>
<changefreq>yearly</changefreq>
<priority>0.5</priority>
</url>
...
</urlset>
Home, blog index, posts with lastmod dates.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://myblog.com/</loc>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://myblog.com/post-1/</loc>
<lastmod>2025-01-10</lastmod>
<changefreq>monthly</changefreq>
<priority>0.7</priority>
</url>
...
</urlset>
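A sitemap like the blog example can also be generated rather than written by hand. The following is a rough sketch using Python's standard `xml.etree.ElementTree` (Python 3.9 or later for `ET.indent`). The `PAGES` list and `build_sitemap` function are hypothetical; in practice the page data would come from a CMS or database.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical page list: (loc, lastmod or None, changefreq, priority).
# Values mirror the blog example above.
PAGES = [
    ("https://myblog.com/", None, "daily", "1.0"),
    ("https://myblog.com/post-1/", "2025-01-10", "monthly", "0.7"),
]


def build_sitemap(pages):
    """Return the sitemap as UTF-8 bytes, including the XML declaration."""
    ET.register_namespace("", NS)  # emit xmlns="..." without a prefix
    urlset = ET.Element(f"{{{NS}}}urlset")
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, f"{{{NS}}}url")
        ET.SubElement(url, f"{{{NS}}}loc").text = loc
        if lastmod:  # only written when a real modification date is known
            ET.SubElement(url, f"{{{NS}}}lastmod").text = lastmod
        ET.SubElement(url, f"{{{NS}}}changefreq").text = changefreq
        ET.SubElement(url, f"{{{NS}}}priority").text = priority
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


print(build_sitemap(PAGES).decode("utf-8"))
```

Building the tree through the XML library instead of string concatenation means characters such as `&` in URLs are escaped automatically.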
Home, product categories, and individual products.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://myshop.com/</loc>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://myshop.com/products/</loc>
<changefreq>daily</changefreq>
<priority>0.9</priority>
</url>
...
</urlset>
Guidelines for both files — and how they work together.
robots.txt
- Always start with a User-agent: * block. It catches all bots that don't have a specific rule.
- Include a Sitemap: URL so crawlers can discover all your public pages.
- Disallow: /admin/ (with trailing slash) blocks the entire directory.
- Disallow: / blocks everything, including your entire website from Google. Double-check this.
- To keep a page out of search results, don't rely on Disallow; use a noindex meta tag or HTTP header instead.
sitemap.xml
- Leave out URLs with ?utm_* or session parameters.
- Set <lastmod> accurately. Only set it if you actually track the modification date; don't fake it.
- Set <priority> relatively. Use 1.0 for your homepage, lower values for deeper pages. Search engines may ignore it anyway.
- If a URL is blocked in robots.txt, it shouldn't be in the sitemap.
- All <loc> values must be fully qualified: https://example.com/page/.
- Don't set every <changefreq> to "always". Use realistic values; crawlers treat the field as a hint, not a directive.
How they work together
robots.txt: Tells crawlers what they may not access. It's a policy file, not a security measure. Crawlers obey it voluntarily.
sitemap.xml: Tells crawlers what you want indexed. It's a roadmap: it speeds up discovery but doesn't guarantee indexing.
Never list the same URL in both a Disallow rule (robots.txt) and the sitemap. The two signals contradict each other and confuse crawlers.
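The rule about contradictory signals can be checked automatically. The sketch below, using only the Python standard library, lists sitemap URLs that the robots.txt rules disallow. `find_conflicts` and the sample data (including the `/private/report/` URL) are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample data for the check.
ROBOTS_TXT = "User-agent: *\nDisallow: /admin/\nDisallow: /private/\n"
SITEMAP_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/private/report/</loc></url>
</urlset>"""


def find_conflicts(robots_txt, sitemap_xml, agent="*"):
    """Return sitemap URLs that robots.txt disallows for the given agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS) if el.text]
    return [loc for loc in locs if not parser.can_fetch(agent, loc)]


for url in find_conflicts(ROBOTS_TXT, SITEMAP_XML):
    print("Listed in sitemap but disallowed:", url)
```

Python's parser evaluates rules in file order, so for overlapping Allow and Disallow rules its verdict may differ from a crawler that uses longest-match precedence. Treat the result as a first pass, not a final answer.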
Complete specification for both file formats.
| Directive | Example | Support | Description |
|---|---|---|---|
| `User-agent` | `User-agent: *` | All | Target bot. Use `*` for all, or a specific bot name. |
| `Disallow` | `Disallow: /admin/` | All | Blocks access to a path. Empty value means allow all. |
| `Allow` | `Allow: /public/` | All | Explicitly allows a path, overriding a broader Disallow. |
| `Crawl-delay` | `Crawl-delay: 10` | Most | Seconds between requests. Not supported by Googlebot (use Search Console instead). |
| `Sitemap` | `Sitemap: https://…/sitemap.xml` | All | Full URL of the sitemap. Can appear multiple times. |
| `Host` | `Host: example.com` | Yandex | Yandex-specific directive for preferred domain (canonical host). |
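To see how a parser reads these directives, here is a small sketch with Python's standard `urllib.robotparser`, which exposes `Crawl-delay` and `Sitemap` values alongside the allow/deny decision. The sample file and the URLs being tested are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Sample file built from the directive examples in the table above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /public/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("*", "https://example.com/admin/users"))  # False
print(parser.can_fetch("*", "https://example.com/public/"))      # True
print(parser.crawl_delay("*"))                                   # 10
print(parser.site_maps())  # ['https://example.com/sitemap.xml']
```

`crawl_delay()` needs Python 3.6 or later and `site_maps()` needs 3.8 or later. Non-standard directives such as `Host` are not exposed by this module.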
| User-agent | Owner | Type |
|---|---|---|
| Googlebot | Google | Search |
| Googlebot-Image | Google Images | Search |
| Googlebot-News | Google News | Search |
| Bingbot | Microsoft | Search |
| Slurp | Yahoo | Search |
| DuckDuckBot | DuckDuckGo | Search |
| Baiduspider | Baidu | Search |
| YandexBot | Yandex | Search |
| Applebot | Apple (Siri, Spotlight) | Search |
| Yeti | Naver | Search |
| SogouSpider | Sogou | Search |
| Qwantify | Qwant | Search |
| ia_archiver | Internet Archive | Archive |
| facebot | Meta / Facebook | Social |
| facebookexternalhit | Meta / Facebook (link preview) | Social |
| Twitterbot | X / Twitter (card preview) | Social |
| LinkedInBot | LinkedIn (link preview) | Social |
| GPTBot | OpenAI | AI Training |
| ChatGPT-User | OpenAI | AI Training |
| OAI-SearchBot | OpenAI | AI Search |
| Google-Extended | Google (Gemini training) | AI Training |
| anthropic-ai | Anthropic | AI Training |
| Claude-Web | Anthropic | AI Training |
| ClaudeBot | Anthropic | AI Training |
| CCBot | Common Crawl | AI Training |
| Bytespider | ByteDance / TikTok | AI Training |
| Amazonbot | Amazon (Alexa) | AI Training |
| PerplexityBot | Perplexity | AI Search |
| Applebot-Extended | Apple (AI training) | AI Training |
| meta-externalagent | Meta (AI training) | AI Training |
| cohere-ai | Cohere | AI Training |
| DiffBot | Diffbot (AI data) | AI Training |
| AhrefsBot | Ahrefs | SEO Tool |
| SemrushBot | Semrush | SEO Tool |
| MJ12bot | Majestic | SEO Tool |
| DotBot | Moz / Open Site Explorer | SEO Tool |
| SistrixCrawler | SISTRIX | SEO Tool |
| rogerbot | Moz | SEO Tool |
| ScreamingFrogSEOSpider | Screaming Frog | SEO Tool |
| SeobilityBot | Seobility | SEO Tool |
| serpstatbot | Serpstat | SEO Tool |
| DataForSeoBot | DataForSEO | SEO Tool |
| BLEXBot | WebMeUp | SEO Tool |
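A table like this lends itself to generating the blocking rules instead of typing them. The following sketch builds a robots.txt that allows everything by default and adds a `Disallow: /` group for every bot of a chosen type. `BOTS` holds only a few rows copied from the table, and `build_robots_txt` is a hypothetical helper.

```python
# A few (user-agent, type) pairs copied from the table above.
BOTS = [
    ("Googlebot", "Search"),
    ("Bingbot", "Search"),
    ("GPTBot", "AI Training"),
    ("CCBot", "AI Training"),
    ("AhrefsBot", "SEO Tool"),
]


def build_robots_txt(bots, blocked_types, sitemap_url):
    """Allow all crawlers by default, then block each bot of a blocked type."""
    lines = ["User-agent: *", "Disallow:", ""]
    for name, bot_type in bots:
        if bot_type in blocked_types:
            lines += [f"User-agent: {name}", "Disallow: /", ""]
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


print(build_robots_txt(BOTS, {"AI Training"}, "https://example.com/sitemap.xml"))
```

Passing `{"AI Training", "SEO Tool"}` as `blocked_types` would block both categories while leaving search crawlers untouched.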
| Element | Required | Values | Description |
|---|---|---|---|
| `<urlset>` | Required | n/a | Root element. Must include the namespace: `xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"` |
| `<url>` | Required | n/a | Parent element for each URL entry. |
| `<loc>` | Required | Full URL | Fully-qualified URL including protocol. Max 2048 characters. Must be URL-encoded. |
| `<lastmod>` | Optional | YYYY-MM-DD | Date the page was last modified. W3C Datetime format. Must be accurate; do not fake it. |
| `<changefreq>` | Optional | always / hourly / daily / weekly / monthly / yearly / never | Hint for how often content changes. Treated as a hint, not a directive. |
| `<priority>` | Optional | 0.0 – 1.0 | Relative priority within your site. Default: 0.5. Does not affect ranking vs. other sites. |
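The constraints in this reference can be turned into a quick lint pass. The sketch below checks a sitemap against the rules listed above: namespace, fully qualified `<loc>`, the 2048-character limit, the date prefix of `<lastmod>`, the allowed `<changefreq>` values, and the `<priority>` range. `validate_sitemap` is a hypothetical helper, not a full schema validator.

```python
import re
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}")  # checks the date part only


def validate_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{NS['sm']}}}urlset":
        problems.append("root must be <urlset> in the sitemap namespace")
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc.startswith(("http://", "https://")):
            problems.append(f"loc is not fully qualified: {loc!r}")
        if len(loc) > 2048:
            problems.append(f"loc exceeds 2048 characters: {loc[:40]}...")
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is not None and not DATE_RE.match(lastmod.strip()):
            problems.append(f"lastmod is not YYYY-MM-DD: {lastmod!r}")
        changefreq = url.findtext("sm:changefreq", namespaces=NS)
        if changefreq is not None and changefreq.strip() not in CHANGEFREQ:
            problems.append(f"unknown changefreq: {changefreq!r}")
        priority = url.findtext("sm:priority", namespaces=NS)
        if priority is not None:
            try:
                value = float(priority)
            except ValueError:
                value = -1.0  # non-numeric values fail the range check below
            if not 0.0 <= value <= 1.0:
                problems.append(f"priority outside 0.0-1.0: {priority!r}")
    return problems
```

A typical use is to call `validate_sitemap()` on the generated file in a build step and fail the build if the returned list is not empty.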