A practical reference of SEO best practices, organized by category. Independent of any analysis — always available.
- HTTPS everywhere — Serve all pages over HTTPS. HTTP→HTTPS redirects must be permanent (301/308).
- Single canonical URL — Use `<link rel="canonical">` on every page. Avoid duplicate content via URL parameters, trailing slashes, or www/non-www.
- Crawlable robots.txt — Keep `robots.txt` accessible at the root. Declare your Sitemap URL there. Avoid accidentally blocking Googlebot.
- XML Sitemap — Submit to Google Search Console. Include only canonical, indexable URLs. Update automatically.
- Redirect chains — Keep redirect chains to 1 hop. Chains of 3+ slow crawling and lose link equity.
- HTTP/2 or HTTP/3 — Enables request multiplexing; improves performance for pages with many resources.
- Security headers — HSTS, CSP, X-Frame-Options, X-Content-Type-Options signal a well-maintained site.
- Structured data (JSON-LD) — Mark up key entities (Article, Product, FAQPage, BreadcrumbList) for rich results in SERPs.
- hreflang — For multi-language sites, include self-referencing hreflang tags and an `x-default` fallback.
- Avoid noindex in production — Double-check that staging/dev `noindex` headers are not deployed to production.
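The robots.txt advice above (keep the file crawlable, declare the Sitemap, do not block Googlebot) can be spot-checked offline. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `ROBOTS_TXT` content, the `example.com` URLs, and the `audit_robots` helper are illustrative assumptions, not part of this tool.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def audit_robots(robots_txt: str, test_url: str) -> dict:
    """Parse robots.txt text and report Googlebot access plus declared sitemaps."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        "googlebot_allowed": parser.can_fetch("Googlebot", test_url),
        # site_maps() returns None when no Sitemap line is present.
        "sitemaps": parser.site_maps() or [],
    }


report = audit_robots(ROBOTS_TXT, "https://example.com/blog/post")
print(report)
```

Note that the standard-library parser applies rules in file order, which may differ from how a given search engine resolves conflicting Allow/Disallow lines, so treat the result as a rough check.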
- Title tag — 50–60 characters, primary keyword near the start. Unique per page. Avoid keyword stuffing.
- Meta description — 70–160 characters. Compelling summary that increases click-through rate. Not a direct ranking factor, but indirectly important.
- Single H1 — One H1 per page, containing the primary topic. H2–H6 should follow a logical hierarchy.
- Word count — Aim for 300+ words for indexable pages. Thin content (<100 words) is rarely ranked.
- Readability — Flesch Reading Ease score ≥ 60 (standard) is a good target for general audiences. Short paragraphs and sentences help.
- Keyword placement — Include primary keyword in title, H1, first paragraph, and naturally in body. Avoid repetitive exact-match stuffing.
- Lead with the answer — Place the core statement in the first 2–4 sentences of every section. This pattern is favored by classic Featured Snippets, Google AI Overviews, and LLM citation extractors alike — see the GEO Guide for details.
- Fresh content — Update evergreen content regularly. Add a `dateModified` in JSON-LD to signal freshness to crawlers.
- Image alt text — Every meaningful image needs descriptive alt text. Decorative images: `alt=""`.
- Open Graph + Twitter Card — Required for controlling how pages appear when shared on social media. Minimum: title, description, image (1200×630 px).
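Several of the on-page rules above are mechanical enough to check in code. This sketch, which assumes the 50–60 and 70–160 character ranges stated in the list, uses the standard `html.parser` module; the `OnPageAudit` class, the `audit` function, and the `SAMPLE` page are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collects the title text, meta description, and H1 count from HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html: str) -> dict:
    """Apply the length and single-H1 rules from the list above."""
    page = OnPageAudit()
    page.feed(html)
    return {
        "title_ok": 50 <= len(page.title.strip()) <= 60,
        "description_ok": 70 <= len(page.description.strip()) <= 160,
        "single_h1": page.h1_count == 1,
    }


# Hypothetical page used only to exercise the checks.
SAMPLE = """<html><head>
<title>SEO Best Practices: A Practical Checklist for Site Owners</title>
<meta name="description" content="A practical checklist of technical, on-page, and link-related SEO best practices for site owners.">
</head><body><h1>SEO Best Practices</h1></body></html>"""

print(audit(SAMPLE))
```

Character counts are only a proxy: search engines truncate titles by pixel width, so a result of `False` is a prompt to review rather than a hard failure.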
- Descriptive anchor text — Use meaningful text like "SEO best practices" instead of "click here". Helps both users and crawlers understand link context.
- Internal linking — Link related pages from within your content. Spreads PageRank and helps crawlers discover pages.
- External links — Link out to authoritative sources. Use `rel="nofollow"` or `rel="sponsored"` for paid/UGC links.
- Broken links — 404 links waste crawl budget and degrade user experience. Audit regularly with a crawler.
- Link depth — Important pages should be reachable within 3 clicks from the homepage. Deep pages are crawled less frequently.
- Backlinks — Links from authoritative, topically relevant domains are a major ranking signal. Focus on quality over quantity.
- Disavow sparingly — Only disavow links if you have evidence of a manual action or clearly toxic pattern. Do not disavow indiscriminately.
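The "within 3 clicks" rule for link depth is a shortest-path question over the internal link graph. A minimal sketch, assuming the graph has already been crawled into a dictionary (the `SITE` data and the `click_depths` helper are hypothetical):

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search from the homepage; depth = minimum number of clicks."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph: page -> pages it links to.
SITE = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/seo-guide"],
    "/blog/seo-guide": ["/blog/seo-guide/appendix"],
    "/blog/seo-guide/appendix": ["/blog/seo-guide/appendix/errata"],
}

depths = click_depths(SITE, "/")
too_deep = sorted(page for page, depth in depths.items() if depth > 3)
print(too_deep)
```

Pages that never appear in `depths` are unreachable from the homepage through internal links, which is usually a bigger problem than being too deep.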
- llms.txt — Add an `/llms.txt` file (plain text, Markdown format) to help AI systems understand what your site offers and how it should be cited. See llmstxt.org.
- AI Overview optimization — Google's AI Overviews often cite pages with clear, structured answers, concise prose, and authoritative signals (author, date, organization schema).
- Entity clarity — Name entities explicitly (organizations, people, products) and mark them up with Schema.org. This helps knowledge graph inclusion and AI citation.
- Conversational queries — AI-driven search handles natural language questions. Optimize for "who, what, where, why, how" question patterns.
- Robots.txt for AI crawlers — Common AI crawlers: `GPTBot`, `Claude-Web`, `PerplexityBot`, `CCBot`. Allow or block per use case.
- Perplexity / ChatGPT citations — These tools cite readable, well-structured HTML pages. Avoid heavy JavaScript-only rendering.
- AI-generated content — Provide value beyond what AI generates. Differentiate with original data, personal expertise, and direct experience (Google's "E-E-A-T").
- E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) — Key quality signals for Google's Quality Raters. Use author schema, About/Contact pages, and editorial standards.
| Metric | Good | Needs work | Poor | What it measures |
|---|---|---|---|---|
| LCP (Largest Contentful Paint) | ≤ 2.5 s | ≤ 4 s | > 4 s | Time until the largest visible element is rendered |
| INP (Interaction to Next Paint) | ≤ 200 ms | ≤ 500 ms | > 500 ms | Responsiveness to user interactions |
| CLS (Cumulative Layout Shift) | ≤ 0.1 | ≤ 0.25 | > 0.25 | Visual stability — unexpected layout shifts |
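The thresholds in the table translate directly into a small classifier. This sketch uses only the values shown above; the `THRESHOLDS` mapping and the `rate` function are illustrative names.

```python
# Thresholds from the table: (upper bound for "good", upper bound for "needs work").
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "INP": (200, 500),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless score
}


def rate(metric: str, value: float) -> str:
    """Classify a Core Web Vitals measurement as good / needs work / poor."""
    good, needs_work = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_work:
        return "needs work"
    return "poor"


for metric, value in [("LCP", 2.5), ("INP", 350), ("CLS", 0.3)]:
    print(metric, value, rate(metric, value))
```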
- Improve LCP — Preload the LCP image (`<link rel="preload" as="image">`). Use a CDN. Eliminate render-blocking resources. Upgrade to HTTP/2.
- Improve INP — Minimize long JavaScript tasks (>50 ms). Use `requestIdleCallback` for non-critical work. Avoid synchronous XHR.
- Improve CLS — Set explicit `width` and `height` on all images and iframes. Avoid inserting DOM elements above existing content without user interaction.
- TTFB — A server-side metric: < 200 ms is excellent, < 500 ms is acceptable. Use caching, CDN edge nodes, and faster server responses.
- Measure — Use PageSpeed Insights, web.dev/measure, or Chrome DevTools Performance panel.
A practical reference for Generative Engine Optimization (GEO) — optimizing your content for AI-powered search engines and LLM-based answers. Independent of any analysis — always available.
Generative Engine Optimization (GEO) is the practice of optimizing web content so it gets cited and referenced in AI-generated answers — from tools like ChatGPT, Google Gemini, Perplexity, and others.
Unlike traditional SEO, which focuses on ranking positions in search result pages, GEO focuses on citability — being the source that AI systems reference when answering user questions.
| Aspect | SEO (Classic) | GEO (AI Optimization) |
|---|---|---|
| Goal | Ranking in SERPs | Citation in AI answers |
| KPI | Position / CTR | AI-Citation Rate |
| Content | Keyword-focused | Structured, original, authentic |
| Technical | Basic markup sufficient | Comprehensive schema markup |
| Authority | Backlinks | Original data & studies |
| Format | Text-oriented | Multimodal (text, graphics, video) |
In May 2026, Google published an official AI Optimization Guide. The headline message: SEO and GEO are the same discipline — Google's AI Overviews and Gemini in Search use the classic Search index via Retrieval-Augmented Generation (RAG), so no AI-specific re-tooling is required for Google.
This is Google's view of Google AI Search only. Other AI engines (Perplexity, ChatGPT, Claude, Bing Copilot) weight signals differently — that's why the GEO Score in this tool still rewards llms.txt and rich Schema markup. The recommendations below describe the Google perspective; the other accordion sections cover broader, cross-engine GEO practice.
What Google says you DO need
- Indexable & crawlable — Pages must be eligible to appear in Google Search with a snippet. If a page can't show as a regular result, it can't be cited in AI Overviews either.
- Non-commodity content — First-hand experience, original perspective, real expertise. Generic "7 tips" listicles lose against authentic content.
- Clear organisation — Paragraphs, sections, descriptive headings, high-quality images and videos. Semantic HTML is recommended (not strictly required).
- JavaScript best practices — Content inside JS is processable as long as it isn't blocked from rendering. Heavy client-only rendering still hurts.
- Page experience — Responsive design, low latency, content distinct from boilerplate.
- Minimise duplicates — Duplicate content wastes crawl budget that AI features rely on.
- Local / Ecommerce details — Google Business Profile and Merchant Center feeds for relevant businesses; "Business Agent" for conversational interactions.
What Google explicitly says is not required
This is Google's own mythbusting list. None of these are necessary to appear in Google AI Search:
| Common GEO claim | Google's position (quoted) |
|---|---|
| `llms.txt` file | "You don't need to create new machine readable files, AI text files, markup, or Markdown to appear in generative AI search." |
| Schema.org / structured data | "Structured data isn't required for generative AI search, and there's no special schema.org markup you need." |
| Content chunking for AI | "There's no requirement to break your content into tiny pieces for AI to better understand it." |
| AI-specific rewrites | "You don't need to write in a specific way just for generative AI search." |
| Long-tail keyword variations | "You don't have to worry that you don't have enough 'long-tail' keywords." |
| Inauthentic brand "mentions" | "Seeking inauthentic 'mentions' across the web isn't as helpful as it might seem." |
Note: engines other than Google may still use `llms.txt` and Schema as ranking signals in practice.
AI systems prefer content that is clear, structured, and factual. Generic, bloated text walls are rarely cited.
- Short paragraphs — Keep paragraphs under 150 words. AI models extract information more easily from concise blocks.
- Clear headings — Use descriptive H2–H6 headings that summarize the section content. AI uses these as navigational anchors.
- Facts, tables, and lists — Explicit data points, comparison tables, and structured lists are highly quotable by AI.
- Original data — AI heavily cites original sources: own studies, surveys, benchmarks, and unique datasets. Be the primary source.
- Authenticity over mass — Generic, AI-generated content is ignored. Show real expertise, personal experience, and unique insights.
- Conversational queries — Optimize for natural language questions: "who, what, where, why, how" patterns that AI users ask.
- FAQ sections — Dedicated FAQ blocks with clear Q&A format are directly extractable by AI systems.
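The 150-word paragraph guideline above is easy to automate for plain-text or Markdown drafts. A minimal sketch, assuming paragraphs are separated by blank lines (the `long_paragraphs` helper is a hypothetical name):

```python
def long_paragraphs(text: str, max_words: int = 150) -> list[tuple[int, int]]:
    """Return (paragraph index, word count) for paragraphs over the limit.

    Paragraphs are assumed to be separated by blank lines.
    """
    flagged = []
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    for index, paragraph in enumerate(paragraphs):
        count = len(paragraph.split())
        if count > max_words:
            flagged.append((index, count))
    return flagged


draft = "A short opening paragraph.\n\n" + "word " * 151
print(long_paragraphs(draft))
```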
AI Mode sessions are conversations, not single lookups. Google's May 2026 usage report shows the average AI Mode query is ~3× as long as a classic search query, and follow-up queries grew 40%+ per month in the US. Optimize for the whole session, not just the entry question.
- Anticipate the next question — After answering the main question, address the 2–3 most likely follow-ups in the same section or directly below. The natural extensions ("how much", "compared to what", "what about X") should not require a new page load.
- "See also" with descriptive anchors — When a follow-up genuinely needs its own page, link with anchor text that matches the follow-up phrasing ("How to configure X on macOS"), not generic ("read more"). AI follows these links to extend its answer.
- Re-state the subject per section — Multi-turn retrieval re-fetches sections in isolation. Avoid pronoun chains like "this approach" / "the tool" that point back across sections; name the entity each time the topic shifts.
- Refinement-friendly product data — Shopping sessions iterate over filters (price, location, color, availability). Expose these as `Product`/`Offer` structured data with explicit attributes so refinement queries can hit your page directly instead of a category sibling.
- Decision queries deserve verdicts — Comparison questions (EN "Which …", DE "Welche …") favor pages with explicit criteria tables and a clear recommendation statement over open-ended prose. Lead each comparison with a one-sentence verdict, then justify.
- Planning queries deserve checklists — Travel, finance, and training-plan queries grew ~80% faster than the AI Mode average. Numbered steps, week-by-week structures, and reusable templates are more citable than essay-style planning advice.
Concrete writing patterns that make content easier for LLMs to extract, attribute, and cite. Apply these per section, not per page.
- Lead with the answer — Place the core statement in the first 2–4 sentences of every section. Precise, factual, no filler. AI systems quote what they can extract first; burying the answer in paragraph four costs citations.
- Mini-definitions — Explain key terms in 1–2 sentences as a self-contained block (after a heading, in a `<dl>`, or as a glossary entry). LLMs prefer quotable micro-snippets over definitions buried in prose.
- Quantitative evidence — Use concrete numbers, percentages, dates, study references. "Loads in 1.8 s on 4G" beats "loads quickly". Specific, verifiable claims outrank vague statements in AI answers.
- High semantic density — One idea per sentence, no padding. Avoid SEO-era word-count inflation — LLMs penalize fluff with lower extraction confidence.
- Self-contained blocks — Each section should make sense without surrounding context. AI extracts paragraphs and lists in isolation, so re-state the subject when a new section starts; avoid anaphora like "this approach" pointing back across sections.
- Consistent entity naming — Use the exact same brand, product, and term names throughout. Variations ("our app" / "the platform" / "this tool") confuse entity recognition and dilute attribution.
- Quotable formats — Comparison tables, numbered steps, FAQ blocks, definition lists, mini case studies. These are the formats LLMs prefer to lift verbatim.
AI models rely on structured, machine-readable content. Unstructured text blocks are increasingly ignored.
- Schema Markup (JSON-LD) — Use `FAQPage`, `HowTo`, `Article`, `Review`, `Product` schemas. These feed directly into AI knowledge extraction.
- Organization & Person schema — Identify who is behind the content. AI uses this for E-E-A-T verification and citation attribution.
- Tables with `<th>` headers — Properly structured HTML tables are machine-readable and highly quotable.
- Definition lists — `<dl>`/`<dt>`/`<dd>` for glossaries and key-value explanations.
- llms.txt — Add an `/llms.txt` file (Markdown format) at your domain root. It helps AI systems understand your site's purpose and content structure. See llmstxt.org.
- Modular content — Design content in self-contained blocks that AI can extract independently without losing context.
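As an illustration of the JSON-LD markup mentioned above, the following sketch builds a Schema.org `FAQPage` object from question/answer pairs. The `faq_jsonld` helper and the sample pair are assumptions made for the example; the property names (`mainEntity`, `acceptedAnswer`, `text`) follow the Schema.org vocabulary.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Serialize question/answer pairs as a Schema.org FAQPage JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


# Hypothetical content, for illustration only.
markup = faq_jsonld([
    ("What is GEO?", "The practice of optimizing content to be cited in AI-generated answers."),
])
print(markup)
```

The resulting string would typically be embedded in the page inside a `<script type="application/ld+json">` element.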
E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) is critical for AI visibility. Without verifiable authority, even high-quality content stays invisible.
- Author pages — Create dedicated author pages with credentials, bio, social links, and published works. Use `Person` schema.
- Original research — Publish industry studies, surveys, and unique data. AI systems strongly prefer citing original sources.
- Expert content — Guest posts, podcast appearances, interviews, and speaking engagements build recognizable expertise signals.
- Reviews & testimonials — Customer reviews, case studies, and testimonials provide social proof that AI recognizes.
- About & Contact pages — Clear organizational identity with verifiable contact information builds trust.
- Editorial standards — Transparent sourcing, fact-checking processes, and editorial policies signal reliability.
- AI Crawler management — Common AI crawlers: `GPTBot`, `ChatGPT-User`, `OAI-SearchBot`, `Google-Extended`, `Claude-Web`, `ClaudeBot`, `PerplexityBot`, `CCBot`, `Bytespider`. Allow or block selectively in `robots.txt`.
- llms.txt file — Place at domain root (`/llms.txt`). Describe your site, key content areas, and how AI should cite your content. Currently experimental but gaining adoption.
- Server-side rendering — AI crawlers struggle with heavy JavaScript-only sites. Ensure critical content is in the initial HTML response.
- Fast load times — AI crawlers respect crawl-delay and skip slow sites. Optimize TTFB, enable compression, use CDN.
- Clean URL structure — Logical, descriptive URLs help AI systems categorize content topically.
- Internal linking — Strong internal link structure helps AI understand your topical authority and content hierarchy.
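Selective allow/block rules for the AI crawlers named above can be generated rather than hand-written. This is a minimal sketch: the crawler names come from the list, while the `ai_crawler_rules` helper and the choice of which bots to block are purely illustrative.

```python
# User-agent tokens taken from the list above.
AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "Google-Extended",
    "Claude-Web", "ClaudeBot", "PerplexityBot", "CCBot", "Bytespider",
]


def ai_crawler_rules(blocked: set[str]) -> str:
    """Build one robots.txt group per crawler: Disallow for blocked bots, Allow otherwise."""
    groups = []
    for agent in AI_CRAWLERS:
        rule = "Disallow: /" if agent in blocked else "Allow: /"
        groups.append(f"User-agent: {agent}\n{rule}")
    return "\n\n".join(groups) + "\n"


# Example policy (an assumption): block two crawlers, allow the rest.
rules = ai_crawler_rules({"CCBot", "Bytespider"})
print(rules)
```

Whether a given crawler honors these rules is up to its operator, so the output expresses a preference rather than an enforcement mechanism.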
AI systems decompose user questions into multiple sub-queries (Query Fan-Out). Sites that comprehensively cover a topic are favored.
Google officially confirms this technique: the system generates "concurrent, related queries" to fetch additional results that address user intent — alongside Retrieval-Augmented Generation (RAG) for assembling the final answer.
- Topic clusters — Build pillar pages with comprehensive subtopic coverage. Link related content together to demonstrate depth.
- Entity coverage — Explicitly name and describe all relevant entities (products, methods, people, places, studies). Use Schema.org markup.
- Multi-format content — Cover topics across text, video, infographics, and podcasts. AI aggregates from multiple formats.
- Fan-out analysis — For key topics, identify all possible sub-questions a user might ask. Create content that answers each one.
- Content depth — Aim for 500+ words on key pages. Thin content (<300 words) is rarely cited by AI.
Text-only queries are no longer the default. Google reports that more than 1 in 6 AI Mode queries in the US arrive without text — via voice, image, video, or real-time conversation (May 2026). Visual queries in particular are growing fast: image-generation prompts more than tripled in early 2026. Sites that exist only as walls of text become invisible to this slice of traffic.
- Image search & visual entry points — Provide high-resolution, well-cropped images for every key topic. Use descriptive filenames, meaningful `alt` text, and `ImageObject` schema (`caption`, `contentUrl`, `creditText`, `license`). Visual searches need a back-reference to attribute citations — without schema, the image gets shown, the source does not.
- Video with transcripts, not video alone — Embed videos with the full text transcript on the same page. Use `VideoObject` schema (`name`, `description`, `thumbnailUrl`, `uploadDate`, `transcript`, `hasPart` for chapters). LLMs cannot watch video — the transcript is what gets quoted.
- Audio summaries & podcasts — Episode pages need show notes plus a searchable transcript, marked up with `PodcastEpisode` or `AudioObject` schema. AI Mode increasingly generates audio answers; being a quoted source requires extractable text in the surrounding page.
- Voice-friendly phrasing — Voice queries are longer and more conversational than typed ones. Lead each section with the answer in one declarative sentence, then expand. TTS engines often read only the first 1–2 sentences of a citation aloud — bury the answer and you lose the slot.
- Real-time / conversational input — New entry modes (Search Live, Lens overlays) send mixed text+image queries. Make sure on-page concepts are named in text near the image, not just shown visually, so multimodal models can ground the image against your wording.
- Visual identity consistency — Same logo, product imagery, and brand colors across owned and third-party platforms. Visual entity recognition treats matching imagery as a brand signal — drifting visuals dilute attribution the same way drifting brand names do.
Quick check: does each key page have (a) an image with `ImageObject` schema, (b) a video or audio asset with a text transcript on the same URL, and (c) a one-sentence lead answer that reads cleanly out loud? If not, it's text-only and competes for a shrinking share of AI search.
Visibility is no longer limited to Google. AI systems aggregate information from across the web.
- Platform presence — Be present on YouTube, Reddit, TikTok, LinkedIn, and industry forums. AI crawls these for information.
- Social proof — Reviews, mentions, and discussions on third-party platforms increase your AI citation likelihood.
- Consistent identity — Use the same brand name, descriptions, and key messages across all platforms for entity recognition.
- Community engagement — Active participation in relevant communities (Reddit, Stack Overflow, industry forums) builds mention-based authority.
Classic SEO KPIs (rank, CTR, sessions) don't capture AI visibility. GEO needs its own measurement setup — built around citations and brand mentions in AI-generated answers, not SERP positions.
Setup: the question catalog
- 30–50 strategic questions — Define the queries your target users actually ask AI tools about your topic. Mix transactional, informational, and comparison questions.
- Test across engines — Run the catalog regularly against ChatGPT, Google Gemini, Perplexity, Claude, and (where relevant) Bing Copilot. Each engine weights sources differently.
- Track over time — Re-run the catalog monthly. Capture answers verbatim — wording shifts reveal how AI perception of your brand is evolving.
Quantitative KPIs
- Brand mention rate — % of catalog questions where your brand appears in the answer (named, not just linked).
- Citation count & quality — How often your domain is cited as a source, and on which questions. Citations on high-intent queries are worth more than on broad informational ones.
- Share of Voice — Your mention frequency vs. direct competitors across the catalog.
- Brand search trend — Branded queries in Google Search Console / Trends. AI exposure typically lifts branded search before it lifts referral traffic.
- AI referral traffic — Sessions from `chat.openai.com`, `perplexity.ai`, `gemini.google.com`, `copilot.microsoft.com`, etc. Filter by referrer in your analytics.
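Two of the quantitative KPIs above, brand mention rate and Share of Voice, reduce to simple counts over the question catalog. The sketch below assumes each AI answer has already been reviewed and reduced to the set of brands it names; the `Answer` record, the helper functions, and the brand names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    question: str
    brands_mentioned: set[str]  # brands named in the AI answer text


def brand_mention_rate(answers: list[Answer], brand: str) -> float:
    """Share of catalog questions whose answer names the brand."""
    if not answers:
        return 0.0
    hits = sum(1 for answer in answers if brand in answer.brands_mentioned)
    return hits / len(answers)


def share_of_voice(answers: list[Answer], brand: str, competitors: list[str]) -> float:
    """Brand mentions divided by mentions of the brand plus its tracked competitors."""
    tracked = [brand, *competitors]
    totals = {
        name: sum(1 for answer in answers if name in answer.brands_mentioned)
        for name in tracked
    }
    all_mentions = sum(totals.values())
    return totals[brand] / all_mentions if all_mentions else 0.0


# Hypothetical catalog run: four questions, two brands.
catalog = [
    Answer("best seo analyzer?", {"Acme", "Rival"}),
    Answer("how to check redirects?", {"Acme"}),
    Answer("what is geo?", set()),
    Answer("free ssl checker?", set()),
]

print(brand_mention_rate(catalog, "Acme"))
print(share_of_voice(catalog, "Acme", ["Rival"]))
```

Storing one such run per month and per engine gives the time series the tracking step calls for.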
Qualitative signals
- Accuracy — Does the AI describe your product / service correctly? Misattributions are red flags worth fixing at the source.
- Positioning — Is the brand's USP mentioned, or is it described as a generic alternative?
- Context — Is your brand cited in the answer body, or only as a footnote link? Inline mentions outperform link-only citations.
Search engines are evolving into AI agents that autonomously research, compare, and prepare decisions on behalf of users.
Google's AI Optimization Guide describes browser agents that "access your website to gather data" by analysing three layers in parallel:
- Visual rendering — The rendered page as a user would see it. Hidden critical content (behind tabs, lazy-loaded after long scrolls) may be missed.
- DOM structure — The full HTML tree. Semantic elements (`<article>`, `<nav>`, `<main>`, `<header>`) help agents identify content vs. chrome.
- Accessibility tree — The same tree screen readers use. ARIA labels, proper landmark roles, and meaningful `alt` text become agent-discoverability signals, not just accessibility compliance.

Watch for emerging protocols like Universal Commerce Protocol (UCP).
Practical takeaway: the Accessibility tab of this analyzer is no longer just a WCAG checklist — its findings directly affect how well autonomous agents can extract structured information from your site.
- Agent Optimization (AO) — A new discipline emerging beyond GEO. Optimizing for autonomous agents that evaluate and select sources without human intervention.
- Machine-readable data — APIs, structured data feeds, and clean data exports will become essential for agent accessibility.
- Trust signals — Verifiable credentials, consistent track records, and transparent provenance will be mandatory for agent trust.
- Decision-ready content — Content must provide clear comparisons, recommendations, and actionable conclusions that agents can directly use.
- Early mover advantage — Sites that structure data and build authority now will have a significant lead when AI agents become mainstream.
As AI crawlers increasingly access web content, publishers need a standardized way to declare how automated systems may use their content. Content Signals is Cloudflare's implementation of a new Content-Signal directive for robots.txt that goes beyond simple Allow/Disallow rules.
The three Content-Signal categories
The Content-Signal directive uses three categories, each set to yes or no:
| Signal | Meaning |
|---|---|
| `ai-train` | Training or fine-tuning AI models on your content. |
| `search` | Building a search index and providing search results (hyperlinks and short excerpts). Does not include AI-generated search summaries. |
| `ai-input` | Inputting content into AI models in real-time (e.g. retrieval augmented generation, grounding, or AI-generated search answers). |
Example usage in robots.txt:
```
User-Agent: *
Content-Signal: ai-train=no, search=yes, ai-input=no
Allow: /
```
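To audit an existing robots.txt, the directive shown above can be parsed into a small mapping. This is a sketch that assumes the comma-separated `key=yes|no` syntax of the example; the `parse_content_signal` helper is a hypothetical name and does not cover per-path or per-bot variations.

```python
def parse_content_signal(line: str) -> dict[str, bool]:
    """Parse 'Content-Signal: ai-train=no, search=yes' into {'ai-train': False, ...}."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError(f"not a Content-Signal line: {line!r}")
    signals = {}
    for item in value.split(","):
        key, _, flag = item.strip().partition("=")
        if key:
            # Anything other than an explicit "yes" is treated as "no" here.
            signals[key.strip().lower()] = flag.strip().lower() == "yes"
    return signals


signals = parse_content_signal("Content-Signal: ai-train=no, search=yes, ai-input=no")
print(signals)
```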
Four default policies
- Disallow All — Most restrictive. No access for any purpose. May cause search engines to exclude your site.
- Allow Search Only — Permits search indexing and results, but no AI training or AI input.
- Allow Search & AI Input — Permits search and real-time AI usage (e.g. AI search answers), but no model training.
- Allow All — Permits search, AI input, and AI training.
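The four default policies can be read as fixed combinations of the three signals. The mapping below is an interpretation of the descriptions above, not the generator's actual output; in particular, treating "Disallow All" as all three signals set to `no` is an assumption, and the real policy may also emit a `Disallow: /` rule.

```python
# Assumed mapping from policy name to signal values, derived from the list above.
POLICIES = {
    "Disallow All":            {"search": False, "ai-input": False, "ai-train": False},
    "Allow Search Only":       {"search": True,  "ai-input": False, "ai-train": False},
    "Allow Search & AI Input": {"search": True,  "ai-input": True,  "ai-train": False},
    "Allow All":               {"search": True,  "ai-input": True,  "ai-train": True},
}


def content_signal_line(policy: str) -> str:
    """Render the Content-Signal directive for one of the four default policies."""
    signals = POLICIES[policy]
    parts = [f"{name}={'yes' if allowed else 'no'}" for name, allowed in signals.items()]
    return "Content-Signal: " + ", ".join(parts)


for policy in POLICIES:
    print(f"{policy}: {content_signal_line(policy)}")
```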
Advanced: per-path and per-bot rules
Content Signals support path-specific rules (e.g. allow /blog/ for search only, but /about for everything) and user-agent targeting (e.g. different rules for googlebot vs. bingbot).
Why does this matter for GEO?
- Strategic visibility — Allow `ai-input` (AI citation in search answers) while blocking `ai-train` — maximize visibility without giving away training data.
- Content control — Explicitly declare your AI preferences instead of relying on AI companies to respect informal requests.
- EU rights reservation — Content Signals include an explicit reservation of rights under Article 4 of EU Directive 2019/790 (Copyright in the Digital Single Market).
- Early adoption — As more AI systems honor Content Signals, early adopters will have established clear permission records.
Generate your Content Signals
Use the Content Signals Generator to create your robots.txt directives. Choose one of the four default policies, customize per category, and copy the generated output directly into your robots.txt file.