Example output: examples/seo-geo-notion-share-pages-20260514/GEO.md
Page-Level GEO (Generative Engine Optimization)
For one URL, surface its AI-search citation footprint and recommend the page-level changes that would improve citability across AI Overview, Perplexity, ChatGPT, and other LLM-powered search engines. This is a page-level diagnosis, distinct from the domain-level, brand-vs-brand share-of-voice analysis.
Prerequisites
- SE Ranking MCP server connected.
- Claude's `WebFetch` tool available.
- User provides: a target URL. Optional: target country (default `us`), specific keywords to focus on (defaults: the URL's top-5 traffic-weighted keywords from SE Ranking).
Process
1. Validate target & preflight. See `skills/seo-firecrawl/references/preflight.md` for the canonical 3-stage preflight (credit balance, Firecrawl availability, Google APIs). Skill-specific notes:
   - Confirm URL is fetchable before continuing.
   - Estimated SE Ranking cost for this skill: ~10–20 credits typical (URL keyword footprint, AIO presence + leaderboard for top 5 keywords).
   - Firecrawl: optional, ~3 Firecrawl credits if available. When available, the JSON-LD parse in step 7 and the AI-protocol-files step 8 use it. Without it, those steps emit `(skipped — Firecrawl not installed; install via extensions/firecrawl/install.sh)` notes in `GEO.md` rather than failing the run. Pass `--no-firecrawl` to skip Firecrawl even when available (saves credits).
   - Google APIs: not used.
2. URL keyword footprint: `DATA_getUrlOverviewWorldwide` and `DATA_getDomainKeywords` (URL-filtered)
   - Pull URL's overview (keywords, traffic).
   - Pull all keywords the URL ranks for. Sort by traffic-weighted score.
   - Take the top 5 as the GEO investigation set (or use user-supplied keywords).
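The selection rule in this step can be sketched as follows. This is a minimal sketch under stated assumptions, not the skill's implementation: the row fields `keyword` and `traffic` and the helper name `pick_investigation_set` are hypothetical, and the traffic-weighted score is approximated by the traffic estimate itself.

```python
def pick_investigation_set(rows, user_keywords=None, limit=5):
    """Choose the keywords for the GEO investigation set.

    rows: dicts with hypothetical fields "keyword" and "traffic"
          (assumed shape; the real SE Ranking response may differ).
    user_keywords: optional user-supplied keywords, which take precedence.
    """
    if user_keywords:
        return list(user_keywords)
    # Highest estimated traffic first; rows without a traffic value sort last.
    ranked = sorted(rows, key=lambda r: r.get("traffic") or 0, reverse=True)
    return [r["keyword"] for r in ranked[:limit]]


# Illustrative data only, not real SE Ranking output.
rows = [
    {"keyword": "kw-a", "traffic": 120},
    {"keyword": "kw-b", "traffic": 900},
    {"keyword": "kw-c", "traffic": None},
    {"keyword": "kw-d", "traffic": 45},
    {"keyword": "kw-e", "traffic": 300},
    {"keyword": "kw-f", "traffic": 10},
]
investigation_set = pick_investigation_set(rows)
```

Passing `user_keywords` bypasses the ranking entirely, which mirrors the "or use user-supplied keywords" branch.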
3. AIO presence per keyword: `DATA_getAiOverview`
   - For each keyword, query AIO presence + citation list.
   - Flag: AIO present? Is the candidate URL cited?
   - Capture the AIO answer text; it tells you what passage shape Google's models prefer.
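The two flags, and the citation rate later reported in `GEO.md`, could be derived roughly like this. It is a sketch only: the helper names are hypothetical, the citation list is assumed to be a plain list of URLs (or `None` when no AIO was returned), and the URL matching rule (ignore scheme, `www.`, query string, and trailing slash) is an assumption rather than documented behaviour.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivially different forms compare equal."""
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")


def footprint_row(keyword, aio_citations, candidate_url):
    """aio_citations: list of cited URLs, or None when no AIO is present."""
    present = aio_citations is not None
    cited = present and normalize(candidate_url) in {
        normalize(u) for u in aio_citations
    }
    return {"keyword": keyword, "aio_present": present, "candidate_cited": cited}


def citation_rate(rows):
    """Return (cited, checked), counting only keywords where an AIO is present."""
    checked = [r for r in rows if r["aio_present"]]
    cited = sum(1 for r in checked if r["candidate_cited"])
    return cited, len(checked)


# Illustrative data only.
candidate = "https://example.com/post"
footprint = [
    footprint_row("kw-a", ["https://www.example.com/post/", "https://other.test/x"], candidate),
    footprint_row("kw-b", ["https://other.test/y"], candidate),
    footprint_row("kw-c", None, candidate),
]
rate = citation_rate(footprint)
```

Keywords without an AIO are excluded from the denominator, since the candidate cannot be cited where no AIO exists.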
4. AIO leaderboard per keyword: `DATA_getAiOverviewLeaderboard`
   - Full ranked list of cited sources per AIO query.
   - Identify patterns: domain-level (which sites are consistently cited?), passage-level (what structure?).
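For the domain-level pattern, one simple approach is to count how many keywords each domain is cited under. The sketch below assumes the leaderboards have already been reduced to a `keyword -> list of cited URLs` mapping; that shape and the name `consistent_citers` are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def consistent_citers(leaderboards, min_keywords=2):
    """Return (domain, keyword_count) pairs for domains cited under at
    least `min_keywords` keywords, most frequent first (ties by name)."""
    counts = Counter()
    for urls in leaderboards.values():
        # A set ensures each domain counts once per keyword.
        domains = {urlsplit(u).netloc.lower().removeprefix("www.") for u in urls}
        counts.update(domains)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(domain, n) for domain, n in ranked if n >= min_keywords]


# Illustrative data only.
leaderboards = {
    "kw-a": ["https://a.test/1", "https://www.b.test/1", "https://a.test/2"],
    "kw-b": ["https://a.test/3", "https://b.test/2"],
    "kw-c": ["https://a.test/4", "https://c.test/1"],
}
citers = consistent_citers(leaderboards)
```

Passage-level patterns still need a manual read of the cited pages; this only narrows down which sites to read first.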
5. Page passage-level audit: `WebFetch`
   - Pull the page content (WebFetch returns it as markdown).
   - Identify "passages": paragraphs that could be extracted standalone (TL;DR boxes, definition paragraphs, summary sentences after H2s).
   - For each passage, score citability:
     - Does it express a complete thought in 1–3 sentences?
     - Does it answer a specific question (i.e., the question its parent H2 implies)?
     - Does it include a stat, number, or named entity?
     - Does it carry a clear timestamp or freshness signal?
   - This is the citability layer.
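The four questions can be turned into a rough checklist. The heuristics below are assumptions chosen for illustration, not the skill's scoring rubric: sentences are counted by terminal punctuation, "stat / number" is approximated by the presence of a digit (named entities are not detected), freshness by a four-digit year, and whether the passage answers its heading's question is left to editorial judgement and passed in.

```python
import re

SENTENCE_END = re.compile(r"[.!?](?:\s|$)")
DIGIT = re.compile(r"\d")
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def citability_checks(passage, answers_heading_question):
    """Return one boolean per citability question for a single passage."""
    sentences = len(SENTENCE_END.findall(passage.strip()))
    return {
        "complete_thought": 1 <= sentences <= 3,
        "answers_question": bool(answers_heading_question),
        "has_number": bool(DIGIT.search(passage)),
        "has_freshness_signal": bool(YEAR.search(passage)),
    }


# Illustrative passages only.
strong = "Widget Pro supports 5 seats as of 2024. Larger teams need a paid plan."
weak = "Many things matter here and it depends on various factors"

strong_checks = citability_checks(strong, answers_heading_question=True)
weak_checks = citability_checks(weak, answers_heading_question=False)
strong_score = sum(strong_checks.values())
weak_score = sum(weak_checks.values())
```

The result is a count of satisfied checks out of four; how that maps to the score shown in the report is not specified here.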
6. Compare candidate to cited sources
   - For each AIO query where the candidate is NOT cited, identify the cited sources.
   - WebFetch 2–3 of them.
   - Extract the cited passage (often a snippet from the AIO answer).
   - Compare passage shape: candidate vs cited. Surface specific structural / content / freshness gaps.
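One way to make the comparison concrete is to describe each passage as a set of boolean features and report the features that most cited passages have but the candidate lacks. The feature names, the `passage_gaps` helper, and the majority threshold are all hypothetical choices for this sketch.

```python
def passage_gaps(candidate, cited, threshold=0.5):
    """List features the candidate lacks while more than `threshold`
    of the cited passages have them.

    candidate: dict of feature name -> bool
    cited: list of dicts with the same (assumed) feature names
    """
    if not cited:
        return []
    gaps = []
    for feature, candidate_has_it in candidate.items():
        if candidate_has_it:
            continue
        share = sum(1 for c in cited if c.get(feature)) / len(cited)
        if share > threshold:
            gaps.append(feature)
    return gaps


# Illustrative data only.
candidate_features = {"short_answer": True, "has_number": False, "has_date": False}
cited_features = [
    {"short_answer": True, "has_number": True, "has_date": True},
    {"short_answer": True, "has_number": True, "has_date": False},
    {"short_answer": False, "has_number": True, "has_date": False},
]
gaps = passage_gaps(candidate_features, cited_features)
```

With only 2–3 cited pages per query the shares are coarse, so the output is a prompt for a closer read rather than a verdict.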
7. Schema check: `mcp__firecrawl-mcp__firecrawl_scrape`
   - WebFetch in step 5 returned markdown, so JSON-LD blocks were stripped before parsing. The schema check requires Firecrawl to recover them.
   - If Firecrawl available: scrape the target URL once (1 Firecrawl credit), parse the returned `html` for every `<script type="application/ld+json">` block. Specifically check for: `Article`/`BlogPosting` with valid `author` + `datePublished` + `dateModified`; `FAQPage` if Q&A blocks present; `BreadcrumbList`; `mainEntityOfPage` self-canonical.
   - If Firecrawl unavailable: write `Schema check: skipped — Firecrawl required to parse JSON-LD blocks (WebFetch returns markdown only).` into `evidence/06-schema-check.md`, and mirror the same line in the GEO.md "Schema check" section. Don't infer from markdown; that's the bug this section closes.
   - Schema isn't a direct citation signal but it correlates strongly with citation rates in Google's AIO.
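Parsing the JSON-LD blocks out of scraped HTML might look like the following. This is a sketch using only the Python standard library; the class and function names are hypothetical, and it checks only that the `Article`/`BlogPosting` fields are present, not that their values are valid.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD <script> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed block: ignore it rather than fail the run


def article_fields(html):
    """Report field presence for the first Article/BlogPosting node,
    or return None when the page declares no such node."""
    parser = JsonLdExtractor()
    parser.feed(html)
    for block in parser.blocks:
        top_level = block if isinstance(block, list) else [block]
        for entry in top_level:
            if not isinstance(entry, dict):
                continue
            # A block may wrap its nodes in "@graph".
            for node in entry.get("@graph", [entry]):
                if not isinstance(node, dict):
                    continue
                types = node.get("@type", [])
                types = [types] if isinstance(types, str) else types
                if {"Article", "BlogPosting"} & set(types):
                    return {
                        field: field in node
                        for field in ("author", "datePublished", "dateModified")
                    }
    return None


# Illustrative HTML only.
sample_html = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "BlogPosting",'
    ' "author": {"@type": "Person", "name": "A"},'
    ' "datePublished": "2026-01-02"}'
    "</script></head><body></body></html>"
)
fields = article_fields(sample_html)
```

The same extractor output can be reused for the `FAQPage`, `BreadcrumbList`, and `mainEntityOfPage` checks by inspecting other `@type` values.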
8. AI-protocol files: `mcp__firecrawl-mcp__firecrawl_scrape`
   - If Firecrawl available: scrape `https://{domain}/llms.txt` and `https://{domain}/.well-known/rsl.json` (and the legacy `/RSL.txt` location as a fallback). Cost: 2 Firecrawl credits (one per file).
   - For each file: capture HTTP status (200 / 404 / other), full body if present, and a parsed summary (declared content categories, allow/deny scope, attribution requirements).
   - Surface in `evidence/07-ai-protocol-files.md` and in GEO.md as a new "AI-protocol files" section. These signal the domain's stance on LLM training and citation; present-and-permissive correlates with higher AIO citation rates.
   - If Firecrawl unavailable: write `AI-protocol files: skipped — Firecrawl not installed.` Don't fall back to WebFetch (it would work for plain text but the integration stays uniform; runtime savings are negligible).
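Deriving the file locations from the target URL and bucketing the HTTP status is mechanical. The sketch below covers only that part; the helper names are hypothetical, and the scrape itself is left to Firecrawl.

```python
from urllib.parse import urlsplit

# Paths named in this step; /RSL.txt is the legacy fallback location.
PROTOCOL_PATHS = ("/llms.txt", "/.well-known/rsl.json", "/RSL.txt")


def protocol_file_urls(target_url):
    """Build the AI-protocol file URLs for the target URL's domain."""
    host = urlsplit(target_url).netloc
    return [f"https://{host}{path}" for path in PROTOCOL_PATHS]


def classify_status(status):
    """Bucket an HTTP status code as 200 / 404 / other."""
    if status == 200:
        return "present"
    if status == 404:
        return "absent"
    return "other"


urls = protocol_file_urls("https://www.example.com/blog/post?ref=1")
```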
9. Synthesise `GEO.md`.
Output format
Create a folder seo-geo-{target-slug}-{YYYYMMDD}/ with:
seo-geo-{target-slug}-{YYYYMMDD}/
├── GEO.md (synthesised report + recommendations — primary deliverable)
├── 04-page-passages.md (extracted passages + citability scores — load-bearing reference editors consult)
├── 05-cited-source-comparison.md (gap vs cited sources — load-bearing reference)
└── evidence/
    ├── 01-url-keyword-footprint.md (URL overview + top keywords — raw step output)
    ├── 02-aio-by-keyword.md (AIO presence + citation per keyword)
    ├── 03-leaderboards.md (full leaderboards per keyword)
    ├── 06-schema-check.md (JSON-LD audit for GEO-relevant types — requires Firecrawl)
    └── 07-ai-protocol-files.md (llms.txt + RSL status and content — requires Firecrawl)
Top-level: GEO.md + 04-page-passages.md + 05-cited-source-comparison.md. The other step files preserve raw API/scrape outputs in evidence/ for reproducibility — editors and writers don't open them in the normal flow.
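The folder name can be assembled as shown below. How `{target-slug}` is derived is not specified here, so the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) and the `run_folder` helper are assumptions for illustration.

```python
import re
from datetime import date


def run_folder(target_label, run_date):
    """Build the output folder name seo-geo-{target-slug}-{YYYYMMDD}/."""
    # Assumed slug rule: lowercase, collapse anything non-alphanumeric to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", target_label.lower()).strip("-")
    return f"seo-geo-{slug}-{run_date:%Y%m%d}/"


folder = run_folder("Notion share pages", date(2026, 5, 14))
```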
GEO.md follows this shape:
# GEO Analysis: {URL}
> Snapshot dated {YYYY-MM-DD} · Country: {country} · Keywords analysed: {n}
## Citation footprint
| Keyword | AIO present | Candidate cited | Citers |
|---|---|---|---|
| {keyword 1} | ✓ | ✗ | {3 cited sources} |
| {keyword 2} | ✓ | ✓ | {includes candidate + 2 others} |
| ... |
**Citation rate: {n}/{checked} ({%}) of AIOs where candidate could appear actually cite it.**
## Where the candidate IS cited
- {keyword X} — passage cited: "{passage text}"
- ...
## Where the candidate is NOT cited (and AIO is present)
- {keyword Y} — cited sources tend to share these patterns:
- {pattern 1: short definitive answer in first 100 words}
- {pattern 2: numbered stat with date}
- {pattern 3: schema-marked Article with author bio}
- The candidate is missing: {specific gap}.
## Page passage-level audit
Top-scoring passages on the candidate (by citability score):
1. {passage at H2 "X" — score 8/10. Strong: definitive sentence, named stat. Weak: no date.}
2. ...
Lowest-scoring passages (refresh candidates):
1. {passage at H2 "Y" — score 3/10. Weak: vague generalities, no specific data.}
2. ...
## Schema check
- `Article` (or sub-type) present and valid: {✓/✗ | skipped — Firecrawl required}
- `author` populated with `@type: Person` and `url`: {✓/✗}
- `datePublished` + `dateModified` ISO 8601: {✓/✗}
- `FAQPage` for visible Q&A: {✓/✗/N-A}
- `BreadcrumbList`: {✓/✗}
## AI-protocol files
- `/llms.txt` present: {✓ status 200 / ✗ status {n} / skipped — Firecrawl required}
- `/.well-known/rsl.json` (or `/RSL.txt`) present: {✓ / ✗ / skipped}
- Stance summary: {permissive / restrictive / mixed / unknown — based on declared categories and allow/deny scope}
## Recommendations (top 5 to improve citability)
1. {Specific change — e.g., "A