Example output: examples/seo-drift-wix-com-20260514/compare/DRIFT-REPORT.md
SEO Drift
Git for SEO. Capture a snapshot of a domain or URL's SEO state ("baseline"), then on later runs diff the current state against the baseline and surface regressions. Catches the things that get worse silently after a deploy, redesign, or content cull.
Acknowledgements: drift-as-an-SEO-skill framework originated in claude-seo by AgriciDaniel (with the original concept credited to Dan Colta, Pro Hub Challenge). MIT-licensed both directions; this implementation is independent but the framing is theirs.
Prerequisites
- SE Ranking MCP server connected.
- Claude's `WebFetch` tool available (for URL-mode page fingerprinting).
- User provides: target domain or URL, plus a subcommand (`baseline`, `compare`, `history`).
Optional flags
| Flag | Mode | Effect |
|---|---|---|
| `--no-firecrawl` | baseline, compare | Skip Firecrawl-based `<head>` + JSON-LD capture even when Firecrawl is installed (saves credits at the cost of canonical / robots / `og:*` / JSON-LD diff coverage). |
| `--skip-cwv` | baseline, compare | Skip the Google CrUX capture (step 4b) even when `google-api.json` is configured. Useful when you only care about content/structural drift, or when CrUX rate-limit concerns outweigh CWV coverage. Mirrors theirs at `seo-drift/SKILL.md:107, 131`. |
| `--baseline-id <n>` | compare | Compare against a specific baseline by ID rather than the most recent. |
| `--limit <n>` | history | Cap the number of historical entries shown. |
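
To make the flag-to-mode mapping concrete, here is a minimal sketch of how the table could be modelled with Python's standard `argparse`. This is illustrative only: the skill is driven through Claude rather than a standalone CLI, so the `seo-drift` program name and the `build_parser` helper are hypothetical, and treating `--baseline-id` and `--limit` as integers is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the subcommands and flags in the table."""
    parser = argparse.ArgumentParser(prog="seo-drift")
    sub = parser.add_subparsers(dest="mode", required=True)

    # Flags shared by the two capturing modes (baseline and compare).
    capture = argparse.ArgumentParser(add_help=False)
    capture.add_argument("--no-firecrawl", action="store_true")
    capture.add_argument("--skip-cwv", action="store_true")

    baseline = sub.add_parser("baseline", parents=[capture])
    baseline.add_argument("target")

    compare = sub.add_parser("compare", parents=[capture])
    compare.add_argument("target")
    # Assumption: baseline IDs are integers.
    compare.add_argument("--baseline-id", type=int, default=None)

    history = sub.add_parser("history")
    history.add_argument("target")
    history.add_argument("--limit", type=int, default=None)
    return parser


args = build_parser().parse_args(
    ["compare", "example.com", "--skip-cwv", "--baseline-id", "3"]
)
print(args.mode, args.target, args.skip_cwv, args.no_firecrawl, args.baseline_id)
```

Because each flag is attached only to the modes listed for it, passing `--skip-cwv` to `history` is rejected at parse time instead of being silently ignored.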
Subcommands
baseline <target>
Capture the current SEO state and write it to a snapshot file. No diff produced.
compare <target>
Load the most recent baseline for the target. Capture the current state. Diff. Produce DRIFT-REPORT.md.
history <target>
List all stored baselines for the target with their dates and key metrics (DA, traffic, keyword count). No diff produced.
Process
baseline mode
- Validate target. Determine if domain or URL. Domain = `example.com`; URL = anything starting with `http(s)://`.
  - SSRF protection (URL mode). If target is a URL, validate via `python3 -c "from scripts.google_auth import validate_url; import sys; sys.exit(0 if validate_url('{target}') else 1)"` (or import `validate_url` directly in any helper script). Reject loopback (127.0.0.1, ::1, localhost), private IP ranges (10/8, 172.16/12, 192.168/16), link-local (169.254/16), and Google metadata endpoints. If validation fails, abort with a clear error and don't proceed to fetch: feeding an unvalidated URL into Firecrawl / WebFetch / Google APIs would risk SSRF against internal services. Mirrors theirs at `seo-drift/SKILL.md:97`.
- Preflight. See `skills/seo-firecrawl/references/preflight.md` for the canonical 3-stage preflight (credit balance, Firecrawl availability, Google APIs). Skill-specific notes:
  - Estimated SE Ranking cost for this skill: a typical baseline costs ~10–20 SE Ranking credits, depending on whether step 4 (URL-mode page snapshot) is included.
  - Firecrawl: optional with WebFetch fallback, +1 Firecrawl credit per URL if available (URL mode). When available, the snapshot also captures `<head>` + JSON-LD content so canonical / robots / `og:*` / JSON-LD changes are detectable on diff. Without it the snapshot is partial. Pass `--no-firecrawl` to skip Firecrawl even when available (saves credits at the cost of diff coverage).
  - Google APIs: tier 0 unlocks CrUX p75 LCP/INP/CLS capture (origin in domain mode, URL in URL mode); tier 1 (URL mode only) additionally captures URL Inspection state (`indexStatusVerdict`, `googleCanonical`, `lastCrawlTime`) so subsequent compares can flag field-data and indexation drift. See `skills/seo-google/references/cross-skill-integration.md` § "seo-drift" for the full recipe.
- Domain snapshot (always):
  - `DATA_getDomainOverviewWorldwide`: DA, traffic, organic + paid keyword counts.
  - `DATA_getDomainKeywords`: top 100 organic keywords with positions.
  - `DATA_getBacklinksSummary`: backlinks total, referring domains total.
  - `DATA_getBacklinksRefDomains`: top 20 referring domains with authority.
- Page snapshot (if target is a URL): `WebFetch` (always) + `mcp__firecrawl-mcp__firecrawl_scrape` (when available)
  - WebFetch (free): extract `<title>`, all `<h1..h6>`, lang, word count, internal-link count, image count, body markdown for prose-level diff.
  - Firecrawl (1 Firecrawl credit per URL): recovers `<head>` and `<script>` content WebFetch strips:
    - From `metadata`: canonical URL, robots meta, og:title, og:description, og:image, twitter:card.
    - From returned `html`: every `<script type="application/ld+json">` block. Capture both the detected `@type` values and a hash of the full block content (so any schema-content change is detected on diff, not just type-list changes).
  - If Firecrawl unavailable (or `--no-firecrawl` passed): only WebFetch fields enter the fingerprint. `BASELINE.md` notes: `Snapshot fields recovered via WebFetch only — canonical, robots, og:*, twitter:*, and JSON-LD changes will not be detected on subsequent compares. Install Firecrawl for full coverage.`
  - Compute a fingerprint hash of the captured fields.
  - Also capture page authority: `DATA_getPageAuthority`.
- 4b. Google field-data snapshot (only if `google-api.json` is present AND `--skip-cwv` not set)
  - Tier 0 (always when configured): `python3 scripts/pagespeed_check.py "{target}" --crux-only --json` (URL mode) or `python3 scripts/pagespeed_check.py "https://{domain}" --crux-only --json` (domain mode, origin-level CrUX). Store the resulting p75 LCP / INP / CLS / FCP / TTFB and the source label ("URL" or "origin").
  - Tier 0 (always when configured): `python3 scripts/crux_history.py "{target_or_origin}" --json` for the 25-week trend window snapshot; store as `crux_history_baseline`. Subsequent compares can detect drift against the most recent week.
  - Tier 1 (URL mode only): `python3 scripts/gsc_inspect.py "{target_url}" --site-url "{config.default_property}" --json`. Store `indexStatusVerdict`, `coverageState`, `googleCanonical`, `userCanonical`, `lastCrawlTime`.
  - If `--skip-cwv` was passed, skip this step entirely and store `null` for `cwv` / `crux_history` fields. The compare-mode rules then surface "Field-data drift: skipped — `--skip-cwv` flag passed at baseline."
  - If CrUX returns insufficient data, store `null` for the affected metrics and continue.
- Write snapshot file `seo-drift-{target-slug}-{YYYYMMDD}/snapshot.json`.
- Update index `seo-drift-{target-slug}/baselines.json`: append a `{date, snapshot_path}` entry.
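
The fingerprint, snapshot-write, and index-update steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the helper names (`sha256_hex`, `fingerprint`, `write_baseline`), the choice of SHA-256 over canonical JSON, the field names in `page`, and the assumption that `baselines.json` holds a JSON array of entries are all hypothetical. Only the two file paths and the `{date, snapshot_path}` entry shape come from the steps above.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def sha256_hex(text: str) -> str:
    """Hex SHA-256 of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fingerprint(fields: dict) -> str:
    """Hash captured page fields; sorted keys make the hash order-independent."""
    canonical = json.dumps(
        fields, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return sha256_hex(canonical)


def write_baseline(root: Path, slug: str, snapshot: dict, today: date) -> Path:
    """Write snapshot.json, then append {date, snapshot_path} to baselines.json."""
    stamp = today.strftime("%Y%m%d")
    snap_path = root / f"seo-drift-{slug}-{stamp}" / "snapshot.json"
    snap_path.parent.mkdir(parents=True, exist_ok=True)
    snap_path.write_text(
        json.dumps(snapshot, indent=2, sort_keys=True), encoding="utf-8"
    )

    # Assumption: the index is a JSON array, created on first use.
    index_path = root / f"seo-drift-{slug}" / "baselines.json"
    index_path.parent.mkdir(parents=True, exist_ok=True)
    if index_path.exists():
        entries = json.loads(index_path.read_text(encoding="utf-8"))
    else:
        entries = []
    entries.append({"date": today.isoformat(), "snapshot_path": str(snap_path)})
    index_path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return snap_path


# Illustrative page fields; each JSON-LD block stores its type and a content hash.
json_ld_blocks = ['{"@type": "Organization", "name": "Example"}']
page = {
    "title": "Example",
    "h1": ["Welcome"],
    "canonical": "https://example.com/",
    "json_ld": [
        {"type": json.loads(block).get("@type"), "sha256": sha256_hex(block)}
        for block in json_ld_blocks
    ],
}
page_fingerprint = fingerprint(page)
```

Hashing the full JSON-LD block alongside its type means an edit inside a schema block changes the fingerprint even when the list of types stays the same, which is the behaviour the page-snapshot step asks for.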
compare mode
- Validate target + locate latest baseline in `seo-drift-{target-slug}/baselines.json`.
  - SSRF protection (URL mode). Same `validate_url()` check as baseline mode. Refuses to fetch private/loopback/metadata addresses.
  - If no baseline exists, fall through to baseline mode and tell the user to come back later.
- Capture current state (same data as baseline mode).
- Diff each metric using `references/drift-thresholds.md`:
  - Domain authority: ±5 = yellow, ±10 = red.
  - Estimated organic traffic: ±20% = yellow, ±50% = red.
  - Organic keyword count: ±10% = yellow, ±30% = red.
  - Top-3 keyword count: ±15% = yellow, ±40% = red.
  - Top-100 keyword churn: any high-volume drop = red.
  - Net referring domains: -5 to -20 = yellow, <-20 = red.
  - Page-level (URL mode): any change to canonical / robots / lang / H1 = red; title or meta description change = yellow; schema types added/removed = yellow; `og:*` / `twitter:*` changes = yellow.
- Firecrawl-dependent diff caveat: canonical / robots / og:* / twitter:* / JSON-LD diffs require both baseline and current snapshots to have been captured with Firecrawl. If either snapshot was WebFetch-only, those fields surfac