Example output: examples/seo-hreflang-airbnb-com-20260514/HREFLANG-REPORT.md
Hreflang Audit
Adapted from AgriciDaniel/claude-seo's seo-hreflang skill (MIT). Concept and validation rules originate there; this implementation is rebuilt against our backend (SE Ranking MCP + Firecrawl + WebFetch + Google APIs via seo-google).
Validate hreflang implementations on a multi-language or multi-region site. Surface SE Ranking audit-level hreflang issues, parse a sample of pages for the actual <link rel="alternate" hreflang="…"> tags they emit, cross-check the sitemap, and produce one of three verdicts — PASS, NEEDS-FIX, or BROKEN — with a top-fixes table anchored in objective signals.
Prerequisites
- SE Ranking MCP server connected.
- Claude's WebFetch tool available (fallback when Firecrawl is unavailable).
- User provides: a target domain (e.g. example.com). Optional: explicit list of representative pages to inventory; explicit sitemap URL if not at /sitemap.xml.
- Predecessor (recommended): seo-technical-audit or seo-sitemap already run on this domain. Without an existing audit, the skill creates one (which costs significantly more credits).
Process
1. Validate target & preflight. See skills/seo-firecrawl/references/preflight.md for the canonical 3-stage preflight (credit balance, Firecrawl availability, Google APIs). Skill-specific notes:
   - Normalise domain (strip protocol, trailing slash) before continuing.
   - Estimated cost for this skill: ~5–10 SE Ranking credits when re-using an existing audit, plus up to ~6 Firecrawl credits for the per-URL inventory.
   - Firecrawl: optional with WebFetch fallback, ~6 Firecrawl credits if available (hard cap). When available, step 4 (per-URL hreflang inventory) runs on homepage + 5 representative pages with formats: ["rawHtml"]. Without Firecrawl, step 4 falls back to WebFetch; coverage is degraded because WebFetch returns markdown only and silently strips <link rel="alternate"> tags from <head>. Pass --no-firecrawl to force WebFetch even when Firecrawl is available.
   - Google APIs: tier 1 (GSC) unlocks step 6 (GSC verification of hreflang-targeted alternates). See skills/seo-google/references/cross-skill-integration.md for the full enrichment contract.
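The domain normalisation in the preflight notes could look roughly like the sketch below. The function name `normalise_domain` is hypothetical, and dropping the path and port as well as the protocol and trailing slash is an assumption; the skill itself only specifies the latter two.

```python
from urllib.parse import urlsplit


def normalise_domain(raw: str) -> str:
    """Return a bare lowercase host (assumption: path and port are dropped too)."""
    text = raw.strip()
    # urlsplit only fills in the hostname when a scheme or a leading '//' is present.
    if "//" not in text:
        text = "//" + text
    host = urlsplit(text).hostname or ""
    # Remove a trailing dot from fully qualified names such as "example.com."
    return host.rstrip(".")


print(normalise_domain("https://Example.com/"))
```

Whether a leading `www.` should also be stripped is left open here, since that depends on how the SE Ranking audit was registered.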
2. Find or refresh the audit: DATA_listAudits → DATA_getAuditStatus
   - List audits for the domain. If a recent audit exists (<30 days), use it.
   - If older than 30 days, run DATA_recheckAudit and wait for done.
   - If none exists, ask the user before creating one with DATA_createStandardAudit, because it consumes credits.
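The three branches above reduce to a small decision function. This is a sketch only: `choose_audit_action` and its return labels are hypothetical names, and treating an audit that is exactly 30 days old as stale is an assumption, since the step does not define the boundary.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_AUDIT_AGE = timedelta(days=30)


def choose_audit_action(last_audit_at: Optional[datetime], now: datetime) -> str:
    """Map the age of the newest audit to one of three hypothetical action labels."""
    if last_audit_at is None:
        # No audit yet: creating one costs credits, so the user must confirm first.
        return "ask_before_create"
    if now - last_audit_at < MAX_AUDIT_AGE:
        return "reuse"
    # Assumption: an audit aged exactly 30 days is rechecked rather than reused.
    return "recheck"


now = datetime(2026, 5, 14, tzinfo=timezone.utc)
print(choose_audit_action(now - timedelta(days=10), now))
```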
3. Pull SE Ranking's hreflang findings: DATA_getAuditReport + DATA_getAuditPagesByIssue
   - SE Ranking's audit catches hreflang errors directly; surface them first as ground truth.
   - From DATA_getAuditReport, extract every issue with code or category matching hreflang (typical codes: hreflang_no_return_tag, hreflang_invalid_lang_code, hreflang_conflict, hreflang_missing_x_default, hreflang_canonical_mismatch, hreflang_no_self_reference).
   - For each significant hreflang issue (count ≥ 1), call DATA_getAuditPagesByIssue to enumerate the affected URLs.
   - Persist to 01-audit-hreflang-issues.md and feed into hreflang-issues.csv.
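The filtering described in this step might be sketched as follows. The report shape is an assumption made purely for illustration (a list of dicts with `code`, `category`, and `count` keys); the real DATA_getAuditReport payload may differ, and `select_hreflang_issues` is a hypothetical helper name.

```python
def select_hreflang_issues(issues):
    """Keep issues whose code or category mentions hreflang and that affect >= 1 page."""
    selected = []
    for issue in issues:
        # Assumed keys: 'code', 'category', 'count' (not confirmed by the SE Ranking API).
        label = f"{issue.get('code', '')} {issue.get('category', '')}".lower()
        if "hreflang" in label and issue.get("count", 0) >= 1:
            selected.append(issue)
    # Largest issue first, so the biggest problems lead the report.
    return sorted(selected, key=lambda item: item["count"], reverse=True)


sample = [
    {"code": "hreflang_no_return_tag", "category": "localization", "count": 12},
    {"code": "title_too_long", "category": "content", "count": 40},
    {"code": "hreflang_conflict", "category": "localization", "count": 0},
]
print([issue["code"] for issue in select_hreflang_issues(sample)])
```

Each surviving issue code would then be passed to DATA_getAuditPagesByIssue to list the affected URLs.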
4. Per-URL hreflang tag inventory: mcp__firecrawl-mcp__firecrawl_scrape (preferred) / WebFetch (fallback)
   - Sample selection: homepage + up to 5 representative pages from DATA_getDomainPages (sort by traffic descending; bias toward pages on different language paths if the URL structure exposes them: /en/, /fr/, /de/, etc.).
   - Firecrawl path (1 credit per URL, ~6 total): call firecrawl_scrape(url=..., formats=["rawHtml"]). Pin rawHtml, because the default html post-processing strips <link rel="alternate"> on many sites. Parse every <link rel="alternate" hreflang="…" href="…"> from the <head>. Capture: source URL, hreflang attribute, href, and whether it's self-referencing.
   - WebFetch fallback (no Firecrawl): try fetching each URL and extracting hreflang from the markdown response. WebFetch frequently returns markdown that has stripped <head> link tags, so this path will under-report. Note in HREFLANG-REPORT.md: "Per-URL inventory: degraded coverage — Firecrawl not installed; some hreflang tags may be missed."
   - Apply validation rules (see references/validation-rules.md for the full list):
     - Self-referencing tag: the page's own URL must appear in its own hreflang set.
     - Return tags: every alternate link must reciprocate. If page A lists B as fr, page B must list A as en (or whichever code A uses).
     - x-default: at least one alternate per set must use hreflang="x-default".
     - Language-region code validation: every value must be a valid ISO 639-1 language (optionally followed by - and an ISO 3166-1 Alpha-2 region). Common errors caught: eng (use en), jp (use ja), en-uk (use en-GB), es-LA (LA is the ISO code for Laos, not Latin America).
     - Conflict detection: the same hreflang value (e.g. de-DE) appearing on multiple distinct URLs within one set is a conflict, and Google ignores conflicting sets.
     - Canonical alignment: if the page has <link rel="canonical">, it must match the page's own URL (or its self-referencing hreflang URL). Hreflang on a non-canonical page is silently ignored by Google.
     - Protocol consistency: all URLs in a set must share the same scheme (HTTPS preferred).
   - Persist to 02-per-url-hreflang.md and append findings to hreflang-issues.csv.
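A minimal sketch of the single-page part of this inventory, using only the Python standard library. It covers the self-reference, x-default, code, conflict, and protocol rules for one page; return-tag and canonical checks need a second fetch and are left out. `HreflangCollector`, `inventory`, and the finding strings are hypothetical names. The regular expression checks only the shape of a code, so a well-formed but unregistered value still passes unless it is in the small `KNOWN_MISTAKES` table; a full check would need the ISO 639-1 and ISO 3166-1 code lists.

```python
import re
from html.parser import HTMLParser

# Shape only: 2-letter language, optional 4-letter script, optional 2-letter region.
HREFLANG_SHAPE = re.compile(r"^[a-z]{2}(-[a-z]{4})?(-[a-z]{2})?$", re.IGNORECASE)
# Well-formed but wrong values named in the rules above, with their corrections.
KNOWN_MISTAKES = {"jp": "ja", "en-uk": "en-GB"}


class HreflangCollector(HTMLParser):
    """Collect (hreflang, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "alternate" in rel and attr.get("hreflang") and attr.get("href"):
            self.alternates.append((attr["hreflang"], attr["href"]))


def inventory(page_url, raw_html):
    """Return the alternates found in raw_html plus a list of rule violations."""
    parser = HreflangCollector()
    parser.feed(raw_html)
    alternates = parser.alternates
    findings = []
    if not any(href.rstrip("/") == page_url.rstrip("/") for _, href in alternates):
        findings.append("missing self-reference")
    if not any(code.lower() == "x-default" for code, _ in alternates):
        findings.append("missing x-default")
    seen = {}
    for code, href in alternates:
        key = code.lower()
        if key in KNOWN_MISTAKES:
            findings.append(f"invalid code: {code} (use {KNOWN_MISTAKES[key]})")
        elif key != "x-default" and not HREFLANG_SHAPE.match(code):
            findings.append(f"malformed code: {code}")
        if key in seen and seen[key] != href:
            findings.append(f"conflict: {code} maps to multiple URLs")
        seen.setdefault(key, href)
    if len({href.split(":", 1)[0].lower() for _, href in alternates}) > 1:
        findings.append("mixed protocols")
    return alternates, findings
```

Feeding it the `rawHtml` string returned for each sampled URL yields rows that can be appended to hreflang-issues.csv. The same function run on WebFetch output would usually return an empty list, which matches the degraded-coverage note above.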
5. Sitemap-level hreflang (defer to seo-sitemap where appropriate)
   - If the user's domain uses sitemap-based hreflang (<xhtml:link rel="alternate" …> inside the sitemap), this skill checks structure and consistency only. Full sitemap analysis (orphans, missing pages, broken entries) is seo-sitemap's job; recommend it explicitly if a sitemap-vs-audit diff is in scope.
   - Fetch the sitemap. Try https://{domain}/sitemap.xml; if 404, fetch /robots.txt and find Sitemap: directives. For sitemap-of-sitemaps, recursively fetch each child.
   - Validate hreflang within the sitemap:
     - Does the sitemap use the xmlns:xhtml="http://www.w3.org/1999/xhtml" namespace? Required for hreflang in sitemaps.
     - Does each <url> entry that has hreflang alternates include itself in the alternate set (self-reference)?
     - Does every alternate listed in one <url> entry reciprocate as its own <url> entry with the same alternate set (return tags)?
     - Are language-region codes valid (apply same rules as step 4)?
   - Cross-check against per-URL inventory (step 4): if a sample URL's HTML lists 4 hreflang alternates but the sitemap entry for that URL lists 6, that mismatch is a conflict. Google may pick either, and inconsistency degrades the signal.
   - Persist to 03-sitemap-hreflang.md and append findings to hreflang-issues.csv.
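The self-reference and return-tag questions above can be checked with the standard library's ElementTree, roughly as sketched here. `sitemap_alternates` and `check_sitemap` are hypothetical names, the sketch handles a single already-fetched sitemap file (no index recursion), and it compares URLs as exact strings.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def sitemap_alternates(sitemap_xml: str) -> dict:
    """Map each <loc> to its {hreflang: href} alternates."""
    # An <xhtml:link> without the xmlns:xhtml declaration makes fromstring raise
    # ParseError, which doubles as the namespace check.
    root = ET.fromstring(sitemap_xml)
    entries = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        links = url.findall("xhtml:link[@rel='alternate']", NS)
        entries[loc] = {
            link.get("hreflang"): link.get("href")
            for link in links
            if link.get("hreflang")
        }
    return entries


def check_sitemap(entries: dict) -> list:
    """Report missing self-references and alternates that do not reciprocate."""
    findings = []
    for loc, alts in entries.items():
        if not alts:
            continue  # entry carries no hreflang alternates, nothing to check
        if loc not in alts.values():
            findings.append(f"{loc}: missing self-reference")
        for href in alts.values():
            if href == loc:
                continue
            if href not in entries:
                findings.append(f"{loc}: alternate {href} has no <url> entry")
            elif entries[href] != alts:
                findings.append(f"{loc}: {href} does not return the same alternate set")
    return findings
```

The dictionary returned by `sitemap_alternates` is also a convenient input for the cross-check against the per-URL inventory: comparing its set of hreflang keys for a sampled URL with the tags parsed from that page's HTML exposes the 4-versus-6 kind of mismatch.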
6. GSC verification of hreflang-targeted alternates (only if google-api.json is present, tier ≥ 1)
   - For each unique domain that appears as an href target in the hreflang sets (e.g. example.com, example.de, example.fr), confirm GSC verification: python3 scripts/gsc_query.py --property "{property}" --json (a status-only check; just confirm the property responds without PROPERTY_NOT_VERIFIED).
   - Why this matters: Google explicitly recommends verifying every domain that participates in a cross-domain hreflang setup. If example.de is listed as an alternate but isn't verified in this account, the hreflang signal is weakened and you can't see how Google interprets it.
   - Surface in HREFLANG-REPORT.md as a section "## GSC verification of hreflang targets" with one row per target domain: verified / not verified / not configured.
   - If property not verified for a target domain: list it as a fix at