Deep SEO Audit Skill

Performs technical SEO audits for any URL — from a 2-minute focused check to a full 7-phase deep audit. Adapts to page type and user goal.

Runtime Requirement

This skill depends on Playwright MCP for rendered DOM access, screenshots, schema extraction, OG tag confirmation, and mobile-first evidence capture. Importing this skill does not install Playwright MCP automatically.

If Playwright MCP is unavailable, say so clearly before continuing and limit the audit to static-fetch checks where possible. Do not imply that rendered checks, screenshots, or JS-dependent schema/OG findings were verified when Playwright tools are unavailable.

Recommended install command:

claude mcp add playwright npx @playwright/mcp@latest -- --sandbox

This skill is written for Claude Code and can also be used by other agents that support the same tool surface. If an agent lacks Playwright MCP, WebFetch/WebSearch, or shell access, degrade to static checks only and say which phases could not be verified.

Bundled References

Load as needed:

references/page-type-signals.md — URL patterns, content signals, expected schema by page type. Read during Phase 1 if page type is ambiguous.
references/cwv-thresholds.md — Core Web Vitals thresholds, causes, reporting format. Read during Phase 3.
references/eeat-checklist.md — Full E-E-A-T + YMYL checklist. Read during Phase 6.
references/google-helpful-content.md — Google's official helpful content guidelines (Who/How/Why framework, people-first checklist, search-engine-first warning signs). Read during Phase 6 alongside the E-E-A-T checklist.
assets/report-template.md — Output report template for full audits.
scripts/lighthouse_audit.sh — Runs Lighthouse CLI for CWV. Use in Phase 3.
scripts/generate_docx.js — Converts structured audit JSON to professional .docx report. Uses docx-js. Run in Phase 8.

Phase 0: Scope Check (Always Run First)

Before fetching anything, ask one question — unless the user's message already makes it obvious:

"What's the goal of this audit — a quick check on something specific, or a full technical audit?"

Quick check examples (user mentions one thing — just do that thing, skip the rest):

"Is my schema correct?" → Phase 5 only
"Why is my page slow?" → Phase 3 only (run Lighthouse)
"Check my title and meta description" → Phase 4 title/meta only
"Is my page indexed?" → Phase 2 indexation only
"Audit my competitor's page" → Full audit but note competitive framing

Full audit — run all phases in order when:

User says "full audit", "complete audit", "deep dive", or "why isn't it ranking"
No specific concern mentioned alongside the URL
The user's goal is clearly holistic (pre-launch check, site migration, etc.)

Also ask (only if not obvious from context):

Do they have Google Search Console access? (improves depth of Phase 2)
Is there a specific competitor URL to benchmark against?

If the user just says "audit [URL]" with no other context, default to full audit and proceed.

Phase 1: Page Type Detection

Fetch the URL and classify the page type. This determines which checks to run and what schema to expect.

Primary fetch — try in this order:

WebFetch [URL] — fastest, works for SSR pages
If WebFetch returns mostly CSS or near-empty content (JS-rendered page):
- Try: curl -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" [URL]
- If still empty and Playwright MCP is available, use mcp__playwright__browser_navigate + mcp__playwright__browser_evaluate to render the page with JavaScript and extract the full DOM
- If still empty and Playwright MCP is unavailable, stop and report that the rendered audit could not be completed because Playwright MCP is not installed. Tell the user to run:
```
claude mcp add playwright npx @playwright/mcp@latest -- --sandbox
```
- If the current agent cannot access Playwright tools even though they exist elsewhere, state that the rendered phases are unavailable in this runtime and continue with static checks only.
If the site is offline or returns connection errors, report this immediately as a Critical Issue and skip to a recommendations-only audit

Detect page type from URL structure, title, headings, and content patterns. Read references/page-type-signals.md if ambiguous.

Page Type	Key Signals
Homepage	Root domain, brand name prominent, nav hub
Product Detail Page (PDP)	`/products/`, price, add-to-cart, product images
Product Listing Page (PLP)	`/category/`, `/collection/`, product grid
Blog Post / Article	`/blog/`, `/post/`, date, author byline
Landing Page	Minimal nav, single CTA, campaign-specific copy
Service Page	Service description, contact/quote CTA
Local Business	Address, phone, hours, map embed
SaaS Feature Page	Feature description, pricing CTA, testimonials
FAQ / Support	Q&A format, `/help/`, `/faq/`

State the page type and rendering method at the top of every report.

Before proceeding to Phase 2, create the output directories so screenshots never fail with ENOENT:

Bash: mkdir -p output/screenshots

Phase 2: Crawlability & Indexation

Robots.txt

WebFetch [domain]/robots.txt

Check: important paths blocked? Sitemap reference present? Conflicting rules?

XML Sitemap

WebFetch [domain]/sitemap.xml

Apply Steps 1–3 to every sitemap fetched. Then run the type-specific checks in Steps 4–7 based on which sitemap extensions are present.

Step 1: Discovery and availability

Referenced in robots.txt via Sitemap: directive? If not, Google must discover it independently — flag as Medium Priority.
Common fallback paths to try if not in robots.txt: /sitemap.xml, /sitemap_index.xml, /sitemap.xml.gz
Returns 200? (404 or 500 = Critical)
Is it a sitemap index (<sitemapindex> with <sitemap> children) or a plain sitemap (<urlset> with <url> entries)?
If sitemap index: list all child sitemaps. Fetch the child covering the audited page at minimum. Note whether child sitemaps are split by type (blog, products, images, video, news) — good practice.
Does the audited page URL appear in any sitemap? If not, flag as High Priority.
Do all <loc> URLs use canonical form — correct protocol, www preference, consistent trailing slash?
File size: standard sitemaps must not exceed 50MB uncompressed or 50,000 URLs. Flag if approaching limits.

Step 2: <lastmod> format validation

<lastmod> must follow the W3C Datetime format (ISO 8601 subset). Sample 5–10 values and check the format used:

Format	Example	Valid?
`YYYY-MM-DDThh:mm:ss+TZ`	`2024-03-15T09:30:00+00:00`	✅ Preferred
`YYYY-MM-DDThh:mm:ssZ`	`2024-03-15T09:30:00Z`	✅ Preferred
`YYYY-MM-DD`	`2024-03-15`	✅ Acceptable
`YYYY-MM`	`2024-03`	⚠️ Imprecise
`YYYY`	`2024`	⚠️ Too imprecise
Anything else	`03/15/2024`, `March 2024`	❌ Invalid — Google ignores

Flag as Medium Priority if any non-W3C format is found.

Step 3: <lastmod> staleness check

Pattern A — All dates identical: CMS is stamping on regeneration, not real changes. Flag as Medium Priority — Google learns to distrust this site's <lastmod> signals.

Pattern B — Age-based staleness (compare against today 2026-04-15):

Page type	Stale threshold	Severity
Homepage	> 30 days	🟡 Medium
Blog / Article	> 90 days since last visible update	🟡 Medium
PDP	> 60 days	🔴 High
PLP / Category	> 60 days	🟡 Medium
Static pages	> 365 days	Informational

Future <lastmod>: flag High Priority — signals

deep-seo-audit