GEO Technical SEO Audit

Purpose

Technical SEO forms the foundation of both traditional search visibility and AI search citation. A technically broken site cannot be crawled, indexed, or cited by any platform. This skill audits 8 categories of technical health with specific attention to GEO requirements. The two most critical are server-side rendering (most AI crawlers do not execute JavaScript, so content that only appears after client-side rendering is invisible to them) and AI crawler access (many sites inadvertently block AI crawlers in robots.txt).

How to Use This Skill

Collect the target URLs (homepage plus 2-3 key inner pages)
Fetch each page using curl/WebFetch to get raw HTML and HTTP headers
Run through each of the 8 audit categories below
Score each category using the rubric
Generate GEO-TECHNICAL-AUDIT.md with results

Category 1: Crawlability (15 points)

1.1 robots.txt Validity

Fetch https://[domain]/robots.txt
Check for syntactic validity: proper User-agent, Allow, Disallow directives
Check for common errors: missing User-agent, wildcards blocking important paths, Disallow: / blocking entire site
Verify XML sitemap is referenced: Sitemap: https://[domain]/sitemap.xml
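
The checks above can be sketched in Python with the standard library. This is a minimal illustration, not a full validator: the function name `audit_robots_txt` and the set of recognized fields are assumptions made for the example, and Python's `urllib.robotparser` applies the first matching rule rather than the longest-match logic Google documents, so results on complex files may differ from what Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Fields this sketch treats as valid; anything else is reported as unknown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def audit_robots_txt(robots_text: str) -> dict:
    """Apply the section 1.1 checks to the raw text of a robots.txt file."""
    unknown_lines, sitemaps = [], []
    seen_user_agent = False
    rule_before_user_agent = False

    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, sep, value = line.partition(":")
        field = field.strip().lower()
        if not sep or field not in KNOWN_FIELDS:
            unknown_lines.append(raw.strip())
        elif field == "user-agent":
            seen_user_agent = True
        elif field == "sitemap":
            sitemaps.append(value.strip())
        elif field in ("allow", "disallow") and not seen_user_agent:
            # A rule with no preceding User-agent line belongs to no group.
            rule_before_user_agent = True

    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())

    return {
        "has_user_agent": seen_user_agent,
        "rule_before_user_agent": rule_before_user_agent,
        "unknown_lines": unknown_lines,
        "sitemaps": sitemaps,
        # True when the wildcard group cannot fetch the homepage.
        "blocks_entire_site": not parser.can_fetch("*", "/"),
    }
```

Fetch the file with curl or WebFetch first and pass its body to the function; an empty `sitemaps` list or a non-empty `unknown_lines` list is worth reporting in the audit.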

1.2 AI Crawler Access (CRITICAL for GEO)

Check robots.txt for directives targeting these AI crawlers:

Crawler	User-Agent	Platform
GPTBot	GPTBot	ChatGPT / OpenAI
Google-Extended	Google-Extended	Gemini / Google AI training
Googlebot	Googlebot	Google Search + AI Overviews
Bingbot	bingbot	Bing Copilot + ChatGPT (via Bing)
PerplexityBot	PerplexityBot	Perplexity AI
ClaudeBot	ClaudeBot	Anthropic Claude
Amazonbot	Amazonbot	Alexa / Amazon AI
CCBot	CCBot	Common Crawl (used by many AI models)
FacebookBot	FacebookBot	Meta AI (facebookexternalhit is Meta's link-preview fetcher, not its AI crawler)
Bytespider	Bytespider	TikTok / ByteDance AI
Applebot-Extended	Applebot-Extended	Apple Intelligence

Scoring for AI crawler access:

All major AI crawlers allowed: 5 points
Some other AI crawlers blocked, but Googlebot, Bingbot, GPTBot, and PerplexityBot allowed: 3 points
GPTBot or PerplexityBot blocked: 1 point (significant GEO impact)
Googlebot blocked: 0 points (fatal)

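Important nuance: Blocking Google-Extended does NOT block Googlebot. Google-Extended controls whether content may be used for Gemini model training and grounding; it does not affect search indexing or ranking. Google treats AI Overviews as a Search feature governed by Googlebot, so blocking Google-Extended should not by itself remove a site from AI Overviews, but it may reduce presence in Gemini answers. Recommend allowing Google-Extended unless there is a specific data licensing concern.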
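
As a rough sketch, the scoring above could be automated as follows. Several things here are assumptions made for illustration: a crawler counts as "blocked" when it may not fetch `/`, the function names are hypothetical, and the rubric does not say how to score a site that blocks only Bingbot, so this sketch gives that case 1 point. Matching uses Python's `urllib.robotparser`, which compares user-agent tokens by substring.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the crawler table in section 1.2.
AI_CRAWLERS = [
    "GPTBot", "Google-Extended", "Googlebot", "bingbot", "PerplexityBot",
    "ClaudeBot", "Amazonbot", "CCBot", "FacebookBot", "Bytespider",
    "Applebot-Extended",
]


def blocked_crawlers(robots_text: str, path: str = "/") -> list[str]:
    """Return the crawlers from AI_CRAWLERS that may not fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [ua for ua in AI_CRAWLERS if not parser.can_fetch(ua, path)]


def score_ai_crawler_access(robots_text: str) -> int:
    """Map blocked crawlers to the 0/1/3/5 rubric, most severe case first."""
    blocked = set(blocked_crawlers(robots_text))
    if "Googlebot" in blocked:
        return 0  # fatal
    if blocked & {"GPTBot", "PerplexityBot"}:
        return 1  # significant GEO impact
    if "bingbot" in blocked:
        return 1  # assumption: the rubric does not cover this case
    if blocked:
        return 3  # only other crawlers are blocked
    return 5
```

Checking the most severe condition first avoids ambiguity when a file blocks several crawlers at once. Report the output of `blocked_crawlers` alongside the score so the reader can see which directives caused it.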

1.3 XML Sitemaps

Fetch sitemap (check robots.txt for location, or try /sitemap.xml, /sitemap_index.xml)
Validate XML syntax
Check for <lastmod> dates (should be present and accurate)
Count URLs — compare to expected number of indexable pages
Check for a sitemap index on large sites (a single sitemap file may hold at most 50,000 URLs or 50 MB uncompressed)
Verify all sitemap URLs return 200 status codes (sample check)
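
A minimal sketch of the parsing side of these checks, using only the standard library. The function name `summarize_sitemap` and the shape of the returned dictionary are assumptions for this example; it does not fetch URLs or verify status codes, and it only confirms that `<lastmod>` is present, not that the dates are accurate.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS_PER_FILE = 50_000  # limit from the sitemaps.org protocol


def summarize_sitemap(xml_text: str) -> dict:
    """Parse a sitemap or sitemap index and report counts and missing lastmod.

    ET.fromstring raises xml.etree.ElementTree.ParseError on invalid XML,
    which doubles as the syntax check.
    """
    root = ET.fromstring(xml_text)
    kind = root.tag.rsplit("}", 1)[-1]  # "urlset" or "sitemapindex"
    entry_path = "sm:sitemap" if kind == "sitemapindex" else "sm:url"

    locs, missing_lastmod = [], []
    for entry in root.findall(entry_path, NS):
        loc = (entry.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        lastmod = (entry.findtext("sm:lastmod", default="", namespaces=NS) or "").strip()
        locs.append(loc)
        if not lastmod:
            missing_lastmod.append(loc)

    return {
        "kind": kind,
        "entry_count": len(locs),
        "locs": locs,
        "missing_lastmod": missing_lastmod,
        "over_limit": len(locs) > MAX_URLS_PER_FILE,
    }
```

For a sitemap index, run the function again on each child sitemap listed in `locs` and sum the counts before comparing against the expected number of indexable pages.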

1.4 Crawl Depth

Homepage = depth 0. Check that all important pages are reachable within 3 clicks (depth 3)
Pages at depth 4+ tend to be crawled less often and are therefore less likely to be cited by AI
Check internal linking: are key content pages linked from the homepage or main navigation?
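
Click depth is the shortest-path distance from the homepage in the internal link graph, so a breadth-first search computes it directly. The sketch below assumes the link graph has already been collected into a dictionary mapping each page to the pages it links to; the function names and the `links` structure are illustrative, not part of the skill.

```python
from collections import deque


def crawl_depths(links: dict[str, list[str]], start: str) -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from `start` to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_too_deep(links: dict[str, list[str]], start: str,
                   important: list[str], max_depth: int = 3) -> list[str]:
    """Important pages deeper than `max_depth`, including unreachable ones."""
    depths = crawl_depths(links, start)
    return sorted(p for p in important if depths.get(p, float("inf")) > max_depth)
```

Pages missing from the result of `crawl_depths` are orphans: no internal link path reaches them from the homepage, which is usually a more serious finding than excessive depth.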

1.5 Noindex Management

Check for <meta name="robots" content="noindex"> on pages that SHOULD be indexed
Check for X-Robots-Tag: noindex HTTP headers
Common mistakes: noindex on paginated pages, category pages, or key landing pages
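
Both noindex sources can be checked from a single fetch of the page. The following is a sketch using the standard library `html.parser`; the function name `find_noindex` is hypothetical, `headers` is assumed to be a plain dictionary of response headers, and the value `none` is treated as noindex because it is shorthand for `noindex, nofollow`.

```python
import re
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and name="googlebot" tags."""

    def __init__(self):
        super().__init__()
        self.directives: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def find_noindex(html: str, headers: dict[str, str]) -> dict[str, bool]:
    """Report whether the HTML or the X-Robots-Tag header requests noindex."""
    parser = _RobotsMetaParser()
    parser.feed(html)

    header_tokens: set[str] = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            # Split on commas, colons, and whitespace ("googlebot: noindex").
            header_tokens |= {t for t in re.split(r"[,:\s]+", value.lower()) if t}

    blocking = {"noindex", "none"}
    return {
        "meta_noindex": bool(blocking & set(parser.directives)),
        "header_noindex": bool(blocking & header_tokens),
    }
```

Run this against every page that should be indexed; any `True` value on such a page is an erroneous noindex.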

Category Scoring:

Check	Points
robots.txt valid and complete	3
AI crawlers allowed	5
XML sitemap present and valid	3
Crawl depth within 3 clicks	2
No erroneous noindex directives	2

Category 2: Indexability (12 points)

2.1 Canonical Tags

Every indexable page must have a <link rel="canonical" href="..."> tag
On the authoritative version of a page, the canonical must be self-referencing; duplicate versions should point to that authoritative URL
Check for conflicting canonicals (canonical in HTML vs. HTTP header)
Check for canonical chains (A canonicals to B, B canonicals to C — should be A to C)
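
Canonical chains and loops can be detected by following canonical targets until one points to itself. In this sketch, `canonicals` is an assumed dictionary mapping each audited URL to the canonical URL found on it, and the labels returned by `classify_canonical` are illustrative names rather than standard terminology.

```python
def canonical_chain(canonicals: dict[str, str], url: str, max_hops: int = 10) -> list[str]:
    """Follow canonical targets from `url` until self-reference, a gap, or a loop."""
    chain = [url]
    while len(chain) <= max_hops:
        target = canonicals.get(chain[-1])
        if target is None or target == chain[-1]:
            break  # unknown page or self-referencing canonical: chain ends
        chain.append(target)
        if chain.count(target) > 1:
            break  # revisited a URL: canonical loop
    return chain


def classify_canonical(canonicals: dict[str, str], url: str) -> str:
    """Label a URL as missing, self, points-elsewhere, chain, or loop."""
    if url not in canonicals:
        return "missing"
    chain = canonical_chain(canonicals, url)
    if len(chain) != len(set(chain)):
        return "loop"
    if len(chain) == 1:
        return "self"
    if len(chain) == 2:
        return "points-elsewhere"
    return "chain"  # A -> B -> C: A should point to C directly
```

A result of `points-elsewhere` is not an error by itself; it is expected on duplicate versions of a page and only a problem when it appears on the version that should be authoritative.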

2.2 Duplicate Content

Check for www vs. non-www (both should resolve, one should redirect)
Check for HTTP vs. HTTPS (HTTP should redirect to HTTPS)
Check for trailing slash consistency (pick one pattern and redirect the other)
Check for parameter-based duplicates (?sort=price creating duplicate pages)

2.3 Pagination

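If paginated content exists, check for rel="next" / rel="prev" (note: Google announced in 2019 that it no longer uses these; Bing has said it treats them as hints)
Preferred: give each paginated page a self-referencing canonical, or point to a view-all page if one exists. Do not canonicalize page 2 and beyond to the first page: their content differs, and items listed only on later pages may drop out of the index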
Ensure paginated pages are not noindexed if they contain unique content

2.4 Hreflang (international sites)

Check for <link rel="alternate" hreflang="xx"> tags
Validate: reciprocal hreflang (if page A points to page B, B must point back to A)
Validate: x-default fallback exists
Check for language/region code validity (ISO 639-1 / ISO 3166-1)
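
The reciprocity and x-default checks can be expressed over a simple mapping. The sketch below assumes `pages` maps each audited URL to its hreflang annotations (`{hreflang value: target URL}`); both the structure and the function name are hypothetical. The code-format check is only a pattern match for `xx` or `xx-YY` and does not validate against the ISO lists, and script subtags such as `zh-Hant` are reported as invalid by this simplified pattern.

```python
import re

# Simplified pattern: two-letter language, optional two-letter region.
CODE_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]{2})?$", re.IGNORECASE)


def hreflang_issues(pages: dict[str, dict[str, str]]) -> list[str]:
    """List missing x-default, malformed codes, and missing return links."""
    issues = []
    for url, alternates in pages.items():
        if "x-default" not in alternates:
            issues.append(f"{url}: missing x-default")
        for code, target in alternates.items():
            if code != "x-default" and not CODE_PATTERN.match(code):
                issues.append(f"{url}: malformed hreflang code '{code}'")
            if target == url:
                continue  # self-reference needs no return link
            back = pages.get(target)
            if back is None:
                issues.append(f"{url}: alternate {target} was not audited")
                continue
            # A return link must be a language annotation, not just x-default.
            returns = {t for c, t in back.items() if c != "x-default"}
            if url not in returns:
                issues.append(f"{target}: no return link to {url}")
    return issues
```

An empty list means every alternate points back to its referrer and each page declares an x-default fallback.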

2.5 Index Bloat

Estimate number of indexed pages (check sitemap count, use site:domain.com estimate)
Compare indexed pages to actual valuable content pages
Flag if indexed pages significantly exceed content pages (index bloat from thin/duplicate/parameter pages)

Category Scoring:

Check	Points
Canonical tags correct on all pages	3
No duplicate content issues	3
Pagination handled correctly	2
Hreflang correct (if applicable)	2
No index bloat	2

Category 3: Security (10 points)

3.1 HTTPS Enforcement

Site must load over HTTPS
HTTP must redirect to HTTPS (301 redirect)
No mixed content warnings (HTTP resources on HTTPS pages)
SSL/TLS certificate must be valid and not expired

3.2 Security Headers

Check HTTP response headers for:

Header	Required Value	Purpose
`Strict-Transport-Security`	`max-age=31536000; includeSubDomains`	Forces HTTPS
`Content-Security-Policy`	Appropriate policy	Prevents XSS
`X-Content-Type-Options`	`nosniff`	Prevents MIME sniffing
`X-Frame-Options`	`DENY` or `SAMEORIGIN`	Prevents clickjacking
`Referrer-Policy`	`strict-origin-when-cross-origin` or stricter	Controls referrer data
`Permissions-Policy`	Appropriate restrictions	Controls browser features

Category Scoring:

Check	Points
HTTPS enforced with valid cert	4
HSTS header present	2
X-Content-Type-Options	1
X-Frame-Options	1
Referrer-Policy	1
Content-Security-Policy	1
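
The scoring table above translates into a short function. This is a sketch under stated assumptions: `headers` is a dictionary of response headers from the HTTPS version of the page, `https_ok` is the auditor's own verdict on section 3.1 (valid certificate, HTTP redirects to HTTPS), and the function name is hypothetical. Points are awarded for header presence as the rubric states; weak values are reported as warnings without changing the score.

```python
import re

# Points per header, from the Category 3 scoring table.
HEADER_POINTS = {
    "strict-transport-security": 2,
    "x-content-type-options": 1,
    "x-frame-options": 1,
    "referrer-policy": 1,
    "content-security-policy": 1,
}
ONE_YEAR = 31_536_000  # seconds, the max-age value listed in section 3.2


def score_security(headers: dict[str, str], https_ok: bool) -> tuple[int, list[str]]:
    """Return (score out of 10, warnings) for the security category."""
    h = {name.lower(): value.strip() for name, value in headers.items()}
    score = 4 if https_ok else 0
    warnings = []

    for name, points in HEADER_POINTS.items():
        if name in h:
            score += points
        else:
            warnings.append(f"missing {name}")

    hsts = h.get("strict-transport-security", "")
    match = re.search(r"max-age=(\d+)", hsts, re.IGNORECASE)
    if hsts and (match is None or int(match.group(1)) < ONE_YEAR):
        warnings.append("HSTS max-age below one year")
    if h.get("x-content-type-options", "nosniff").lower() != "nosniff":
        warnings.append("X-Content-Type-Options should be nosniff")
    if "permissions-policy" not in h:
        warnings.append("missing permissions-policy (not scored)")

    return score, warnings
```

Header names are compared case-insensitively because HTTP header names are case-insensitive and servers vary in capitalization.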

Category 4: URL Structure (8 points)

4.1 Clean URLs

URLs should be human-readable: /blog/seo-guide not /blog?id=12345
No session IDs in URLs
Lowercase only (no mixed case)
Hyphens for word separation (not underscores)
No special characters or encoded spaces
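
These rules are mechanical enough to check with a few string tests. The sketch below is illustrative: the function name, the issue labels, and the list of session parameter names are assumptions, and the uppercase check will also flag uppercase hex digits in percent-encoded sequences.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical list of common session parameter names; extend per site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def url_issues(url: str) -> list[str]:
    """Return the clean-URL rules from section 4.1 that `url` violates."""
    parts = urlsplit(url)
    path = parts.path
    issues = []

    if path != path.lower():
        issues.append("uppercase characters in path")
    if "_" in path:
        issues.append("underscores instead of hyphens")
    if "%20" in path or " " in path:
        issues.append("encoded spaces in path")
    # Anything outside unreserved characters, slashes, and percent signs.
    if re.search(r"[^A-Za-z0-9/\-._~%]", path.replace(" ", "")):
        issues.append("special characters in path")

    params = {name.lower() for name in parse_qs(parts.query)}
    if params & SESSION_PARAMS or ";jsessionid=" in url.lower():
        issues.append("session ID in URL")

    return issues
```

Run it over the URLs listed in the sitemap to see how widespread each pattern is before deciding whether it warrants a deduction.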

4.2 Logical Hierarchy

URL path should reflect site architecture: /category/subcategory/page
Flat where appropriate — avoid unnecessarily deep nesting
Consistent pattern across the site

4.3 Redirect Chains

Check for redirect chains (A redirects to B redirects to C)
Maximum 1 hop recommended (A redirects to C directly)
Check for redirect loops
All redirects should be 301 (permanent), not 302 (temporary), unless intentionally temporary
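
Once each URL's redirect response has been recorded, chains and loops can be traced offline. The sketch assumes a dictionary `redirects` mapping a URL to its `(status code, Location target)` pair; that structure and the function name are hypothetical. It treats 308 as permanent alongside 301, which goes slightly beyond the wording above and is labeled as an assumption in the code.

```python
PERMANENT = {301, 308}  # assumption: 308 is accepted as permanent too


def trace_redirects(redirects: dict[str, tuple[int, str]], url: str,
                    max_hops: int = 10) -> dict:
    """Follow recorded redirects from `url`, reporting hops, loops, and 302s."""
    hops = []          # (source, status, target) for each redirect followed
    seen = {url}
    current = url
    loop = False

    while current in redirects and len(hops) < max_hops:
        status, target = redirects[current]
        hops.append((current, status, target))
        if target in seen:
            loop = True
            break
        seen.add(target)
        current = target

    return {
        "final_url": None if loop else current,
        "hop_count": len(hops),
        "is_chain": len(hops) > 1,  # more than the recommended single hop
        "loop": loop,
        "temporary_hops": [src for src, status, _ in hops if status not in PERMANENT],
    }
```

To build `redirects`, request each URL without following redirects (for example `curl -sI`) and record the status code and `Location` header.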

4.4 Parameter Handling

URL parameters should not create duplicate indexable pages
Use canonical tags or robots.txt Disallow for parameter variations
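Google retired the URL Parameters tool in Search Console in 2022, so handle parameters on the site itself (canonical tags, consistent internal links). Check whether Bing Webmaster Tools still offers URL parameter settings for the property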
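
One way to find parameter-based duplicates is to strip parameters that do not change page content and group URLs by the result. The list of ignorable parameters below is a hypothetical starting point and must be adjusted per site, since a parameter such as `sort` can be meaningful on some sites; the function names are also illustrative.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical examples of parameters that do not change page content.
IGNORED_PARAMS = {"sort", "order", "ref", "sessionid"}
IGNORED_PREFIXES = ("utm_",)


def strip_params(url: str) -> str:
    """Drop ignorable parameters and sort the rest for a stable comparison key."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in IGNORED_PARAMS
        and not name.lower().startswith(IGNORED_PREFIXES)
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def parameter_duplicates(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs that collapse to the same key; return groups with 2+ members."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_params(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Each group returned by `parameter_duplicates` should resolve to a single indexable URL, with the other members carrying a canonical tag that points to it.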

Category Scoring:

Check	Points
Clean, readable URLs	2
Logical hierarchy	2
No redirect chains (max 1 hop)	2
Parameter handling configured	2

Category 5: Mobile Optimization (10 points)

Critical Context

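As of July 2024, Google crawls and indexes sites with Googlebot Smartphone only; desktop Googlebot is reserved for a few special cases and no longer determines what gets indexed. Content that is not accessible on mobile cannot be indexed. If your site does not work on mobile, it does not work for Google.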

5.1 Responsive Design

Check f
