Semantic Topic Clustering (v1.9.0)
SERP-overlap-driven keyword clustering for content architecture. Groups keywords by how Google actually ranks them (shared top-10 results), not by text similarity. Designs hub-and-spoke content clusters with internal link matrices and generates interactive cluster map visualizations.
Scripts: Located in the scripts/ directory at the plugin root.
Quick Reference
| Command | What it does |
|---|---|
| /seo cluster plan <seed-keyword> | Full planning workflow: expand, cluster, architect, visualize |
| /seo cluster plan --from strategy | Import from existing /seo plan output |
| /seo cluster execute | Execute plan: create content via claude-blog or output briefs |
| /seo cluster map | Regenerate the interactive cluster visualization |
Planning Workflow
Step 1: Seed Keyword Expansion
Expand the seed keyword into 30-50 variants using WebSearch:
- Related searches — Search the seed, extract "related searches" and "people also search for"
- People Also Ask (PAA) — Extract all PAA questions from SERP results
- Long-tail modifiers — Append common modifiers: "best", "how to", "vs", "for beginners", "tools", "examples", "guide", "template", "mistakes", "checklist"
- Question mining — Generate who/what/when/where/why/how variants
- Intent modifiers — Add commercial modifiers: "pricing", "review", "alternative", "comparison", "free", "top"
Deduplication: Normalize variants (lowercase, strip articles), remove exact duplicates. Target: 30-50 unique keyword variants. If under 30, run a second expansion pass with the top PAA questions as seeds.
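The deduplication rule above can be sketched as follows. This is a minimal illustration, not the plugin's own code: the function names are hypothetical, and it assumes "articles" means the English articles a/an/the.

```python
import re

# Assumption: "strip articles" refers to the English articles a/an/the.
ARTICLES = {"a", "an", "the"}


def normalize(keyword: str) -> str:
    """Lowercase, collapse whitespace, and drop English articles."""
    words = re.findall(r"\S+", keyword.lower())
    return " ".join(w for w in words if w not in ARTICLES)


def dedupe(variants: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized variant, in order."""
    seen: set[str] = set()
    unique: list[str] = []
    for variant in variants:
        key = normalize(variant)
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def needs_second_pass(unique: list[str], minimum: int = 30) -> bool:
    """True when fewer than the 30-variant target survived deduplication."""
    return len(unique) < minimum
```

For example, dedupe(["Best CRM", "the best crm", "CRM pricing"]) keeps two variants, so a second expansion pass would be triggered.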
Step 2: SERP Overlap Clustering
This is the core differentiator. Load references/serp-overlap-methodology.md for
the full algorithm.
Process:
- Group keywords by initial intent guess (reduces pairwise comparisons)
- For each candidate pair within a group, WebSearch both keywords
- Count shared URLs in the top 10 organic results (ignore ads, featured snippets, PAA)
- Apply thresholds:
| Shared Results | Relationship | Action |
|---|---|---|
| 7-10 | Same post | Merge into single target page |
| 4-6 | Same cluster | Group under same spoke cluster |
| 2-3 | Interlink | Place in adjacent clusters, add cross-links |
| 0-1 | Separate | Assign to different clusters or exclude |
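As a rough sketch of the threshold table, the functions below count shared URLs between two top-10 result lists and map the count to a relationship. The function names are illustrative, and URLs are compared as exact strings, which is an assumption (the referenced methodology file may normalize them differently).

```python
def shared_results(top10_a: list[str], top10_b: list[str]) -> int:
    """Count URLs that appear in both top-10 organic result lists.

    Assumption: URLs are compared as exact strings, with ads, featured
    snippets, and PAA entries already removed by the caller.
    """
    return len(set(top10_a[:10]) & set(top10_b[:10]))


def classify_overlap(shared: int) -> str:
    """Map a shared-result count (0-10) to the relationship in the table."""
    if not 0 <= shared <= 10:
        raise ValueError("shared must be between 0 and 10")
    if shared >= 7:
        return "same post"
    if shared >= 4:
        return "same cluster"
    if shared >= 2:
        return "interlink"
    return "separate"
```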
Optimization: With 40 keywords, a full pairwise comparison needs 40 x 39 / 2 = 780 comparisons. Instead:
- Pre-group by intent (4 groups of ~10 = 4 x 45 = 180 comparisons)
- Only cross-check group boundary keywords
- Skip pairs where both are long-tail variants of the same head term (assume same cluster)
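The comparison counts quoted above follow from the number of unordered pairs, n(n-1)/2. A short check, assuming four equal groups of 10 and ignoring the extra boundary cross-checks:

```python
from math import comb


def pairwise_comparisons(n: int) -> int:
    """Number of unordered keyword pairs among n keywords."""
    return comb(n, 2)


full = pairwise_comparisons(40)  # 780
# Assumption: four intent groups of exactly 10 keywords each.
grouped = sum(pairwise_comparisons(size) for size in (10, 10, 10, 10))  # 180
```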
DataForSEO integration: If the DataForSEO MCP is available, use serp_organic_live_advanced
instead of WebSearch for SERP data. Run python scripts/dataforseo_costs.py check serp_organic_live_advanced --count N
before each batch. If the result is "status": "needs_approval", show the cost estimate and ask the user before proceeding.
If the result is "status": "blocked", fall back to WebSearch.
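The status handling could look roughly like this. It is a sketch under assumptions: the cost checker is assumed to print a single JSON object containing a "status" field, the function name and return labels are hypothetical, and any status other than the two named above is assumed to mean the batch may proceed.

```python
import json


def decide_serp_source(check_output: str) -> str:
    """Map the cost checker's JSON output to the next action.

    Assumption: check_output is one JSON object with a "status" key.
    """
    status = json.loads(check_output).get("status")
    if status == "needs_approval":
        return "ask_user"    # show the cost estimate, wait for the user
    if status == "blocked":
        return "websearch"   # fall back to WebSearch
    # Assumption: any other status means the DataForSEO batch may run.
    return "dataforseo"
```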
Step 3: Intent Classification
Classify each keyword into one of four intent categories:
| Intent | Signals | Include in Clusters? |
|---|---|---|
| Informational | how, what, why, guide, tutorial, learn | Yes |
| Commercial | best, top, review, comparison, vs, alternative | Yes |
| Transactional | buy, price, discount, coupon, order, sign up | Yes |
| Navigational | brand names, specific product names, login | No (exclude) |
Remove navigational keywords from clustering. Flag borderline cases for manual review. Keywords can have mixed intent (e.g., "best CRM software" is both commercial and informational); classify these by dominant intent.
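A simple signal-matching pass over the table might look like the sketch below. The scoring scheme is an assumption: it counts matching signal words per intent, treats ties as borderline, and takes brand names as a caller-supplied list, since the source does not say how brands are detected.

```python
import re

# Signal words taken from the intent table above.
SIGNALS = {
    "informational": ["how", "what", "why", "guide", "tutorial", "learn"],
    "commercial": ["best", "top", "review", "comparison", "vs", "alternative"],
    "transactional": ["buy", "price", "discount", "coupon", "order", "sign up"],
}


def has_phrase(text: str, phrase: str) -> bool:
    """Whole-word (or whole-phrase) match, so "top" does not hit "laptop"."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def classify_intent(keyword: str, brands: tuple[str, ...] = ()) -> tuple[str, bool]:
    """Return (intent, needs_review). brands is a hypothetical input list."""
    text = keyword.lower()
    if has_phrase(text, "login") or any(has_phrase(text, b.lower()) for b in brands):
        return "navigational", False
    scores = {
        intent: sum(has_phrase(text, s) for s in signals)
        for intent, signals in SIGNALS.items()
    }
    top = max(scores.values())
    if top == 0:
        return "unclassified", True
    leaders = [intent for intent, score in scores.items() if score == top]
    # Assumption: a tie is a borderline case and is flagged for review.
    return leaders[0], len(leaders) > 1
```

With this scheme, "best CRM software" scores only on commercial signals, while "how to buy a domain" ties between informational and transactional and is flagged for manual review.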
Step 4: Hub-and-Spoke Architecture
Load references/hub-spoke-architecture.md for full specifications.
Design the cluster structure:
- Select the pillar keyword — Highest volume, broadest intent, most SERP overlap with other keywords
- Group spokes into clusters — Each cluster is a subtopic area (2-5 clusters per pillar)
- Assign posts to clusters — Each cluster gets 2-4 spoke posts
- Select templates per post — Based on intent classification:
| Intent Pattern | Template Options |
|---|---|
| Informational (broad) | ultimate-guide |
| Informational (how) | how-to |
| Informational (list) | listicle |
| Informational (concept) | explainer |
| Commercial (compare) | comparison |
| Commercial (evaluate) | review |
| Commercial (rank) | best-of |
| Transactional | landing-page |
- Set word count targets:
  - Pillar page: 2500-4000 words
  - Spoke posts: 1200-1800 words
- Cannibalization check: No two posts share the same primary keyword. If SERP overlap is 7+, merge those keywords into a single post targeting both.
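The cannibalization check can be sketched as a grouping of posts by primary keyword. This assumes each post is a dict with "title" and "keyword" fields, and that keywords are compared case-insensitively after trimming; the function name is illustrative.

```python
from collections import defaultdict


def find_cannibalization(posts: list[dict]) -> dict[str, list[str]]:
    """Return primary keywords claimed by more than one post.

    Assumption: each post dict has "title" and "keyword" keys.
    """
    by_keyword: dict[str, list[str]] = defaultdict(list)
    for post in posts:
        by_keyword[post["keyword"].strip().lower()].append(post["title"])
    return {kw: titles for kw, titles in by_keyword.items() if len(titles) > 1}
```

An empty result means no two posts share a primary keyword; any entry lists the posts that should be merged or retargeted.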
Step 5: Internal Link Matrix
Design the bidirectional linking structure:
| Link Type | Direction | Requirement |
|---|---|---|
| Spoke to pillar | spoke -> pillar | Mandatory (every spoke) |
| Pillar to spoke | pillar -> spoke | Mandatory (every spoke) |
| Spoke to spoke (within cluster) | spoke <-> spoke | 2-3 links per post |
| Cross-cluster | spoke -> spoke (other cluster) | 0-1 links per post |
Rules:
- Every post must have a minimum of 3 incoming internal links
- No orphan pages (every post reachable from pillar in 2 clicks)
- Anchor text must use target keyword or close variant (no "click here")
- Link placement: within body content, not just navigation/sidebar
Generate the link matrix as a JSON adjacency list:
{
  "links": [
    { "from": "pillar", "to": "cluster-0-post-0", "type": "mandatory", "anchor": "keyword" },
    { "from": "cluster-0-post-0", "to": "pillar", "type": "mandatory", "anchor": "keyword" }
  ]
}
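Two of the rules above (minimum incoming links, and reachability from the pillar within 2 clicks) can be checked directly against the adjacency list. The sketch below is illustrative: the function names are hypothetical, and it assumes the pillar node is identified by the string "pillar" as in the example.

```python
from collections import Counter, defaultdict, deque


def under_linked(links: list[dict], posts: list[str], minimum: int = 3) -> list[str]:
    """Posts with fewer than `minimum` incoming internal links."""
    incoming = Counter(link["to"] for link in links)
    return sorted(p for p in posts if incoming.get(p, 0) < minimum)


def unreachable_from_pillar(
    links: list[dict], posts: list[str], max_clicks: int = 2
) -> list[str]:
    """Posts not reachable from "pillar" within max_clicks (breadth-first)."""
    graph: dict[str, set[str]] = defaultdict(set)
    for link in links:
        graph[link["from"]].add(link["to"])
    depth = {"pillar": 0}
    queue = deque(["pillar"])
    while queue:
        node = queue.popleft()
        if depth[node] == max_clicks:
            continue  # do not expand beyond the click budget
        for nxt in graph[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return sorted(p for p in posts if p not in depth)
```

Both functions return empty lists when the matrix satisfies the corresponding rule.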
Step 6: Interactive Cluster Map
Generate cluster-map.html using the template at templates/cluster-map.html.
- Read the template file
- Build the CLUSTER_DATA JSON object from the cluster plan:
  {
    pillar: { title, keyword, volume, template, wordCount, url },
    clusters: [{ name, color, posts: [{ title, keyword, volume, template, wordCount, url, status }] }],
    links: [{ from, to, type }],
    meta: { totalPosts, totalClusters, totalLinks, estimatedWords }
  }
- Replace the CLUSTER_DATA placeholder in the template with the actual JSON
- Write the completed HTML file to the output directory
- Inform user: "Open cluster-map.html in a browser to explore the interactive cluster map."
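The placeholder substitution might be sketched as below. This assumes the template contains the literal token CLUSTER_DATA exactly once (the real template may differ), and the function name is hypothetical. Escaping "</" is a precaution for JSON embedded inside a script tag.

```python
import json

PLACEHOLDER = "CLUSTER_DATA"


def render_cluster_map(template_html: str, cluster_data: dict) -> str:
    """Substitute the cluster plan JSON for the template placeholder.

    Assumption: the placeholder token appears exactly once in the template.
    """
    if template_html.count(PLACEHOLDER) != 1:
        raise ValueError("expected exactly one CLUSTER_DATA placeholder")
    # "<\/" is a valid JSON escape and keeps a "</script>" inside the data
    # from closing the surrounding script tag.
    payload = json.dumps(cluster_data).replace("</", "<\\/")
    return template_html.replace(PLACEHOLDER, payload)
```

The caller would read templates/cluster-map.html, pass its contents through this function, and write the result to cluster-map.html in the output directory.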
Strategy Import
When invoked with --from strategy:
- Look for the most recent /seo plan output in the current directory (search for files matching *SEO*Plan*, *strategy*, *content-strategy*)
- Parse markdown tables for: keywords, page types, content pillars, URL structures
- Validate extracted data: check for duplicates, missing keywords, incomplete entries
- Enrich with SERP data: run SERP overlap analysis on extracted keywords
- Build cluster plan using the imported keywords as the starting set (skip Step 1)
If no strategy file is found, prompt the user: "No existing SEO plan found in the
current directory. Run /seo plan first, or provide a seed keyword for fresh clustering."
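Locating the strategy file could be sketched like this. Two assumptions are made: "most recent" is taken to mean the latest modification time, and the patterns are matched case-sensitively against file names. The function names are illustrative.

```python
from fnmatch import fnmatchcase
from pathlib import Path
from typing import Optional

# Patterns from the import step above.
PATTERNS = ("*SEO*Plan*", "*strategy*", "*content-strategy*")


def matches_strategy(name: str) -> bool:
    """True if a file name matches any strategy pattern (case-sensitive)."""
    return any(fnmatchcase(name, pattern) for pattern in PATTERNS)


def find_latest_strategy(directory: str = ".") -> Optional[Path]:
    """Return the most recently modified matching file, or None."""
    candidates = [
        p for p in Path(directory).iterdir()
        if p.is_file() and matches_strategy(p.name)
    ]
    # Assumption: "most recent" means the latest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

A None result corresponds to the "No existing SEO plan found" prompt.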
Execution Workflow
When /seo cluster execute is invoked:
Check for claude-blog
Test: Does ~/.claude/skills/blog/SKILL.md exist?
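The installation test amounts to a file existence check. A minimal sketch, with hypothetical function names and an optional home argument added only to make the check testable:

```python
from pathlib import Path
from typing import Optional


def claude_blog_installed(home: Optional[Path] = None) -> bool:
    """Check for ~/.claude/skills/blog/SKILL.md under the given home."""
    base = home if home is not None else Path.home()
    return (base / ".claude" / "skills" / "blog" / "SKILL.md").is_file()


def execution_mode(home: Optional[Path] = None) -> str:
    """Write posts via claude-blog when installed, otherwise output briefs."""
    return "claude-blog" if claude_blog_installed(home) else "briefs"
```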
If claude-blog IS installed:
- Load references/execution-workflow.md for the full algorithm
- Read cluster-plan.json from the current directory
- Check for resume state: scan output directory for already-written posts
- Execute in priority order: pillar first, then spokes by volume (highest first)
- For each post, invoke the blog-write skill with cluster context:
- Cluster role (pillar or spoke)
- Position in cluster (cluster index, post index)
- Target keyword and secondary keywords
- Template type and word count target
- Internal links to includ