Blog Post Counter Skill

Counts and compares blog post output across a target company and its competitors, ranking them so the user can see where their target stands.

Workflow

Step 1 — Resolve each company to a URL

For each company name provided:

If a URL was given directly, use it.
If only a name was given, guess the likely domain and run curl -sL "https://<likely-domain>/robots.txt" to confirm the site exists and locate its sitemap. If that request returns a 404, search the web for "<company name>" official website to find the correct domain.
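The check in Step 1 can be sketched as two small shell helpers. This is a minimal sketch, not part of the skill itself: the function names base_url and robots_status are hypothetical, and defaulting to https when no scheme is given is an assumption.

```shell
# base_url: reduce a user-supplied URL or bare domain to "scheme://host".
# Hypothetical helper; assumes https when the input has no scheme.
base_url() {
  input=$1
  case $input in
    http://*|https://*) ;;
    *) input="https://$input" ;;
  esac
  # Drop any path, query string, or fragment after the host.
  printf '%s\n' "$input" | sed -E 's|^(https?://[^/?#]+).*|\1|'
}

# robots_status: print the final HTTP status code for <base>/robots.txt.
# 200 suggests the domain guess is right; 404 (or 000 for a failed
# connection) suggests falling back to a web search.
robots_status() {
  curl -sL -A "Mozilla/5.0" -o /dev/null -w '%{http_code}' \
    "$(base_url "$1")/robots.txt"
}
```

For example, robots_status example.com/blog would request https://example.com/robots.txt and print only the status code.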

Step 2 — Find the sitemap

From the robots.txt, extract the Sitemap: line(s). Common patterns:

Single sitemap: Sitemap: https://example.com/sitemap.xml
Multiple sitemaps: several Sitemap: lines, or a single sitemap index (e.g. sitemap-index.xml) that references sub-sitemaps

If no sitemap is in robots.txt, try these fallbacks in order:

/sitemap.xml
/sitemap_index.xml
/sitemap-index.xml
/blog/sitemap.xml
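As a rough sketch of Step 2, the helpers below pull sitemap URLs out of a robots.txt and generate the fallback candidates listed above. The function names extract_sitemaps and fallback_sitemaps are hypothetical, and stripping carriage returns is an assumption about files served with Windows line endings.

```shell
# extract_sitemaps: read robots.txt on stdin, print one sitemap URL per line.
# The "Sitemap:" field name is matched case-insensitively, and carriage
# returns are stripped so they do not end up attached to the URL.
extract_sitemaps() {
  tr -d '\r' \
    | grep -iE '^[[:space:]]*sitemap:' \
    | sed -E 's/^[[:space:]]*[Ss][Ii][Tt][Ee][Mm][Aa][Pp]:[[:space:]]*//; s/[[:space:]]+$//'
}

# fallback_sitemaps: print the fallback candidates for a base URL, in the
# same order as the list above.
fallback_sitemaps() {
  for path in /sitemap.xml /sitemap_index.xml /sitemap-index.xml /blog/sitemap.xml; do
    printf '%s%s\n' "$1" "$path"
  done
}
```

A typical use would be curl -sL "https://example.com/robots.txt" | extract_sitemaps, trying fallback_sitemaps "https://example.com" only when that prints nothing.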

Step 3 — Count blog posts

Fetch the sitemap and count URLs that are blog posts. Use this bash pattern:

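# Count individual post URLs. The last grep drops section index pages
# such as /blog, /blog/ or /news/, over http or https.
curl -sL -A "Mozilla/5.0" "https://example.com/sitemap.xml" \
  | grep -o '<loc>[^<]*</loc>' \
  | grep -iE '/(blog|posts?|articles?|news)/' \
  | grep -viE '^<loc>https?://[^/]+/(blog|posts?|articles?|news)/?</loc>$' \
  | wc -l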

Important: Exclude the blog index page itself (e.g. /blog or /blog/) — count only individual post URLs.

Sitemap index handling: If the sitemap is an index (contains <sitemap> tags rather than <url> tags), extract the sub-sitemap URLs and fetch the blog-specific one:

curl -sL -A "Mozilla/5.0" "https://example.com/sitemap-index.xml" \
  | grep -o '<loc>[^<]*</loc>' \
  | grep -i 'blog'
# Then fetch that sub-sitemap and count
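The index handling can be combined with the count into one function. The sketch below is illustrative only: fetch, list_locs, count_posts and count_from_sitemap are hypothetical names, and it assumes sub-sitemap URLs contain no whitespace or glob characters and that blog sub-sitemaps have "blog" somewhere in their URL.

```shell
# fetch: download a URL to stdout (same curl flags as the commands above).
fetch() {
  curl -sL -A "Mozilla/5.0" "$1"
}

# list_locs: print the text of every <loc> element on stdin, one per line.
list_locs() {
  grep -o '<loc>[^<]*</loc>' | sed -E 's|</?loc>||g'
}

# count_posts: count post URLs on stdin (one per line), skipping section
# index pages such as /blog or /blog/.
count_posts() {
  grep -iE '/(blog|posts?|articles?|news)/' \
    | grep -viE '^https?://[^/]+/(blog|posts?|articles?|news)/?$' \
    | wc -l | tr -d ' '
}

# count_from_sitemap: count posts for one sitemap URL. If the document is an
# index (contains <sitemap> tags), sum the counts of every sub-sitemap whose
# URL mentions "blog"; otherwise count the URLs it lists directly.
count_from_sitemap() {
  xml=$(fetch "$1")
  if printf '%s' "$xml" | grep -q '<sitemap>'; then
    total=0
    for sub in $(printf '%s' "$xml" | list_locs | grep -i 'blog'); do
      n=$(fetch "$sub" | list_locs | count_posts)
      total=$((total + n))
    done
    printf '%s\n' "$total"
  else
    printf '%s' "$xml" | list_locs | count_posts
  fi
}
```

Calling count_from_sitemap "https://example.com/sitemap-index.xml" then prints a single number whether the URL points at an index or a plain sitemap.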

JS-rendered sites: Some blogs render their pages with JavaScript and may not expose post URLs through a sitemap. In that case, fall back to a site:example.com/blog search query and report the result count as an estimate.

Edge cases:

Some sites use /resources/, /insights/, /learn/, or /hub/ instead of /blog/ — check the sitemap structure if a /blog/ grep returns 0.
If the sitemap is very large (>1MB), grep for several blog path patterns at once rather than /blog/ alone.
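When a /blog/ grep returns 0, one way to check the sitemap structure is to tally URLs by their first path segment and see which section holds the content. This is a sketch under that assumption; top_sections is a hypothetical helper name.

```shell
# top_sections: read URLs on stdin (one per line) and print
# "<first-path-segment> <count>" lines, most common first. Only URLs with
# something below the first segment are tallied, so the home page and
# top-level pages such as /pricing are ignored.
top_sections() {
  sed -nE 's|^https?://[^/]+/([^/]+)/.+|\1|p' \
    | sort | uniq -c | sort -rn \
    | awk '{print $2, $1}'
}
```

Fed the <loc> values from a sitemap, a line such as "insights 212" near the top of the output would suggest /insights/ is where that site keeps its posts.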

Step 4 — Build the output

Once all counts are collected, produce a ranked table with the target company highlighted:

Blog Post Count — [Target] vs Competitors
==========================================

Rank  Company          Posts   URL
────  ───────────────  ──────  ──────────────────────
 1    Hackmamba        137     hackmamba.io           ← COMPETITOR
 2    Infrasity        137     infrasity.com          ← COMPETITOR  
 3  ▶ Kubiya           95      kubiya.ai              ← TARGET
 4    Orgn             7       orgn.com               ← COMPETITOR

▶ = your company   Total companies analysed: 4

Then add a brief summary:

Where the target ranks (e.g. "3rd out of 4")
Gap to the leader (e.g. "42 posts behind the top competitor")
Gap to the one above (if not already #1)
Any notable observations (e.g. very new site, or tied for first)
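The rank and gap figures in the summary can be derived mechanically from the collected counts. The sketch below assumes the counts are available as tab-separated "name, count" lines; rank_summary is a hypothetical helper, and it relies on the -s (stable) flag that GNU and BSD sort provide, so tied companies keep their input order and get adjacent ranks as in the example table.

```shell
# rank_summary: read "name<TAB>count" lines on stdin and print the target's
# rank, its gap to the leader, and its gap to the company ranked just above.
rank_summary() {
  sort -t "$(printf '\t')" -k2,2nr -s \
    | awk -F '\t' -v target="$1" '
        { posts[NR] = $2; if ($1 == target) r = NR }
        END {
          if (!r) { print "target not found"; exit 1 }
          printf "Rank: %d of %d\n", r, NR
          printf "Gap to leader: %d\n", posts[1] - posts[r]
          if (r > 1) printf "Gap to next above: %d\n", posts[r - 1] - posts[r]
        }'
}
```

With the example counts from the table above and Kubiya as the target, this reports rank 3 of 4 and a gap of 42 posts to both the leader and the company one place above.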

Output Format

Blog Post Count — [Target] vs Competitors
==========================================

Rank  Company          Posts   URL
────  ───────────────  ──────  ──────────────────────
 1    [Competitor A]   NNN     competitor-a.com       ← COMPETITOR
 2  ▶ [Target]         NNN     target.com             ← TARGET
 3    [Competitor B]   NNN     competitor-b.com       ← COMPETITOR

▶ = your company   Total companies analysed: N

Notes

Count from the sitemap whenever one is available; it is usually the most accurate and complete source. Use a search-operator estimate only when no usable sitemap exists.
Robots.txt is always the first place to look for the sitemap URL.
The /blog index page should never be counted as a post.
If a company has no blog section at all, report 0 and note it.
Sitemap data can lag real-time by days; note this if relevant.
Run all company lookups in sequence (one bash block per company) to avoid hitting rate limits.
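Sequential lookups can be sketched as a simple loop. Everything here is an assumption for illustration: count_all is a hypothetical name, count_one stands for whatever per-company counting function the caller defines (it takes a sitemap URL, prints a number, and must not read stdin), and the pause between companies, set through a DELAY variable, is an added precaution that the notes above do not require.

```shell
# count_all: read "name<TAB>sitemap-url" lines on stdin and process them one
# at a time, printing "name<TAB>count" for each and pausing in between.
# Relies on a caller-defined count_one function (hypothetical).
count_all() {
  delay=${DELAY:-1}
  tab=$(printf '\t')
  while IFS=$tab read -r name url; do
    [ -n "$name" ] || continue
    printf '%s\t%s\n' "$name" "$(count_one "$url")"
    sleep "$delay"
  done
}
```

The output is already in the tab-separated shape a ranking step would need, so it can be saved to a file and sorted by the count column.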
