Blog Analyzer: Quality Audit & Scoring

Scores blog posts on a 0-100 scale across 5 categories and provides prioritized improvement recommendations. Includes AI content detection analysis. Works with local files or published URLs.

Reference documents (paths from repo root):

skills/blog/references/quality-scoring.md: full scoring checklist
skills/blog/references/eeat-signals.md: E-E-A-T evaluation criteria
skills/blog/references/ai-slop-detection.md: two-tier reflex methodology (v1.8.0)
skills/blog/references/editorial-heuristics.md: ordinal 0-4 rubric, P0-P3 severity (v1.8.0, used with --rubric)
skills/blog/references/cognitive-load.md: per-section concept density (v1.8.0, used with --cognitive-load)

Input Handling

Local file: Read the file directly
URL: Fetch with WebFetch, extract content
Directory: Scan for blog files, audit all (batch mode)
Flags: --format json|table, --batch, --sort score, --rubric, --cognitive-load

Optional Modes (v1.8.0)

--rubric: in addition to the 100-point score, emit the ordinal 0-4 editorial-heuristics rubric with P0-P3 severity tags. See skills/blog/references/editorial-heuristics.md. The 100-point JSON schema is preserved; the rubric is added as a sibling rubric field.
--cognitive-load: run scripts/cognitive_load.py against the post and embed the per-section load heatmap as a sibling cognitive_load field. See skills/blog/references/cognitive-load.md.

Both modes are additive. The default behavior (no flags) is unchanged from v1.7.1.
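
A minimal sketch of how the additive behavior could be assembled, assuming a dict-based report. Only the `rubric` and `cognitive_load` field names come from the text above; `build_report`, the `score` key, and the payload shapes are hypothetical stand-ins.

```python
from typing import Any, Optional


def build_report(
    base: dict[str, Any],
    rubric: Optional[dict[str, Any]] = None,
    cognitive_load: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return the 100-point report with optional sibling fields attached.

    `base` stands in for the existing 100-point payload (its keys are
    hypothetical here). It is copied, so the default output is untouched.
    """
    report = dict(base)
    if rubric is not None:          # --rubric was passed
        report["rubric"] = rubric
    if cognitive_load is not None:  # --cognitive-load was passed
        report["cognitive_load"] = cognitive_load
    return report


default_report = build_report({"score": 82})
rubric_report = build_report({"score": 82}, rubric={"items": []})
```

With no flags the report equals the base payload, which is the sense in which both modes are additive.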

Scoring Process

Step 1: Content Extraction

Read the blog post and extract:

Frontmatter (title, description, date, lastUpdated, author, tags)
Heading structure (H1, H2, H3 with hierarchy)
Paragraph count and word counts per paragraph
Statistics (any number claims with or without sources)
Images (count, alt text presence, format)
Charts/SVGs (count, type diversity)
Links (internal, external, broken)
FAQ section presence
Schema markup (types present)
Meta tags (title, description, OG tags, twitter cards)
Sentence lengths for burstiness analysis
Vocabulary tokens for diversity scoring
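
Part of this extraction can be sketched for a Markdown source as follows. This is an illustration, not the skill's actual extractor: `extract_structure` is a hypothetical helper, it assumes ATX-style `#` headings on their own lines, and it ignores frontmatter, images, links, and schema markup.

```python
import re


def extract_structure(markdown: str) -> dict:
    """Pull headings, per-paragraph word counts, and sentence lengths."""
    headings = []
    paragraphs = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", markdown.strip()):
        block = block.strip()
        match = re.match(r"^(#{1,6})\s+(.*)$", block)
        if match:
            # Store (level, title), e.g. (2, "Why?") for "## Why?".
            headings.append((len(match.group(1)), match.group(2).strip()))
        elif block:
            paragraphs.append(block)

    sentence_lengths = []
    for paragraph in paragraphs:
        # Naive split after ., !, or ?; abbreviations will over-split.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            words = sentence.split()
            if words:
                sentence_lengths.append(len(words))

    return {
        "headings": headings,
        "paragraph_word_counts": [len(p.split()) for p in paragraphs],
        "sentence_lengths": sentence_lengths,
    }


sample = (
    "# Title\n\nShort one. This sentence is a bit longer.\n\n"
    "## Why?\n\nAnother paragraph here."
)
structure = extract_structure(sample)
```

The sentence lengths feed the burstiness analysis in Step 3, and the heading levels feed the hierarchy check under SEO Optimization.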

Step 2: Score Each Category

Load skills/blog/references/quality-scoring.md for the full checklist, then score each category:

Content Quality (30 points)

Check	Points	Pass Criteria
Depth/comprehensiveness	7	Covers topic thoroughly, no major gaps
Readability (Flesch 60-70)	7	Flesch 60-70 ideal, 55-75 acceptable; Grade 7-8; Gunning Fog 7-8
Originality/unique value markers	5	Original data, case studies, first-hand experience
Sentence & paragraph structure	4	Avg sentence 15-20 words, ≤25% over 20; paragraphs 40-80 words; H2 every 200-300 words
Engagement elements	4	Summary box, callouts, varied content blocks. Accepts: "TL;DR", "Key Takeaways", "The Bottom Line", "What You'll Learn", "At a Glance", "In Brief"
Grammar/anti-pattern	3	Passive voice ≤10%, AI trigger words ≤5/1K, transition words 20-30%, clean prose

Readability Bands (apply per persona, or use default):

Audience	Flesch-Kincaid Grade	Flesch Reading Ease	Scoring Impact
Consumer	6-8	60-80	Full points if in range
Professional	8-10	50-60	Full points if in range
Technical	10-12	30-50	Full points if in range
Default (no persona)	7-8	60-70	Current scoring unchanged

Content clarity is the #2 factor for AI citation probability (+32.83% score differential). The average US adult reads at a 7th-8th grade level.
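
The readability check can be sketched with the standard Flesch Reading Ease and Flesch-Kincaid Grade formulas. The formulas are the published ones; the band lookup mirrors the ease column of the table above. The function names are hypothetical, and syllable counting is left to the caller because any automatic counter is only an approximation.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher means easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Reading Ease ranges copied from the Readability Bands table.
EASE_BANDS = {
    "consumer": (60, 80),
    "professional": (50, 60),
    "technical": (30, 50),
    "default": (60, 70),
}


def in_band(ease: float, persona: str = "default") -> bool:
    """True when the ease score falls inside the persona's band."""
    low, high = EASE_BANDS[persona]
    return low <= ease <= high


# 100 words, 5 sentences, 150 syllables: 20 words per sentence, 1.5 syllables per word.
ease = flesch_reading_ease(100, 5, 150)
grade = flesch_kincaid_grade(100, 5, 150)
```

For this sample the ease score is about 59.6, which sits inside the Professional band but just below the default 60-70 range.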

SEO Optimization (25 points)

Check	Points	Pass Criteria
Heading hierarchy with keywords	5	H1 -> H2 -> H3, no skips, keyword in 2-3 headings
Title tag (40-60 chars, keyword, power word)	4	Front-loaded keyword, positive sentiment
Keyword placement/density	4	Natural integration, no stuffing, in first 100 words
Internal linking (3-10 contextual)	4	Descriptive anchor text, bidirectional
URL structure	3	Short, keyword-rich, no stop words, lowercase
Meta description (150-160 chars, stat)	3	Fact-dense, includes one statistic
External linking (tier 1-3)	2	3-8 outbound links to authoritative sources
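
Two of these checks are mechanical enough to sketch: the "no skips" heading rule and the character-length ranges for the title tag and meta description. The helper names are hypothetical, and heading levels are assumed to arrive as integers in document order.

```python
def heading_skips(levels: list[int]) -> list[tuple[int, int]]:
    """Return (previous, current) pairs where a heading jumps down more than one level."""
    skips = []
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:  # e.g. H2 followed directly by H4
            skips.append((previous, current))
    return skips


def length_in_range(text: str, low: int, high: int) -> bool:
    """True when the character count is within [low, high]."""
    return low <= len(text) <= high


clean = heading_skips([1, 2, 3, 2, 3])
broken = heading_skips([1, 2, 3, 2, 4])
title_ok = length_in_range("a" * 50, 40, 60)           # title tag: 40-60 chars
description_ok = length_in_range("a" * 120, 150, 160)  # meta description: 150-160 chars
```

Moving back up the hierarchy (H3 to H2) is not a skip; only a downward jump of more than one level is flagged.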

E-E-A-T Signals (15 points)

Check	Points	Pass Criteria
Author attribution (named, with bio)	4	Real name, credentials, not sales pitch
Source citations (tier 1-3, inline)	4	8+ unique stats, zero fabricated
Trust indicators	4	Contact page, about page, editorial policy
Experience signals	3	"When we tested...", original photos/data

When scoring source citations under E-E-A-T, evaluate whether each public statistic carries the FLOW evidence triple: year anchor in prose, inline citation with publisher and title, URL with retrieval date in the source block. Posts that cite tier 1-3 sources but lack retrieval dates score lower on this subcategory than posts that include the full triple. See skills/blog/references/flow-alignment.md for the standard.

Technical Elements (15 points)

Check	Points	Pass Criteria
Schema markup (3+ types = bonus)	4	BlogPosting + FAQ + Person minimum
Image optimization	3	AVIF/WebP, descriptive alt text, lazy except LCP
Structured data elements	2	Tables, lists, comparison blocks
Page speed signals	2	LCP < 2.5s, no render-blocking JS
Mobile-friendliness	2	Responsive, tap targets 48px+
OG/social meta tags	2	og:title, og:description, og:image, twitter:card

AI Citation Readiness (15 points)

Check	Points	Pass Criteria
Passage-level citability (120-180 words)	4	Self-contained sections with stat + source
Q&A formatted sections	3	60-70% of H2s as questions, FAQ present
Entity clarity	3	Unambiguous topic entity, consistent terminology
Content structure for extraction	3	Answer-first, tables with thead, comparison formats
AI crawler accessibility	2	SSR/SSG, no JS-gated content
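
The two numeric criteria in this table can be sketched as below. Both helpers are hypothetical: the question test looks only for a trailing question mark, and the passage check assumes per-section word counts have already been extracted.

```python
def question_h2_ratio(h2_titles: list[str]) -> float:
    """Fraction of H2 headings phrased as questions (trailing '?')."""
    if not h2_titles:
        return 0.0
    questions = sum(1 for title in h2_titles if title.strip().endswith("?"))
    return questions / len(h2_titles)


def citable_passages(section_word_counts: list[int], low: int = 120, high: int = 180) -> int:
    """Count sections whose length falls in the 120-180 word citability window."""
    return sum(1 for count in section_word_counts if low <= count <= high)


ratio = question_h2_ratio(["What is X?", "How does Y work?", "Setup"])
passages = citable_passages([100, 150, 180, 200])
```

A ratio of two out of three lands inside the 60-70% target for question-style H2s.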

Step 3: AI Content Detection

Analyze the post for AI-generated content risk:

Burstiness Score (sentence length variance):

Calculate standard deviation of sentence lengths across the post
Human writing: high variance (short punchy + long complex sentences)
AI writing: low variance (consistently medium-length sentences)
Score: 0-10 scale (10 = very human-like burstiness)
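
A minimal sketch of the burstiness calculation. The standard deviation comes from the description above; the mapping onto the 0-10 scale is an assumption (linear, capped at a spread of 10 words), since this document does not define one.

```python
from statistics import pstdev


def burstiness(sentence_lengths: list[int], cap: float = 10.0) -> tuple[float, float]:
    """Return (standard deviation, 0-10 score) for sentence lengths in words.

    The `cap` parameter and the linear scaling are hypothetical choices.
    """
    if len(sentence_lengths) < 2:
        return 0.0, 0.0
    spread = pstdev(sentence_lengths)
    score = min(cap, spread) / cap * 10
    return spread, score


uniform = burstiness([15, 15, 15, 15])  # every sentence the same length
varied = burstiness([4, 30, 8, 22])     # short and long sentences mixed
```

Uniform sentence lengths score 0, while the mixed sample exceeds the cap and scores 10.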

Known AI Phrase Detection: flag occurrences of these 17 phrases, plus em dashes:

"It's important to note"
"In today's digital landscape"
"Delve into"
"Navigating the complexities"
"Let's explore"
"Furthermore"
"In conclusion"
"It is worth mentioning"
"Embark on"
"Cutting-edge"
"Leverage" (as a verb, non-financial context)
"Game-changer"
"Revolutionize"
"Streamline"
"Harness the power"
"Dive deep"
"Unlock the potential"
Em dashes (U+2014): count all instances, flag as AI writing pattern
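
A sketch of the phrase scan, assuming case-insensitive whole-phrase matching. The phrase list is copied from above; `phrase_hits`, `em_dash_count`, and `rate_per_1k` are hypothetical helpers. The sketch does not catch inflected forms such as "leveraging", and it does not apply the verb or non-financial context condition on "Leverage".

```python
import re

AI_PHRASES = [
    "It's important to note", "In today's digital landscape", "Delve into",
    "Navigating the complexities", "Let's explore", "Furthermore",
    "In conclusion", "It is worth mentioning", "Embark on", "Cutting-edge",
    "Leverage", "Game-changer", "Revolutionize", "Streamline",
    "Harness the power", "Dive deep", "Unlock the potential",
]


def phrase_hits(text: str) -> dict[str, int]:
    """Count case-insensitive whole-phrase matches for each listed phrase."""
    # Normalize curly apostrophes so "Let's" matches either style.
    lowered = text.replace("\u2019", "'").lower()
    hits = {}
    for phrase in AI_PHRASES:
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        count = len(re.findall(pattern, lowered))
        if count:
            hits[phrase] = count
    return hits


def em_dash_count(text: str) -> int:
    """Count em dash characters (U+2014)."""
    return text.count("\u2014")


def rate_per_1k(total_hits: int, word_count: int) -> float:
    """Hits per 1,000 words, for comparison with the 5/1K threshold."""
    return total_hits * 1000 / word_count


sample = (
    "Let's explore the topic. Furthermore, we delve into details "
    "\u2014 and let's explore more."
)
hits = phrase_hits(sample)
dash_count = em_dash_count(sample)
```

The per-1K rate lines up with the "AI trigger words ≤5/1K" criterion in the Content Quality table.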

Vocabulary Diversity (Type-Token Ratio):

Calculate unique words / total words
Human writing: TTR typically 0.4-0.6 for long-form
AI writing: TTR often below 0.35 (repetitive vocabulary)
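
The ratio itself is a one-line calculation. The tokenizer below is an assumption (lowercased runs of ASCII letters and apostrophes); a different tokenizer will shift the result. TTR also tends to fall as a text gets longer, so comparisons are fairest between posts of similar length.

```python
import re


def type_token_ratio(text: str) -> float:
    """Unique words divided by total words, case-insensitive."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


ttr = type_token_ratio("The cat sat on the mat")  # 5 unique words out of 6
```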

AI Content Risk Assessment:

Flag if AI probability > 50% based on combined signals
Provide specific passages that triggered the flag
Recommend humanization: personal anecdotes, varied sentence rhythm, domain jargon

Step 4: Determine Rating

Score	Rating	Action
90-100	Exceptional	Publish as-is, flagship content
80-89	Strong	Minor polish, ready for publication
70-79	Acceptable	Targeted improvements needed
60-69	Below Standard	Significant rework required
< 60	Rewrite	Fundamental issues, start from outline
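
The category maximums and rating bands above combine into a simple lookup. This is a sketch: the dictionary keys and the `rate` function are hypothetical names, while the point caps and thresholds are taken from the tables in this document.

```python
# Maximum points per category, from the Step 2 tables (sums to 100).
CATEGORY_MAX = {
    "content_quality": 30,
    "seo": 25,
    "eeat": 15,
    "technical": 15,
    "ai_citation": 15,
}

# (minimum total, rating), checked from highest to lowest.
RATING_BANDS = [
    (90, "Exceptional"),
    (80, "Strong"),
    (70, "Acceptable"),
    (60, "Below Standard"),
]


def rate(scores: dict[str, int]) -> tuple[int, str]:
    """Validate category scores, sum them, and map the total to a rating."""
    for name, value in scores.items():
        if not 0 <= value <= CATEGORY_MAX[name]:
            raise ValueError(f"{name} must be between 0 and {CATEGORY_MAX[name]}")
    total = sum(scores.values())
    for floor, label in RATING_BANDS:
        if total >= floor:
            return total, label
    return total, "Rewrite"


result = rate({
    "content_quality": 26, "seo": 20, "eeat": 12, "technical": 12, "ai_citation": 11,
})
```

The sample post totals 81 points and is rated Strong.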
