AI Slop Detection
Slop is a density problem, not a word problem.
A single "delve" is fine. Five "delves" near a "tapestry" and an "embark" is generated text. This skill scores density per 100 words, marker clustering, and whether the overall register fits the document type. It does not ban words; it flags concentrations.
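As a rough illustration of density and clustering, here is a minimal sketch. The `MARKERS` set, the function names, and the 50-word window are assumptions made for this example, not the skill's actual pattern lists or thresholds.

```python
import re

# Hypothetical marker list for illustration only; the real lists live in the pattern modules.
MARKERS = {"delve", "delves", "tapestry", "embark", "realm", "beacon"}


def _words(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def marker_density(text: str, markers: set[str] = MARKERS) -> float:
    """Return marker hits per 100 words (0.0 for empty text)."""
    words = _words(text)
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in markers)
    return 100.0 * hits / len(words)


def densest_window(text: str, window: int = 50, markers: set[str] = MARKERS) -> int:
    """Return the largest number of marker hits inside any `window` consecutive words."""
    flags = [w in markers for w in _words(text)]
    best = current = sum(flags[:window])
    for i in range(window, len(flags)):
        # Slide the window one word to the right.
        current += flags[i] - flags[i - window]
        best = max(best, current)
    return best
```

A single hit in a long document yields a low density and a window count of 1; several hits packed into one window are what the clustering check is meant to surface.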
Execution Workflow
Identify target files and classify them as technical docs, narrative prose, or code comments. Classification feeds context-aware scoring: tier-1 markers in marketing copy score lower than the same markers in API reference.
Language Detection
- Auto-detect language from text content using function word frequency
- Override with the explicit `--lang` parameter (en, de, fr, es)
- Load language-specific patterns from `data/languages/{lang}.yaml`
- Fall back to English if detection confidence is low
- See `modules/language-handling.md` for cultural calibration and concrete pattern sets
Vocabulary and Phrase Detection
Load: @modules/vocabulary-patterns.md
Markers fall into three confidence tiers. Tier 1 words ("delve", "multifaceted", "leverage") appear far more often in AI text than human text. Tier 2 covers context-dependent transitions ("moreover", "subsequently"). Tier 3 covers vapid phrases ("In today's fast-paced world", "cannot be overstated").
| Word | Context | Human Alternative |
|---|---|---|
| delve | "delve into" | explore, examine, look at |
| tapestry | "rich tapestry" | mix, combination, variety |
| realm | "in the realm of" | in, within, regarding |
| embark | "embark on a journey" | start, begin |
| beacon | "a beacon of" | example, model |
| spearheaded | formal attribution | led, started |
| multifaceted | describing complexity | complex, varied |
| comprehensive | describing scope | thorough, complete |
| pivotal | importance marker | key, important |
| nuanced | sophistication signal | subtle, detailed |
| meticulous/meticulously | care marker | careful, detailed |
| intricate | complexity marker | detailed, complex |
| showcasing | display verb | showing, displaying |
| leveraging | business jargon | using |
| streamline | optimization verb | simplify, improve |
Tier 2: Medium-Confidence Markers (Score: 2 each)
Common but context-dependent:
| Category | Words |
|---|---|
| Transition overuse | moreover, furthermore, indeed, notably, subsequently |
| Intensity clustering | significantly, substantially, fundamentally, profoundly |
| Hedging stacks | potentially, typically, often, might, perhaps |
| Action inflation | revolutionize, transform, unlock, unleash, elevate |
| Empty emphasis | crucial, vital, essential, paramount |
Tier 3: Phrase Patterns (Score: 2-4 each)
| Phrase | Score | Issue |
|---|---|---|
| "In today's fast-paced world" | 4 | Vapid opener |
| "It's worth noting that" | 3 | Filler |
| "At its core" | 2 | Positional crutch |
| "Cannot be overstated" | 3 | Empty emphasis |
| "A testament to" | 3 | Attribution cliche |
| "Navigate the complexities" | 4 | Business speak |
| "Unlock the potential" | 4 | Marketing speak |
| "Treasure trove of" | 3 | Overused metaphor |
| "Game changer" | 3 | Buzzword |
| "Look no further" | 4 | Sales pitch |
| "Nestled in the heart of" | 4 | Travel writing cliche |
| "Embark on a journey" | 4 | Melodrama |
| "Ever-evolving landscape" | 4 | Tech cliche |
| "Hustle and bustle" | 3 | Filler |
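One way the tiers might combine into a single per-100-word score is sketched below. The Tier 2 weight and the Tier 3 phrase scores come from the tables above; the Tier 1 weight of 3 is an assumption, since this section does not state it, and the word sets are abbreviated samples.

```python
import re

TIER1_WEIGHT = 3  # assumption: the Tier 1 score is not stated in this section
TIER2_WEIGHT = 2  # from "Score: 2 each"

# Abbreviated samples of the tables above.
TIER1 = {"delve", "tapestry", "realm", "pivotal", "multifaceted"}
TIER2 = {"moreover", "furthermore", "crucial", "vital", "paramount"}
TIER3 = {
    "in today's fast-paced world": 4,
    "it's worth noting that": 3,
    "at its core": 2,
    "cannot be overstated": 3,
}


def slop_score(text: str) -> float:
    """Return the weighted marker score per 100 words (0.0 for empty text)."""
    # Normalize curly apostrophes so phrase matching is not defeated by typography.
    lowered = text.lower().replace("\u2019", "'")
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    raw = sum(TIER1_WEIGHT for w in words if w in TIER1)
    raw += sum(TIER2_WEIGHT for w in words if w in TIER2)
    raw += sum(score * lowered.count(phrase) for phrase, score in TIER3.items())
    return 100.0 * raw / len(words)
```

Because the result is normalized by length, one marker in a long document barely moves the score, while the same marker in a two-sentence blurb does.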
Step 3: Structural Pattern Detection
Load: @modules/structural-patterns.md
Em Dash Overuse
Em-dash overuse is the single most-cited 2026 AI tell across Wikipedia, the Field Guide, and the Algorithmic Bridge. Detection runs in two modes:
Audit mode (forensic, applied to unknown prose):
- 0-1 per 1000 words: Normal human range
- 2-4: Elevated, review usage
- 5+: Strong AI signal
Prevention mode (applied to docs the agent just generated):
- Target zero. Every em-dash is a finding.
- Replace with commas (asides), parentheses (tangents), colons (definitions), or periods (separate thoughts). See `modules/structural-patterns.md` § Em Dash Analysis for the full replacement table.
```bash
# Count em dashes in file
grep -o '—' file.md | wc -l
```
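The two modes could be expressed roughly as below. The bands come from the audit list above; treating fractional rates by lower bound (anything under 2 is normal, anything under 5 is elevated) is an assumption, and the function names are hypothetical.

```python
EM_DASH = "\u2014"


def em_dash_rate(text: str) -> float:
    """Return em dashes per 1000 whitespace-separated words."""
    words = len(text.split())
    if words == 0:
        return 0.0
    return 1000.0 * text.count(EM_DASH) / words


def audit_band(rate: float) -> str:
    """Map a per-1000-words rate onto the audit bands (fractional handling is assumed)."""
    if rate < 2:
        return "normal"
    if rate < 5:
        return "elevated"
    return "strong AI signal"


def prevention_findings(text: str) -> list[int]:
    """Prevention mode: return the 1-based number of every line containing an em dash."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if EM_DASH in line
    ]
```

In prevention mode, any non-empty result from `prevention_findings` is a finding, regardless of document length.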
Tricolon Detection
AI loves groups of three with alliteration:
- "fast, efficient, and reliable"
- "clear, concise, and compelling"
- "robust, reliable, and resilient"
Pattern: adjective, adjective, and adjective, often with similar sounds.
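A regex-only approximation of this check might look like the sketch below. It cannot tell adjectives from other words (that would need a part-of-speech tagger), so it matches any three single words in the "X, Y, and Z" shape and reports shared initial letters as a crude alliteration hint. The names are hypothetical.

```python
import re

# Matches "word, word, and word" (the comma before "and" is optional).
TRICOLON = re.compile(r"\b([A-Za-z-]+), ([A-Za-z-]+),? and ([A-Za-z-]+)\b")


def find_tricolons(text: str) -> list[tuple[str, bool]]:
    """Return (matched phrase, alliterative?) for each three-item group found."""
    results = []
    for match in TRICOLON.finditer(text):
        initials = {word[0].lower() for word in match.groups()}
        # Fewer than three distinct initials means at least two words share one.
        results.append((match.group(0), len(initials) < 3))
    return results
```

A single tricolon is ordinary English; as with vocabulary, it is the count per document that matters.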
List-to-Prose Ratio
Count bullet points vs paragraph sentences:
- >60% bullets: AI tendency
- Emoji-led bullets: Strong AI signal in technical docs
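The ratio above can be approximated as follows. Counting sentences by terminal punctuation and detecting emoji through the Unicode "So" (Symbol, other) category are simplifications chosen for this sketch; the function names are hypothetical.

```python
import re
import unicodedata

# A bullet is a line starting with -, *, +, or a numbered marker such as "1." or "1)".
BULLET = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+")


def bullet_ratio(markdown: str) -> float:
    """Return bullets / (bullets + prose sentences), or 0.0 for empty input."""
    bullets = 0
    sentences = 0
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if BULLET.match(line):
            bullets += 1
        else:
            # Count sentence-ending punctuation followed by whitespace or end of line.
            sentences += len(re.findall(r"[.!?](?:\s|$)", stripped))
    total = bullets + sentences
    return bullets / total if total else 0.0


def is_emoji_led(line: str) -> bool:
    """Approximate check: does the bullet text start with a 'Symbol, other' character?"""
    match = BULLET.match(line)
    if not match:
        return False
    rest = line[match.end():]
    return bool(rest) and unicodedata.category(rest[0]) == "So"
```

A document trips the threshold when `bullet_ratio(text) > 0.6`.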
Sentence Length Uniformity
Measure standard deviation of sentence lengths:
- Low variance (SD < 5 words): AI monotony
- High variance (SD > 10 words): Human variation
Paragraph Symmetry
AI produces "blocky" text with uniform paragraph lengths. Check whether paragraphs cluster around the same word count.
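One possible way to quantify "blocky" is the coefficient of variation of paragraph word counts, sketched below. The 0.2 cutoff and the three-paragraph minimum are illustrative assumptions, not thresholds from the skill.

```python
import re
import statistics


def paragraph_word_counts(text: str) -> list[int]:
    """Split on blank lines and return the word count of each non-empty paragraph."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [len(p.split()) for p in paragraphs]


def is_blocky(text: str, max_cv: float = 0.2, min_paragraphs: int = 3) -> bool:
    """Flag text whose paragraph lengths cluster tightly around the mean."""
    counts = paragraph_word_counts(text)
    if len(counts) < min_paragraphs:
        return False  # too few paragraphs to judge
    mean = statistics.mean(counts)
    # Coefficient of variation: spread relative to the average paragraph length.
    cv = statistics.pstdev(counts) / mean
    return cv <= max_cv
```

Using a relative measure keeps the check comparable between documents with short and long paragraphs.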
Step 4: Identity & Voice Leak Sweep (P0)
Load: @modules/identity-and-voice-leaks.md
Some patterns are not slop: they are direct evidence that AI-generated text leaked into a published artifact. A single match in this class fails review independently of any other score.
Scan for:
- Identity leaks ("As a large language model", "as of my training cutoff", "I cannot provide"). Severity: critical, no exceptions.
- Conversational voice leaks ("Hope this helps!", "Great question!", "Sure!") outside transcript blocks.
- Self-narration of structure ("In this section, we will cover...", "Let's dive into...", "By the end of this guide...").
- Hedging seesaw ("While X has its merits, it's not without its challenges").
- Contrastive constructions as paragraph openers: both contrastive negation ("not just X, but Y", "It's not X, it's Y") and affirmative antithesis ("Less X, more Y", "Where others X, we Y"). Avoid in all but the most necessary cases; keep only when the contrast carries information that survives removal.
See the module for the full pattern catalogue and false-positive guidance.
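The "single match fails review" rule for identity leaks can be sketched as a plain scan. The three patterns are the examples quoted above, not the module's full catalogue, and the function names are hypothetical. This sketch does not implement the transcript-block exemption that applies to conversational voice leaks.

```python
import re

# Example phrases from the list above; the module holds the full catalogue.
IDENTITY_LEAKS = [
    r"as a large language model",
    r"as of my training cutoff",
    r"i cannot provide",
]
LEAK_PATTERN = re.compile("|".join(IDENTITY_LEAKS), re.IGNORECASE)


def identity_leaks(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, matched text) for every identity leak found."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in LEAK_PATTERN.finditer(line):
            findings.append((number, match.group(0)))
    return findings


def passes_review(text: str) -> bool:
    """A single identity leak fails review, independent of any density score."""
    return not identity_leaks(text)
```

Keeping this check separate from the density score reflects the rule that it is a hard gate rather than one more weighted signal.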
Step 4.5: Sycophantic Pattern Detection
Especially relevant for conversational or instructional content (complements Class 2 of the identity-and-voice-leaks module):
| Phrase | Issue |
|---|---|
| "I'd be happy to" | Servile opener |
| "Great question!" | Empty validation |
| "Absolutely!" | Over-agreement |
| "That's a wonderful point" | Flattery |
| "I'm glad you asked" | Filler |
| "You're absolutely right" | Sycophancy |
These phrases add no information and signal generated content.
Step 4.6: Tier 5 / 2026 Patterns (Prevention-Strict)
The 2026 cross-source consensus (Wikipedia Signs of AI
writing, Algorithmic Bridge 10 Signs, Ignorance.ai Field
Guide, Stop-Slop Claude skill, George Kao, ContentBeta,
OliviaCal) identifies a handful of shapes that dominate
post-GPT-5 / post-Claude-4.5 prose. Each is detailed in
@modules/vocabulary-patterns.md (lexical form) and
@modules/structural-patterns.md (structural form).
| Pattern | Form | Why it matters |
|---|---|---|
| Em-dash overuse | — used as rhetorical pause | Most-cited single tell of 2026 |
| Plus-sign for "and" | "hooks + skills" in prose | Strong: humans have "and" |
| Spatial copula | "lives in", "sits at", "stands as", "boasts" | Inanimate subject with animate verb |
| Negative parallelism (contrastive negation) | "Not X but Y", "No X. No Y. Just Z.", "No X, no Y, no Z", "It's not X, it's Y", "Y, not X" | Rhetorical scaffold with no argument |
| Contrastive parallelism (affirmative antithesis) | "Less X, more Y", "Where others X, we Y", "Humans propose; machines dispose" | Manufactured punch; same scaffold without the "not" |
| Throat-clearing openers | "Here's the thing,", "Look,", "Let that sink in." | Discourse markers signaling nothing |
| Three-fragment burst | "Focused. Aligned. Measurable." | Rhythm without information |
| Significance cluste |