Anthropic-Grade Optimizer

Audits any Claude-directing artifact against the official Anthropic doctrine, calibrates findings by target model, and proposes surgical optimizations that preserve authorial voice. Every finding cites a verbatim source URL; the skill ships with cited rules only.

Hierarchy of authority
Honest scope
Quick start (and worked examples)
The Three Laws
Artifact types (9)
Target-model modulation
The 11 dimensions
Severity → triage
Emphasis conflict (D9, type-aware)
Detection methods
Operating modes
Adaptive modes (auto-trigger)
Output format (concise default)
Workflow (10 steps)
Scope discipline (positive framing)
Edge cases
Self-audit: Open Questions
References (10 files)
Assets (canonical snippets)
Scripts (6 entry points)
Validation (eval suite + strange-loop)

Hierarchy of authority

Anthropic doctrine is the sole source of scoring rules. Optional interpretive lenses inform how to reason about findings (what to preserve, when to defer, how to phrase recommendations) — they stay outside the rubric. On collision, Anthropic always wins.

Honest scope

189 unique cited rules across 11 dimensions, each with a verbatim quote and source URL. Read references/rules-anthropic.yaml § meta.total_unique_rules for the canonical count — every other counter in the skill must read from there to prevent drift. Rules with deterministic detection (regex, code-check) are audited automatically by scripts/pass1_mechanical.py. Rules with qualitative criteria (llm-judge, heuristic) are audited by following references/pass2-protocol.md. Coverage gaps and known limitations live in references/gaps.md and must appear in the coverage_caveat block of every report.

Pass-1 deterministic coverage in v1.2: ~52 rules (~28% of 189). Pass-2 covers the remaining ~137 qualitative rules. Hybrid detection is supported via the requires_pass2_grade flag on findings.

Quick start

Canonical entry point (default):

python scripts/run.py <artifact_path> --target opus-4-7 --mode audit

The orchestrator chains classify → pass 1 → score → emit, and surfaces Pass 2 prompts for the LLM. This is the single right answer for ~95% of audits.

Manual flow is for one specific case: an operator forcing fine-grained control over a single phase (re-running pass 1 only, scoring an external findings file, debugging the classifier). Do not use it as a parallel path — the orchestrator is the contract.

Ingest — receive the artifact path or content, target model (default opus-4-7), mode (default audit).
Classify — Run scripts/classify_artifact.py to detect artifact type and load the rule subset from references/rubric-by-type.yaml.
Audit — Pass 1 mechanical via scripts/pass1_mechanical.py. Pass 2 qualitative reasoning following references/pass2-protocol.md against references/rules-anthropic.yaml.
Diagnose — Classify each finding as 🔴 must-fix, 🟡 should-fix, 🟢 may-fix, ❓ open-question, or ⚪ preserve (authorial voice).
Optimize (when mode = optimize or full) — Produce surgical diffs per references/pass2-protocol.md § Diff Generation. Each diff cites source_url + verbatim_quote. When a rule has a canonical snippet, emit a verbatim patch from assets/snippets/ rather than paraphrasing.
Validate (when mode = full) — Re-score post-diff; abort when a hard rule is introduced or voice drift exceeds 10%.
Emit — Concise report (summary + scorecard + diff). Verbose mode adds reasoning trail, preservation log, and open questions.

Worked example 1 — auditing a SKILL.md for Opus 4.7

<example> Operator request: "audit this skill for Opus 4.7" Artifact: `~/.claude/skills/pdf-tools/SKILL.md` (240 lines)

Step 1 — Classify: type=skill, has_frontmatter=true, body_lines=235. Step 2 — Load rubric: 24 D-SKILL rules + 17 D-CLAR + 10 D-STRUCT + 8 D-EXAMPLE + 5 D-EVAL + safety. Suppress AR-CC-S09 doctrine-conflict? No (skill body). Step 3 — Pass 1 fires: AR-CC-S20 (lib mentioned without pip install), AR-CC-S22 (3 script refs unframed), AR-CLAR-006 (7 negatives, 1 positive alt). Step 4 — Triage: 1 🔴 (AR-CC-S20), 2 🟢 (S22, CLAR-006). Step 5 — Optimize emits snippet patches: none (no rule with canonical snippet fires). Inline diff for AR-CC-S20 adds "Install required package: pip install pypdf". Step 6 — Validate: post-diff score 92 (was 78). Voice drift 4% — under 10% gate. Step 7 — Emit concise report. </example>

Worked example 2 — strange-loop self-audit

<example> Operator request: "run the skill on its own SKILL.md" Artifact: `anthropic-grade-optimizer/SKILL.md`

Step 3 — Pass 1 fires: AR-CC-S14 (name contains "anthropic" reserved word) and possibly AR-CC-S21 (TOC) and AR-CC-S22 (script framing). Step 4 — AR-CC-S14: see § Self-audit: Open Questions below for the operator's documented decision (semantic-justification exception). Step 5 — No-op for the AR-CC-S14 finding (declared exception). Step 7 — Concise report flags the open question and links to the §. </example>

Worked example 3 — auditing an api_config snippet

<example> Operator request: "is this Python snippet safe for Opus 4.7?" Artifact: a `client.messages.create(...)` call with `temperature=0.7`, `effort='low'`, last message role=assistant. Type: `api_config`. Target: `claude-opus-4-7`.

Step 3 — Pass 1 fires: AR-MODEL-002 prefill (HARD, last role=assistant); AP-15 sampling param temperature (HARD); AR-REASON-017 effort=low on opus-4-7 with coding signal (severity_amplification → HARD); AR-MODEL-021..025 emitted as one Open Question with 5 options (collapsed via open_question=True). Step 4 — Triage: 3 🔴 (002, AP-15, REASON-017), 1 ❓ (021..025). Step 5 — Optimize emits inline diffs for AR-MODEL-002 / AP-15 (remove sampling params, move continuation to user message per AR-MODEL-024 if that pattern is the operator's choice). </example>

The Three Laws

These three laws encode the discipline that separates Anthropic-grade from "looks rigorous":

Cite or stay silent. Every 🔴 / 🟡 finding carries a source_url. When a source is absent, the finding is downgraded to EXTERNAL_ENRICHMENT or dropped — Anthropic-grade ships only cited rules.
Artifact type comes first. Apply the rule subset for the detected type. Firing a SKILL.md rule against a CLAUDE.md is a false positive — see references/rubric-by-type.yaml § false_positive_rules.
Voice drift trumps score. Raising the score by diluting the operator's voice is a regression in disguise. Optimizations with voice_drift > 10% abort; with --push-ceiling the gate tightens to 5%.

Artifact types (9)

Detection happens on filename plus content; each type loads a tailored rule subset from references/rubric-by-type.yaml:

Type	Path signal	Primary dimensions
`claude_md`	`CLAUDE.md`, `CLAUDE.local.md`	D-CC (memory), D-CLAR, D-STRUCT
`skill`	`SKILL.md` in `skills/<name>/`	D-CC (skill), D-CLAR, D-STRUCT
`slash_command`	`.claude/commands/<name>.md` (legacy)	D-CC (skill subset)
`subagent`	`.claude/agents/<name>.md`	D-CC (subagent), D-AGENT, D-CLAR
`hook_config`	`settings.json` `hooks` key	D-CC (hooks)
`mcp_config`	`.mcp.json`, `.claude.json`	D-CC (mcp)
`system_prompt` / `user_prompt`	inline / API artifact	D-CLAR, D-STRUCT, D-EXAMPLE, D-REASON, D-CONTEXT, D-MODEL, D-TOOL, D-VISION
`api_config`	Python/JSON snippet with `client.messages.create` or `model="claude-..."`	D-MODEL, D-REASON, D-CONTEXT (cache), D-TOOL, D-VISION
`workflow`	YAML pipeline	D-AGENT, D-EVAL

Target-model modulation

Each model has a profile cell in references/modulation-matrix.yaml. Critical anti-patterns flagged automatically:

Opus 4.7 — rejects temperature

anthropic-grade-optimizer

How to add

Drop this on your repo README

Related skills

MoneyPrinterTurbo

weather-svg-creator

azure-keyvault-secrets-rust

azure-monitor-ingestion-py

Get new Automação skills every Monday