Delivery Metrics Check

Assess delivery health using product-type-appropriate metrics. Check product_type from .claude/diamonds/active.yml to determine which assessment to run.

Product type routing (v0.11.0):

software: Full DORA + APEX assessment (Parts 1-3 below)
content_course, content_publication, content_media: Content Delivery Assessment (Part 4 below)
ai_tool: AI Tool Assessment (Part 5 below) + DORA/APEX if code components exist
service_offering: Service Delivery Assessment (Part 6 below)
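
The routing above can be sketched as a lookup table. This is a minimal illustration, not part of the skill: the names `ROUTES` and `parts_for` and the `has_code_components` flag are hypothetical, and reading "DORA/APEX" for ai_tool as Parts 1 and 2 only is an assumption.

```python
# Part numbers per product type, taken from the routing list above.
ROUTES = {
    "software": [1, 2, 3],
    "content_course": [4],
    "content_publication": [4],
    "content_media": [4],
    "ai_tool": [5],
    "service_offering": [6],
}


def parts_for(product_type: str, has_code_components: bool = False) -> list[int]:
    """Return the assessment parts to run for a product type."""
    if product_type not in ROUTES:
        raise ValueError(f"unknown product_type: {product_type!r}")
    parts = list(ROUTES[product_type])
    # Assumption: "DORA/APEX" for ai_tool means Parts 1 and 2 only.
    if product_type == "ai_tool" and has_code_components:
        parts += [1, 2]
    return parts
```

Raising on an unknown product type, rather than defaulting to the software assessment, keeps a typo in active.yml from silently running the wrong check.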

Preflight: Read target canvas file(s) before any Write/Edit

Hard rule. Before issuing Write or Edit against any .claude/canvas/*.yml, use the Read tool on that file in this session. Claude Code's Read-before-Write check requires the Read tool specifically — cat/head/grep via Bash do NOT satisfy it.

Edit vs Write — different cost profiles (verified 2026-05-14):

Edit (exact-string replacement): Read with limit: 1 satisfies the check at ~50 tokens. State-tracking is per-file, not per-byte — subsequent Edit calls work anywhere in the file. Use this for partial updates against large canvas files (e.g., purpose.yml at 800+ lines).
Write (full replacement): do a full Read first. Write obliterates the file; you should see what you're about to replace. The limit:1 shortcut is not appropriate here.

ID-bearing entries — scan the ID space before assigning (added 2026-05-15, v0.23.19): When adding a new component, opportunity, solution, or any other ID-bearing entry to a canvas file, run a Bash grep first to confirm the next ID in your prefix sequence is actually free:

grep "^  - id: <prefix>-" .claude/canvas/<file>.yml | sort -u

Replace <prefix> with the canvas's ID prefix (comp for landscape, opp for opportunities, sol for solutions, ht for human-tasks, etc.). Then pick the next free integer. validate_canvas.py has a duplicate-ID check (lines 230-239) that catches the failure on CI, but a duplicate can persist in the working tree for days if CI isn't run between edit and discovery — see roadmap-repo corrections.md 2026-05-15 "Duplicate canvas ID created in landscape.yml" for the worked example.
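
For cases where the grep output is long, the same scan can be done programmatically. The sketch below assumes IDs have the form `<prefix>-<integer>` on lines shaped like the grep pattern above, and it treats "next free" as one past the highest used number. The function name `next_free_id` is hypothetical and is not part of validate_canvas.py.

```python
import re


def next_free_id(canvas_text: str, prefix: str) -> int:
    """Return one past the highest integer used with this ID prefix."""
    # Mirrors the grep pattern: two spaces, "- id: ", then "<prefix>-<digits>".
    pattern = re.compile(rf"^  - id: {re.escape(prefix)}-(\d+)\b", re.MULTILINE)
    used = {int(match.group(1)) for match in pattern.finditer(canvas_text)}
    # An unused prefix starts at 1 (assumption: numbering is 1-based).
    return max(used, default=0) + 1
```

The function returns a bare integer so that the caller applies whatever zero-padding the canvas file already uses.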

Original failure mode: anti-pattern #7 instance #5, 2026-05-09 — agent conflated Bash head with the Read tool, lost ~14k tokens to a Write-fail → remedial-full-Read → re-Write loop. The limit:1 discipline (graduated 2026-05-14, v0.23.18) prevents the second-order cost where the agent correctly follows the rule but full-Reads every time. The ID-scan discipline (graduated 2026-05-15, v0.23.19) prevents the related class where the agent reads enough of the file to satisfy the Edit check but not enough to see existing ID assignments — kin to anti-pattern #8 (Stale State Read).

If this skill writes to multiple canvas files, register each one first (limit:1 for Edit-only paths; full Read for Write paths) AND ID-scan any prefix you intend to assign.

See CLAUDE.md Canvas writes — Read before Write for the canonical rule.

Software Products

Assess delivery health using Forsgren's five DORA metrics AND LinearB's APEX AI-era metrics.

Part 1: DORA Metrics (Forsgren)

Gather current metrics from CI/CD, deployment logs, incident records.

Note: DORA expanded from 4 to 5 metrics. "MTTR" was renamed to "Failed Deployment Recovery Time" (FDRT) for precision — the original name was ambiguous with other mean-time-to-X metrics. "Reliability" was added as the 5th metric in the 2024 State of DevOps report.

Deployment Frequency: How often does code reach production?

Elite: On-demand (multiple deploys/day)
High: Between once/day and once/week
Medium: Between once/week and once/month
Low: Less than once/month
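
As a rough sketch, the bands above can be applied to a deploy count over an observation window. The function name and the exact boundary handling are assumptions: exactly one deploy per day is treated as High, and a month is taken as 30 days.

```python
def classify_deploy_frequency(deploy_count: int, window_days: int) -> str:
    """Map a deploy count over a window to a DORA performance level."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    per_day = deploy_count / window_days
    if per_day > 1:           # multiple deploys per day
        return "Elite"
    if per_day >= 1 / 7:      # at least once per week
        return "High"
    if per_day >= 1 / 30:     # at least once per 30-day month (assumption)
        return "Medium"
    return "Low"
```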

Lead Time for Changes: How long does a commit take to reach production?

Elite: Less than one hour
High: Between one day and one week
Medium: Between one week and one month
Low: More than one month

Change Failure Rate: What percentage of deployments cause a failure in production?

Elite: 0-15%
High: 16-30%
Medium: 31-45%
Low: 46-100%
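
A minimal sketch of computing and classifying this rate from deployment counts follows. The function name is hypothetical, and treating each band's upper bound as inclusive (so 15.5% counts as High) is an assumption, since the bands above are stated in whole percentages.

```python
def classify_change_failure_rate(failed: int, total: int) -> tuple[float, str]:
    """Return (rate in percent, DORA level) for a set of deployments."""
    if total <= 0:
        raise ValueError("total deployments must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    rate = 100 * failed / total
    if rate <= 15:
        level = "Elite"
    elif rate <= 30:
        level = "High"
    elif rate <= 45:
        level = "Medium"
    else:
        level = "Low"
    return rate, level
```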

Failed Deployment Recovery Time (FDRT): Time to restore service after a failed deployment?

Elite: Less than one hour
High: Less than one day
Medium: Between one day and one week
Low: More than one week

Formerly "Mean Time to Recovery (MTTR)." Renamed for precision — FDRT measures recovery from failed deployments specifically, not all incidents.

Reliability: Does the software meet or exceed its reliability targets?

Elite: Meets or exceeds targets
High: Slightly below targets
Medium: Moderately below targets
Low: Significantly below targets

Added in DORA 2024. Measures operational reliability via SLOs/SLIs. Connects to SRE metrics in Part 3.

Part 2: APEX Metrics (LinearB)

"Faster coding doesn't mean faster delivery."

Assess the four APEX pillars to detect AI-era delivery problems:

A — AI Leverage

What % of PRs/code changes are AI-generated or AI-assisted?
What is the AI suggestion acceptance rate? (Benchmark: 32.7% for AI vs 84.4% for human — LinearB 2026)
What is the AI rework rate? (% of AI code rewritten within 21 days)
Is AI code quality comparable to human code? (Check corrections.md origin field)

P — Predictability

Planning accuracy: % of planned work completed per cycle?
Rework rate: % of ALL code rewritten within 21 days?
Are delivery estimates getting more or less reliable with AI?

E — Flow Efficiency (The Shifting Bottleneck)

End-to-end cycle time: is it actually decreasing?
Review wait time: are PRs waiting longer before first review?
AI review wait ratio: do AI PRs wait longer than human PRs? (Benchmark: 4.6x — LinearB 2026)
KEY CHECK: Is coding faster but review/testing/deployment slower? If yes, the bottleneck has shifted. AI is generating code the pipeline can't absorb.
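
The key check can be sketched as a comparison of per-stage durations between a baseline period and the current one. The stage names, the hours unit, and the function name are all hypothetical; real data would come from whatever the team's tooling exports.

```python
# Stages after coding, in pipeline order (assumed names).
DOWNSTREAM_STAGES = ("review", "testing", "deployment")


def shifted_bottlenecks(baseline: dict[str, float], current: dict[str, float]) -> list[str]:
    """List downstream stages that slowed down while coding sped up.

    Both arguments map stage name to average duration in hours.
    An empty list means no shift was detected.
    """
    if current["coding"] >= baseline["coding"]:
        return []  # coding is not faster, so this check does not apply
    return [s for s in DOWNSTREAM_STAGES if current[s] > baseline[s]]
```

A non-empty result answers the "where is the new bottleneck?" question with candidate stages; it does not by itself show that AI-generated code is the cause.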

X — Developer Experience

Developer satisfaction with AI tools (survey or conversation)
Cognitive load: is AI helping or adding complexity?
Burnout signals: unsustainable pace? Context-switching? Alert fatigue?
Maps to BVSSH "Happier" dimension

Output

## DORA + APEX Assessment

### DORA Metrics
| Metric | Current | Level | Target | Gap |
|--------|---------|-------|--------|-----|
| Deploy freq | ... | ... | ... | ... |
| Lead time | ... | ... | ... | ... |
| Change fail rate | ... | ... | ... | ... |
| FDRT | ... | ... | ... | ... |
| Reliability | ... | ... | ... | ... |

### APEX Metrics (AI-Era)
| Pillar | Status | Key Signal |
|--------|--------|-----------|
| AI Leverage | ... | AI acceptance rate: ...% |
| Predictability | ... | Planning accuracy: ...%, Rework rate: ...% |
| Flow Efficiency | ... | Cycle time: ..., Review wait: ... |
| Developer Experience | ... | Satisfaction: ..., Burnout: ... |

### Shifting Bottleneck Check
[Is coding faster but review/deployment slower? Yes/No]
[If yes: where is the new bottleneck?]

### DORA Bottleneck
[The metric most constraining overall performance]

### Value Stream Diagnosis (if bottleneck detected)
If DORA shows a bottleneck, map the value stream to identify WHERE in the flow the constraint lives:
- Run `/mycelium:canvas-update` to update `.claude/canvas/value-stream.yml` with current stage timings
- Apply Theory of Constraints Five Focusing Steps (Goldratt): Identify -> Exploit -> Subordinate -> Elevate -> Repeat
- Look for wait times >> process times (a sign of queuing, not capacity, problems)
- Look for high handoff counts (each handoff adds delay and information loss)
- Calculate flow efficiency: process_time / lead_time -- target >25%

### Top 3 Improvements
1. [specific action]
2. [specific action]
3. [specific action]
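
The flow efficiency figure in the value stream diagnosis above can be computed from per-stage timings. This is a sketch under stated assumptions: lead time is taken as the sum of process and wait time across stages, and the `Stage` type and its field names are hypothetical, not the schema of value-stream.yml.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    process_hours: float  # time spent actively working
    wait_hours: float     # time spent queued before or between work


def flow_efficiency(stages: list[Stage]) -> float:
    """Return process_time / lead_time as a fraction between 0 and 1."""
    process = sum(s.process_hours for s in stages)
    lead = sum(s.process_hours + s.wait_hours for s in stages)
    if lead <= 0:
        raise ValueError("lead time must be positive")
    return process / lead


def longest_wait(stages: list[Stage]) -> str:
    """Name the stage with the most waiting: a candidate constraint."""
    return max(stages, key=lambda s: s.wait_hours).name
```

For example, stages of code (4h process, 2h wait), review (1h, 20h), and deploy (1h, 12h) give 6 / 40 = 15%, below the 25% target, with review as the stage to inspect first.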

Part 3: SRE Metrics (Error Budgets)

If SLIs/SLOs are defined in the sre section of .claude/canvas/dora-metrics.yml:

Review each service's SLI values against SLO targets
Calculate error budget remaining: (actual - SLO) / (1 - SLO) * 100% (positive when the service is beating its SLO, negative once the budget is overspent)
Healthy (>50%): Ship feature
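
A minimal sketch of the error budget calculation, taking the remaining budget as the share of allowed unreliability not yet consumed, 1 - (1 - actual) / (1 - SLO). The function names are hypothetical, and both inputs are assumed to be fractions such as 0.999 rather than percentages.

```python
def error_budget_remaining(slo: float, actual: float) -> float:
    """Return the remaining error budget as a percentage.

    100 means no budget used, 0 means exactly at the SLO,
    and a negative value means the budget is overspent.
    """
    if not 0 < slo < 1:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (actual - slo) / (1 - slo) * 100


def is_healthy(slo: float, actual: float) -> bool:
    """Healthy band from the list above: more than 50% of budget left."""
    return error_budget_remaining(slo, actual) > 50
```

With a 99.9% SLO, measured availability of 99.95% leaves roughly half the budget, and 99.8% overspends it by about 100%.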
