Diamond Assess Skill

Name: diamond-assess
Rating: 5 (35 reviews)
Author: haabe

Evaluate current diamond state and recommend next action.

Workflow

Cognitive Forcing (ALWAYS FIRST — before any analysis):

Before presenting any assessment, ask the human for their unprimed judgment:

"Before I run the gates — where do you think this diamond stands right now? What feels solid and what feels shaky?"

Wait for the human's response. Record it. Then proceed with the full assessment below. After presenting the assessment (step 10), compare:

"You said [X]. The gates say [Y]. Where do we differ?"

This prevents the agent's analysis from anchoring the human's judgment. The human's pre-assessment often catches things the gates miss (Hoskins consistently outperformed the agent on product judgment calls).

Source: Buçinca, Malaya & Gajos (Cognitive Forcing Functions, Harvard CHI/CSCW 2021) — forcing initial human judgment before AI output significantly reduces automation bias and over-reliance on incorrect AI recommendations.
1. Identify the diamond: Which diamond (ID, scale, phase) is being assessed?
2. Gather current state:
- Current phase (Discover/Define/Develop/Deliver)
- Evidence collected so far
- Confidence score with breakdown
- Blockers or risks
3. Check theory gates for next transition:
- Reference ${CLAUDE_PLUGIN_ROOT}/engine/theory-gates.md for the current transition
- Check product_type from .claude/diamonds/active.yml -- gates conditioned on product_type include:
  - Security Gate: full OWASP for software/ai_tool; platform-only for content; infra-only for service
  - Delivery Metrics Gate: routes to product-type-appropriate metrics canvas
  - Service Quality Gate: Downe applies to consumption experience for all product types; Nielsen only for digital interfaces
- Evaluate each applicable gate: Pass / Fail / Insufficient Evidence / N/A (if gate doesn't apply to this product_type)
- Read-before-claim (HARD RULE; anti-pattern #7 instance #4, 2026-05-09): Before claiming a required-evidence bucket is missing or partial (e.g., "Wardley Map | Missing", "user_research_synthesis | Insufficient"), the agent MUST use the Read tool on the canvas file that bucket maps to (e.g., landscape.yml, user-needs.yml, opportunities.yml, gist.yml). Spawn-note text, theory-gates.md references, and prior conversation context do NOT count as evidence of the bucket's actual state — only reading the file does. Treating consistency between spawn-note phrasing and an absence-claim as causal evidence is anti-pattern #7 (Consistency-as-Evidence). The graduation case for instance #4 was the agent recommending "build the Wardley map now" when the map was substantially complete — the agent had not opened landscape.yml.
- Document what is missing for failed gates, naming the specific canvas file read and the specific field that is empty/incomplete (not the inference that it should be empty).
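
The gate evaluation above can be sketched as a small data model. This is a minimal illustration, not the plugin's implementation: `GateStatus`, `SECURITY_SCOPE`, `security_gate_scope`, and `missing_evidence_claim` are hypothetical names, and the wording of the claim string is an assumption. Only the four statuses, the product_type scoping, and the read-before-claim rule come from the text.

```python
from enum import Enum


class GateStatus(Enum):
    """The four gate outcomes named in the workflow."""
    PASS = "Pass"
    FAIL = "Fail"
    INSUFFICIENT = "Insufficient Evidence"
    NOT_APPLICABLE = "N/A"


# Security Gate scope per product_type, as listed in the step above.
SECURITY_SCOPE = {
    "software": "full OWASP",
    "ai_tool": "full OWASP",
    "content": "platform-only",
    "service": "infra-only",
}


def security_gate_scope(product_type: str) -> str:
    """Return the Security Gate scope, or raise for an unknown product_type."""
    try:
        return SECURITY_SCOPE[product_type]
    except KeyError:
        raise ValueError(f"unknown product_type: {product_type!r}") from None


def missing_evidence_claim(bucket: str, canvas_file: str, field: str,
                           file_was_read: bool) -> str:
    """Build an absence claim only if the canvas file was actually read.

    The claim names the file and the empty field (hypothetical wording).
    """
    if not file_was_read:
        raise RuntimeError(f"read {canvas_file} before claiming {bucket} is missing")
    return (f"{bucket}: {GateStatus.INSUFFICIENT.value} "
            f"({canvas_file}: field '{field}' is empty)")
```

Raising when `file_was_read` is false makes the hard rule structural: an absence claim cannot be produced from spawn-note text or prior context alone.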
4. Check confidence threshold:
- Reference ${CLAUDE_PLUGIN_ROOT}/engine/confidence-thresholds.yml for the current scale
- Apply project_type_adaptations from the same file to compute the effective threshold
- Compare current confidence to the effective threshold
- Identify what would increase confidence
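
As a rough sketch of the comparison above: the schema of confidence-thresholds.yml is not shown here, so this assumes confidence is a value in [0, 1] and that a project-type adaptation is an additive adjustment to the base threshold. Both are assumptions, and `effective_threshold` and `confidence_gap` are hypothetical helper names.

```python
def effective_threshold(base: float, adaptation_delta: float = 0.0) -> float:
    """Apply an (assumed additive) project-type adjustment, clamped to [0, 1]."""
    return min(1.0, max(0.0, base + adaptation_delta))


def confidence_gap(current: float, threshold: float) -> float:
    """Confidence still missing before the threshold is met (0.0 if met)."""
    return max(0.0, threshold - current)
```

A non-zero gap is the cue to list evidence-gathering activities rather than recommend a transition.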
5. Check for anti-patterns:
- Reference ${CLAUDE_PLUGIN_ROOT}/harness/anti-patterns.md
- Flag any detected failure modes
- For L1/L2 diamonds: also check for system archetypes (Senge) — Fixes That Fail, Shifting the Burden, Limits to Growth, Eroding Goals
- At L3->L4 transitions: also run the Design Completeness Check (quality/CLAUDE.md) to verify all layers of the product design stack have evidence. Source: Mill, building on Garrett.
6. Check canvas health:
- Run the /mycelium:canvas-health checks inline: missing required files, stale confidence, inconsistent evidence types
- Report any critical or warning-level findings
- This catches silent canvas degradation before it affects progression decisions

6b. Check metric snapshot freshness (v0.14; L0/L1/L2/L5 only):

If the current diamond scale is L0, L1, L2, or L5 AND .claude/jit-tooling/active-metrics.yml exists:
- For each source with status: active, find the newest file in .claude/evals/metrics/<source>/.
- If the newest snapshot is >7 days old (or missing entirely), flag as a warning and recommend /mycelium:metrics-pull.
- If .claude/jit-tooling/active-metrics.yml is missing, recommend /mycelium:metrics-detect (softer — info-level, not a gate).
Rationale: evidence loops for Purpose/Strategy/Opportunity/Market depend on external signal freshness. A stale snapshot silently anchors confidence.
Do NOT block progression on stale snapshots — this is a NUDGE, not a gate.
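
The freshness check above could look roughly like the following. It is a sketch under stated assumptions: "newest" is judged by file modification time (the plugin may use dated filenames instead), and the function names are hypothetical. The 7-day limit, the scale list, and the /mycelium:metrics-pull recommendation come from the text.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_DAYS = 7                             # snapshots older than this are stale
SCALES_WITH_METRICS = {"L0", "L1", "L2", "L5"}  # the check runs only at these scales


def newest_snapshot_age_days(source_dir: Path,
                             now: Optional[float] = None) -> Optional[float]:
    """Age in days of the newest file in source_dir, or None if there is none."""
    if not source_dir.is_dir():
        return None
    mtimes = [p.stat().st_mtime for p in source_dir.iterdir() if p.is_file()]
    if not mtimes:
        return None
    now = time.time() if now is None else now
    return (now - max(mtimes)) / 86400


def freshness_warning(source: str, metrics_root: Path,
                      now: Optional[float] = None) -> Optional[str]:
    """Return a warning for a stale or missing snapshot, else None.

    The result is a nudge only; callers should never block progression on it.
    """
    age = newest_snapshot_age_days(metrics_root / source, now)
    if age is None:
        return f"warning: no snapshot for {source}; consider /mycelium:metrics-pull"
    if age > MAX_AGE_DAYS:
        return (f"warning: {source} snapshot is {age:.0f} days old; "
                "consider /mycelium:metrics-pull")
    return None
```

Returning a message instead of raising keeps the check advisory, which matches the rule that stale snapshots are not a gate.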

7. Check corrections.md:
- Any relevant past mistakes to avoid?

7b. Check trio perspective coverage (Torres Product Trio):

For the current diamond phase, verify all three perspectives (product/design/engineering) have been applied.
Reference ${CLAUDE_PLUGIN_ROOT}/engine/theory-gates.md §Trio Perspective Requirement for the per-scale coverage matrix.
Flag any missing perspectives as a gap: "Design perspective not yet applied at L[X]. Consider running /mycelium:usability-check or /mycelium:service-check."
If perspectives are in conflict, recommend ${CLAUDE_PLUGIN_ROOT}/engine/perspective-resolution.md.
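
A minimal sketch of the coverage check above. It assumes all three perspectives are required at the scale being assessed; the real per-scale matrix lives in theory-gates.md and is not reproduced here. `missing_perspectives` and `gap_flags` are hypothetical names.

```python
from typing import Iterable, List

TRIO = ("product", "design", "engineering")


def missing_perspectives(applied: Iterable[str]) -> List[str]:
    """Perspectives from the trio that have not been applied yet."""
    seen = {p.lower() for p in applied}
    return [p for p in TRIO if p not in seen]


def gap_flags(applied: Iterable[str], scale: str) -> List[str]:
    """One gap message per missing perspective, in the wording used above."""
    return [f"{p.capitalize()} perspective not yet applied at {scale}."
            for p in missing_perspectives(applied)]
```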

8. Coaching check (Rother's Coaching Kata): Surface these five questions in the output to prompt the human's thinking:
1. What is the target condition for this diamond? (What does "done" look like?)
2. What is the actual condition right now? (Summarize from steps 2-7 above)
3. What obstacles are preventing progress? Which one are you addressing now?
4. What is your next step? What do you expect will happen? (Force a prediction before acting)
5. When can we check what we learned from that step? (Commit to a review point)

The coach (human) should answer these, not the agent. The agent surfaces them. Source: Rother (Toyota Kata). The five questions install scientific thinking as a daily habit.
9. Log assessment in .claude/harness/decision-log.md (MANDATORY):
- APPEND a ### Diamond Assessment entry to .claude/harness/decision-log.md
- Include: diamond ID and scale, gates passed/failed, current confidence with rationale, evidence gaps
- This log entry is essential for auditability — every assessment should be documented
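
The logging step above might be implemented along these lines. Only the `### Diamond Assessment` heading and the list of required contents come from the text; the exact entry layout, the date in the heading, and both function names are assumptions.

```python
from datetime import date
from pathlib import Path
from typing import Dict, List, Optional


def format_assessment_entry(diamond_id: str, scale: str, gates: Dict[str, str],
                            confidence: float, rationale: str,
                            evidence_gaps: List[str],
                            on: Optional[date] = None) -> str:
    """Render one log entry (hypothetical layout) as Markdown text."""
    on = on or date.today()
    lines = [
        f"### Diamond Assessment ({on.isoformat()})",
        f"- Diamond: {diamond_id} ({scale})",
        "- Gates: " + "; ".join(f"{name}: {status}" for name, status in gates.items()),
        f"- Confidence: {confidence:.2f} ({rationale})",
        "- Evidence gaps: " + ("; ".join(evidence_gaps) or "none"),
    ]
    return "\n".join(lines) + "\n"


def append_assessment(log_path: Path, entry: str) -> None:
    """Append the entry to the decision log without touching earlier entries."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write("\n" + entry)
```

Opening the file in append mode is what preserves auditability: earlier assessments are never rewritten.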
10. Recommend next action:

If all gates pass and confidence meets threshold: recommend transition to next phase
If gates fail: recommend specific actions to address failures
If confidence is low: recommend evidence-gathering activities
If anti-patterns detected: recommend corrective actions
If regression needed: recommend which phase to return to and why
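
The rules above can be read as a small decision function. This is one possible reading, not the skill's definitive logic: the text does not say how the rules combine, so the sketch assumes every rule that fires contributes a recommendation, and that detected anti-patterns or a needed regression also hold back a transition. `recommend_next_action` and its parameters are hypothetical.

```python
from typing import List, Optional


def recommend_next_action(failed_gates: List[str], confidence: float,
                          threshold: float, anti_patterns: List[str],
                          regress_to: Optional[str] = None) -> List[str]:
    """Return one recommendation per rule that fires, in the order listed above."""
    clean = (not failed_gates and confidence >= threshold
             and not anti_patterns and regress_to is None)
    if clean:
        return ["Transition to the next phase."]
    actions = []
    if failed_gates:
        actions.append("Address failed gates: " + ", ".join(failed_gates))
    if confidence < threshold:
        actions.append(f"Gather evidence: confidence {confidence:.2f} "
                       f"is below threshold {threshold:.2f}")
    if anti_patterns:
        actions.append("Correct anti-patterns: " + ", ".join(anti_patterns))
    if regress_to is not None:
        actions.append(f"Return to the {regress_to} phase")
    return actions
```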

11. Play devil's advocate: Before recommending progression, ask:
- What are we most likely wrong about?
- What evidence have we dismissed?
- Is there a simpler path we're overlooking?
12. Report harness thickness (informational):
- Count: total skills, active guardrails, mandatory reads, hooks, theory gates
- Current: 49 skills, 37 guardrails, 4 mandatory reads, 5 hook layers, 13 gates
- If thickness has increased since the last assessment, note it
- This is observability, not a gate — purely informational
- Source: Trivedy (Anatomy of an Agent Harness, LangChain blog — "scaffolding should decrease as models improve," but harnesses remain valuable as they engineer systems around model intelligence)
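
Comparing thickness against the previous assessment can be sketched as a dictionary diff. The counts are the ones listed above; the key names, the idea of storing the previous counts as a dict, and `thickness_increases` are assumptions made for illustration.

```python
from typing import Dict

# Counts reported in the list above (key names are hypothetical).
CURRENT_THICKNESS = {
    "skills": 49,
    "guardrails": 37,
    "mandatory_reads": 4,
    "hook_layers": 5,
    "theory_gates": 13,
}


def thickness_increases(previous: Dict[str, int],
                        current: Dict[str, int]) -> Dict[str, int]:
    """Return {component: delta} for every count that grew since the last assessment."""
    return {name: count - previous.get(name, 0)
            for name, count in current.items()
            if count > previous.get(name, 0)}
```

An empty result means nothing grew; a non-empty one is reported as an observation only, never as a gate.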

Output Format

ALWAYS output in plain language first, then technical details. Use ${CLAUDE_PLUGIN_ROOT}/engine/status-translations.md for translations.

ALWAYS render the journey map first. Follow `${CLAUDE_PLUGIN_ROOT}/engine/wayfinding.
