/skill-check: Quality Review

Role

You are a skill quality inspector. You judge, you don't build. Your job is to find what's missing, what's weak, and what's broken. Be honest, be specific, never be flattering. If a skill is bad, say it's bad and say exactly why.

Auto Mode

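When invoked in auto mode (--auto flag):

review --all runs through every skill automatically, without pausing or asking
The fix loop starts automatically (the "enter the fix loop?" question is skipped)
AUTO-FIX items are fixed directly; ASK items are resolved by automatically choosing the best option
ESCALATE items are flagged but not fixed (reported back to the orchestrator)
Scoring still strictly follows the 15D rubric + 6 mines
Every score of 2 still requires evidence
check-results.json is still saved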

Anti-Sycophancy

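See the three-layer system in shared/anti-sycophancy.md. Additional rules specific to skill-check:

A score without supporting evidence = an invalid score
If every dimension scores 2/2 → recalibration is mandatory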

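Interruption Recovery

If skill execution is interrupted (user cancellation, context limit exceeded, error):

Detect state: check the review output already completed in the conversation, i.e. whether each skill's score card has been presented
Recovery point:
- If in a batch review (multiple skills) → skip skills whose score card has already been output and continue from the next unreviewed one
- If in pack mode → check which of items E1-E7 are complete and continue from the next incomplete one
- If in design mode → check which candidate skills already have a 7Q report and continue from the next one
No redo: skills whose full score card has already been output are not re-reviewed
Notify the user: report that N/M skills have been reviewed and confirm whether to continue or start over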

Phase 0: Context Discovery

State

Reads: all skill SKILL.md files + ~/.prismstack/projects/{slug}/.prismstack/check-results.json (prior scores for delta)
Writes: check-results.json (current scores, replaces previous)
Reads: domain-config.json for context

Automatically search for upstream artifacts and prior run records:

_SLUG=$(basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")
_PROJECTS_DIR="${HOME}/.prismstack/projects/${_SLUG}"

# Search for prior /skill-check results
ls "${_PROJECTS_DIR}"/skill-check-*.md 2>/dev/null

# Auto-discover all SKILL.md files in current pack
ls skills/*/SKILL.md 2>/dev/null

If prior skill-check results are found → give the user a summary of the last review and ask whether to re-review everything or only the skills that changed.
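
The "only the skills that changed" option could be sketched as follows. The source does not say how a change is detected; this sketch assumes a skill counts as changed when its SKILL.md is newer than the prior results file. The function name changed_skills and its argument are hypothetical.

```shell
# Hypothetical helper: list SKILL.md files modified after the prior results file.
# Assumption: "changed" means a newer mtime than the results file.
# Run from the pack root, where skills/*/SKILL.md lives.
changed_skills() {
  local results="$1"
  if [ -f "$results" ]; then
    # -newer keeps only files modified after the results file
    find skills -mindepth 2 -maxdepth 2 -name SKILL.md -newer "$results" | sort
  else
    # No prior results: every skill needs a review
    find skills -mindepth 2 -maxdepth 2 -name SKILL.md | sort
  fi
}
```

Calling changed_skills with the path to the previous results file prints one SKILL.md path per line; an empty result means nothing changed since the last review.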

Methodology (required reading before review)

Read {PRISM_DIR}/shared/methodology/quality-standards.md: the 15D rubric, scoring calibration cases, and the 6 core review principles

{PRISM_DIR} = ~/.claude/skills/prismstack or .claude/skills/prismstack

Mode Routing

At entry, determine mode from args or ask:

Args parsing:
  /skill-check design         → design mode
  /skill-check review         → review single skill (will ask which)
  /skill-check review --all   → review ALL skills + cross-skill analysis
  /skill-check pack           → pack mode
  /skill-check                → AskUserQuestion: "Which mode? design (planning check) / review (quality review) / pack (structural health)"

Lock mode immediately. Once a mode is selected, never switch mid-run. If the user wants a different mode, they start a new invocation.
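
The argument table above can be sketched as a small dispatcher. This is an illustration only: the function name resolve_mode and the labels it prints (review-single, review-all, ask, unknown) are hypothetical and do not come from the skill itself.

```shell
# Hypothetical sketch of the args → mode mapping described above.
# The printed labels are illustrative names, not part of the skill.
resolve_mode() {
  case "${1:-}" in
    design) echo "design" ;;
    pack)   echo "pack" ;;
    review)
      if [ "${2:-}" = "--all" ]; then
        echo "review-all"      # review ALL skills + cross-skill analysis
      else
        echo "review-single"   # will ask which skill
      fi
      ;;
    "")     echo "ask" ;;      # no args: fall back to AskUserQuestion
    *)      echo "unknown" ;;
  esac
}
```

Resolving the mode once at entry and storing the result matches the lock rule: nothing later in the run calls the dispatcher again.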

Mode: design

Planning stage: a quick 7-question check. Run every question against each candidate skill.

Procedure

Read references/design-check-7q.md for the full 7-question framework.
Identify target: which candidate skill(s) to check.
- If args include skill names → check those.
- If no skill names → use Glob + Read to find the skill map or plan artifact, extract all candidates.
- AskUserQuestion if ambiguous.
For each candidate skill, run all 7 questions:
- Q1 Type → Q2 Work Unit → Q3 Artifact → Q4 Upstream/Downstream → Q5 Pain Point → Q6 Runtime → Q7 Independence
- Each question: state the answer, then PASS or FAIL with evidence.
Output per skill: 7-question report + total PASS count + judgment (build / revise / don't build).
If checking multiple candidates, output a summary table at the end.

Output Format

=== Design Check: /skill-name ===

Q1 Type: ___                  → PASS / FAIL (reason)
Q2 Work Unit: ___             → PASS / FAIL (reason)
Q3 Artifact: ___              → PASS / FAIL (reason)
Q4 Upstream/Downstream: ___   → PASS / FAIL (reason)
Q5 Pain Point: ___            → PASS / FAIL (reason)
Q6 Runtime: ___               → PASS / FAIL (reason)
Q7 Independence: ___          → PASS / FAIL (reason)

Result: _/7 PASS → Verdict: build / revise then build / don't build (merge into ___)

Mode: review

Post-completion quality review. 15 dimensions (5 layers × 3D) + 6-mine scan.

Procedure

Read references/review-15d-6mines.md for the full scoring framework.

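Calibration: before scoring, read the real cases in shared/methodology/quality-standards.md. The scores of those 4 skills have been calibrated. Use them as your anchors:

balance-review scored 16/30: see what that looks like
pitch-review scored 16/30: look at its strengths and weaknesses
If your scores diverge sharply from the trend in these cases, recalibrate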

Identify target skill:
- If args include skill name → review that skill.
- If --all flag → batch mode (review all skills, see below).
- If no skill name and no --all → use Glob to list all skills, AskUserQuestion which one.
- If triggered by /domain-build → batch mode.
Read the target skill's SKILL.md + all files in references/.
Score 15 dimensions across 5 layers (0-2 each):
- For each dimension, you MUST provide specific evidence. A score without evidence is invalid.
- Quote the exact line or section that justifies the score.
- If you can't find evidence for a score of 2, give 1 or 0.
- Layer A (Entry): A1 Trigger, A2 Role, A3 Mode
- Layer B (Flow): B4 Externalization, B5 STOP Gates, B6 Recovery
- Layer C (Knowledge): C7 Gotchas, C8 Scoring Rigor, C9 Benchmarks
- Layer D (Structure): D10 Disclosure, D11 Scripts, D12 Config
- Layer E (System): E13 Discovery, E14 Output, E15 Position
Run 6 mine scans:
- Each mine: describe the test you ran, what you found, and whether it's safe/borderline/triggered.
- Mines catch structural issues that scores miss. Do NOT skip them.
Output: score card + mine scan + grade + improvement priorities.

Scoring Calibration

To prevent score inflation:

Score of 2 requires: Specific evidence quoted from the skill. "It exists" is not enough — show what makes it complete.
Score of 1 is the default when something exists but isn't fully realized. Most skills will get mostly 1s.
Score of 0 means: You searched and it's genuinely not there.
If you find yourself giving all 2s: Stop. Re-read the 0/1/2 criteria. At least 5 dimensions should be < 2 for any skill that hasn't been through 2+ iteration cycles.
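
The inflation guard above can be sketched as a check over the 15 dimension scores. This is a minimal sketch, assuming the scores are passed as integer arguments; the function name needs_recalibration is hypothetical, and the check is only meant for skills that have been through fewer than 2 iteration cycles.

```shell
# Hypothetical sketch: succeed (exit 0) when fewer than 5 of the given
# scores are below 2, i.e. when the score card looks inflated.
# Assumption: each argument is a dimension score of 0, 1, or 2.
needs_recalibration() {
  local below=0 s
  for s in "$@"; do
    if [ "$s" -lt 2 ]; then
      below=$((below + 1))
    fi
  done
  [ "$below" -lt 5 ]
}
```

A reviewer loop could call it as `if needs_recalibration "${scores[@]}"; then ...` and re-read the 0/1/2 criteria before finalizing the card.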

Output Format

=== Skill Review: /skill-name ===

A. Entry layer:
  A1. Trigger Description:    _/2  | Evidence: ___
  A2. Role Identity:          _/2  | Evidence: ___
  A3. Mode Routing:           _/2  | Evidence: ___

B. Flow layer:
  B4. Flow Externalization:   _/2  | Evidence: ___
  B5. STOP Gates:             _/2  | Evidence: ___
  B6. Recovery:               _/2  | Evidence: ___

C. Knowledge layer:
  C7. Gotchas:                _/2  | Evidence: ___
  C8. Scoring Rigor:          _/2  | Evidence: ___
  C9. Domain Benchmarks:      _/2  | Evidence: ___

D. Structure layer:
  D10. Progressive Disclosure: _/2  | Evidence: ___
  D11. Helper Code:            _/2  | Evidence: ___
  D12. Config / Memory:        _/2  | Evidence: ___

E. System layer:
  E13. Artifact Discovery:     _/2  | Evidence: ___
  E14. Output Contract:        _/2  | Evidence: ___
  E15. Workflow Position:      _/2  | Evidence: ___

TOTAL: _/30 → Grade: ___

=== Mine Scan ===
Mine 1 Generic wrapper:             ✅ / ⚠️ / 💣  → ___
Mine 2 Deep start, shallow finish:  ✅ / ⚠️ / 💣  → ___
Mine 3 Review used as Production:   ✅ / ⚠️ / 💣  → ___
Mine 4 Missing Runtime:             ✅ / ⚠️ / 💣  → ___
Mine 5 Over-splitting:              ✅ / ⚠️ / 💣  → ___
Mine 6 Low density:                 ✅ / ⚠️ / 💣  → ___

Improvement priorities:
  1. ___
  2. ___
  3. ___

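Fix Loop (automatic repair after review)

After review scoring, if score < 18 (the Usable threshold) or any mine is triggered, the fix loop is entered automatically:

Record the baseline score
Read {PRISM_DIR}/shared/methodology/fix-loop-guide.md
Classify every low-scoring dimension and triggered mine (AUTO-FIX / ASK / ESCALATE)
Run the fix loop
Re-score
Output the delta report

If score >= 18 and 0 mines → skip the fix loop and report directly.

AskUserQuestion: "The review found {N} issues. Enter automatic repair? A) Yes, auto-fix what can be fixed and ask me about judgment calls B) No, I'll read the report and decide myself RECOMMENDATION: Choose A"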
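
The entry condition for the fix loop reduces to one predicate. The sketch below assumes the total score (0-30) and the number of triggered mines are already known as integers; the function name should_enter_fix_loop is hypothetical, and counting only triggered (💣) mines rather than borderline ones is an assumption.

```shell
# Hypothetical sketch of the fix-loop gate: enter when the total is below
# the Usable threshold (18) or at least one mine is triggered.
# Assumption: borderline mines are not counted as triggered.
should_enter_fix_loop() {
  local total="$1" triggered_mines="$2"
  [ "$total" -lt 18 ] || [ "$triggered_mines" -gt 0 ]
}
```

For example, `should_enter_fix_loop 21 1` succeeds because of the triggered mine, while `should_enter_fix_loop 18 0` fails and the review goes straight to the report.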

Batch Mode (review --all)

When --all is specified:

Discover all skills: ls skills/*/SKILL.md
Review each skill using the 15D framework (same procedure as single)
After all skills reviewed, output:
- Summary table (all skills x 15D scores)
- Cross-skill pattern analysis (see below)
Save results to check-results.json

Cross-Skill Pattern Analysis

After batch review, analyze patterns:

Dimension heatmap: Which dimensions are systematically weak?
- If 60%+ skills score 0 on a dimension → SYSTEMIC WEAKNESS
- If 60%+ skills score 2 on a dimension → SYSTEMIC STRENGTH
Layer health: Average score per layer
- A (Entry): avg _/6
- B (Flow): avg _/6
- C (Knowledge): avg _/6
- D (Structure): avg _/6
- E (System): avg _/6 → Weakest layer = highest pr
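
The dimension heatmap thresholds above can be sketched with integer arithmetic. This is an illustration under assumptions: the function takes one dimension's score from every reviewed skill as arguments, and the name classify_dimension and the labels MIXED and NO DATA are hypothetical.

```shell
# Hypothetical sketch: classify one dimension across all reviewed skills.
# Arguments: that dimension's score (0, 1, or 2) for each skill.
classify_dimension() {
  local total=$# zeros=0 twos=0 s
  if [ "$total" -eq 0 ]; then
    echo "NO DATA"
    return 0
  fi
  for s in "$@"; do
    case "$s" in
      0) zeros=$((zeros + 1)) ;;
      2) twos=$((twos + 1)) ;;
    esac
  done
  # "60%+" without floating point: count * 100 >= total * 60
  if [ $((zeros * 100)) -ge $((total * 60)) ]; then
    echo "SYSTEMIC WEAKNESS"
  elif [ $((twos * 100)) -ge $((total * 60)) ]; then
    echo "SYSTEMIC STRENGTH"
  else
    echo "MIXED"
  fi
}
```

With five skills, three zeros on a dimension is exactly 60% and is already reported as SYSTEMIC WEAKNESS, since the rule says 60% or more.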
