Lead-Bug-Hunt — Autonomous Bug Elimination
Drives a codebase toward "no bugs at or above the severity floor" without operator involvement between startup and termination. The operator states scope, severity floor, constraints, and finisher preference at startup; the skill then loops /bug-hunt → triage → /implement-batch until two consecutive hunt passes produce no findings at or above the floor, or until a hard cap or andon cord halts the run.
This skill is a narrower sibling of /lead-project. /lead-project takes open-ended commander's intent and decides which skills to invoke from a broad repertoire. /lead-bug-hunt has a fixed loop shape (hunt → fix → re-hunt) and a bounded sub-skill repertoire. It exists because "iterate until bugs converge" is a recurring workflow worth canonizing.
Philosophy
This skill implements the autonomy discipline documented in references/autonomy.md. The shared discipline governs the five levers (altitude rule, pre-loaded options, pre-rebutted recommendation, commander's intent, risk budgets), the cascade rule, the no-unilateral-breaking-changes guardrail, and the shared handoff template. Skill-specific shape (the hunt-cycle structure, convergence criteria, severity-floor triage) is layered on top.
The loop converges on a floor, not on zero
Bugs below a stated severity floor do not block termination. They are recorded as deferred items in the completion report. The operator chooses the floor at startup; it defaults to High (fix Critical and High; defer Medium and below).
This is the dominant judgment call this skill exposes. Setting the floor too low produces a loop that never converges (/bug-hunt always finds something at Low). Setting it too high produces a run that ships with real bugs unaddressed. The floor is elicited explicitly so the operator owns this trade-off.
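The floor comparison behind this trade-off can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the Severity enum, its four-level ordering, and partition_by_floor are hypothetical names, and findings are modeled as bare (ticket ID, severity) pairs.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Higher value = more severe. The four levels are an assumption drawn
    # from the names used in the text (Critical, High, Medium, Low).
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def partition_by_floor(findings, floor=Severity.HIGH):
    """Split (ticket_id, severity) pairs into (blocking, deferred).

    Findings at or above the floor block termination; findings below it
    are deferred to the completion report.
    """
    blocking = [f for f in findings if f[1] >= floor]
    deferred = [f for f in findings if f[1] < floor]
    return blocking, deferred


# Hypothetical findings for illustration only.
findings = [
    ("T-1", Severity.CRITICAL),
    ("T-2", Severity.MEDIUM),
    ("T-3", Severity.HIGH),
]
blocking, deferred = partition_by_floor(findings)
```

With the default floor of High, T-1 and T-3 block termination and T-2 is deferred. Lowering the floor to Severity.LOW would make every finding blocking, which is the non-converging case described above.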
Reproducing tests are durable artifacts
/bug-hunt commits reproducing tests as it works. Those tests outlive the run — they become permanent regression tests in the suite. Their quality matters more than typical test code because they will catch regressions for years and bias future test authors who read them as examples.
This skill therefore treats test review as part of its core contract, not as adjacent cleanup. At termination (after convergence, after any /refactor finisher), the skill invokes /review-test scoped to the test files modified during this run, auto-approves any quality-finding ticket proposals at or above the severity floor, and fixes them via /implement-batch. Findings below the floor are deferred to the completion report.
Trust the reproducing test, escalate the disagreement
/bug-hunt produces findings backed by reproducing tests. This skill trusts any finding whose reproducing test demonstrates the bug against current HEAD: such findings are real bugs by /bug-hunt's evidence contract. The skill does not silently dismiss findings. If the skill genuinely believes a finding is wrong (the reproducing test asserts wrong behavior, the "bug" is intentional), that is an andon trigger ("contested finding"), not a unilateral disregard.
No escape hatches: the skill cannot rationalize bugs away to make the loop converge.
Auto-approval is delegated to the autonomy discipline
/bug-hunt and /review-test are advisory; their ticket proposals are auto-approved under /lead-bug-hunt per the orchestrator-family contract documented in references/autonomy.md § "Auto-approval of sub-skill ticket proposals". The commander's-intent severity floor (field 2) is applied at the triage step (1b), not at the approval moment. The completion report lists every ticket created.
Broad authority, narrow gates
The skill may: invoke /bug-hunt, /implement-batch, /implement, /bug-fix, /think-diagnose, /refactor, /review-test (at termination, scoped to the run's new tests); create tickets via auto-approved proposals; commit fix work via sub-skills; create and modify the working branch.
The skill may NOT without explicit authorization: push or merge to main/master, force-push, propose breaking changes (see references/autonomy.md § "No unilateral breaking changes"), invoke /review-* skills other than the scoped /review-test at termination (out-of-axis — use /lead-project if you want broader review-driven work), install dependencies, run irreversible destructive operations.
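One way to picture the gate on sub-skill invocation is an allowlist check. The sketch below is an assumption-laden illustration: ALLOWED_SKILLS mirrors the list above, but may_invoke, the phase strings, and the /review-security skill name are hypothetical, and the "scoped to the run's new tests" condition on /review-test is not modeled.

```python
# Sub-skills named in the "may" list above.
ALLOWED_SKILLS = frozenset({
    "/bug-hunt",
    "/implement-batch",
    "/implement",
    "/bug-fix",
    "/think-diagnose",
    "/refactor",
    "/review-test",
})


def may_invoke(skill, phase, explicitly_authorized=frozenset()):
    """Return True if the skill may be invoked in the given phase.

    `phase` is a hypothetical label ("hunt" or "termination");
    `explicitly_authorized` holds skills the operator has approved
    beyond the default allowlist.
    """
    if skill in explicitly_authorized:
        return True
    if skill == "/review-test":
        # Only the termination-time review is in-axis.
        return phase == "termination"
    return skill in ALLOWED_SKILLS


# "/review-security" is a made-up name standing in for any other /review-* skill.
checks = [
    may_invoke("/bug-hunt", "hunt"),
    may_invoke("/review-test", "hunt"),
    may_invoke("/review-test", "termination"),
    may_invoke("/review-security", "termination"),
]
```

Here checks evaluates to [True, False, True, False]. Passing explicitly_authorized=frozenset({"/review-security"}) would flip the last entry, modeling the "without explicit authorization" escape.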
Workflow Overview
┌──────────────────────────────────────────────────────────────────┐
│ LEAD-BUG-HUNT WORKFLOW │
├──────────────────────────────────────────────────────────────────┤
│ 0. Startup │
│ ├─ 0a. Branch and working-tree check │
│ ├─ 0b. Resume existing run or start fresh │
│ ├─ 0c. Elicit commander's intent (4 fields) │
│ └─ 0d. Seed LEAD_BUG_HUNT_STATE.md │
│ │
│ 1. Hunt cycle (repeat until convergence or andon cord) │
│ ├─ 1a. Hunt — invoke /bug-hunt, gather findings │
│ ├─ 1b. Triage — apply severity floor, screen for contests │
│ ├─ 1c. Decide — form batch, escalate, or terminate │
│ ├─ 1d. Act — invoke /implement-batch (or /implement, etc.) │
│ ├─ 1e. Verify — tests pass, reproducing tests now pass │
│ └─ 1f. Convergence check │
│ │
│ 2. Termination │
│ ├─ 2a. Optional /refactor finisher (if opted in) │
│ ├─ 2b. /review-test on new reproducing tests │
│ │ (auto-approved, fix above-floor findings) │
│ ├─ 2c. Final verification pass │
│ └─ 2d. Completion report │
└──────────────────────────────────────────────────────────────────┘
Workflow Details
0. Startup
Follow the shared startup protocol in references/lead-startup.md. Skill-specific values:
- 0a. Branch and working-tree check: branch-name pattern lead-bug-hunt/<date> (e.g., lead-bug-hunt/2026-05-12).
- 0b. Resume existing run or start fresh: state-doc filename LEAD_BUG_HUNT_STATE.md. "Resume as-is" semantic: re-run a hunt before forming the next batch.
- 0c. Elicit commander's intent: four fields per the schema in references/autonomy.md § "Commander's-intent schemas per skill / /lead-bug-hunt". Push-back examples specific to this skill: "Find all the bugs" is not a scope, so ask which modules; "Whatever severity" is not a floor, so push for Critical+High at minimum, or ask why broader makes sense.
- 0d. Seed LEAD_BUG_HUNT_STATE.md: include the four pinned intent fields, an empty cycle log, and an empty findings ledger. Gitignore the state doc per the protocol.
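The startup values above can be sketched in a few lines. Everything about the state doc's layout here is an assumption: the section headings, the field keys in INTENT_FIELDS, and the helper names branch_name and seed_state_doc are hypothetical, and the real schema lives in references/autonomy.md.

```python
from datetime import date

# Hypothetical keys for the four intent fields, in the order the operator
# states them: scope, severity floor, constraints, finisher preference.
INTENT_FIELDS = ("scope", "severity_floor", "constraints", "finisher")


def branch_name(today: date) -> str:
    """Build the working-branch name from the lead-bug-hunt/<date> pattern."""
    return f"lead-bug-hunt/{today.isoformat()}"


def seed_state_doc(intent: dict) -> str:
    """Render an initial state doc: pinned intent, empty log, empty ledger."""
    missing = [key for key in INTENT_FIELDS if key not in intent]
    if missing:
        raise ValueError(f"missing intent fields: {missing}")
    lines = ["# Lead-Bug-Hunt State", "", "## Commander's intent (pinned)"]
    lines += [f"- {key}: {intent[key]}" for key in INTENT_FIELDS]
    lines += ["", "## Cycle log", "(empty)", "", "## Findings ledger", "(empty)", ""]
    return "\n".join(lines)


# Hypothetical intent values for illustration only.
doc = seed_state_doc({
    "scope": "src/parser, excluding vendored code",
    "severity_floor": "High",
    "constraints": "no dependency changes",
    "finisher": "none",
})
name = branch_name(date(2026, 5, 12))
```

For 12 May 2026, branch_name yields lead-bug-hunt/2026-05-12, matching the example in 0a. Refusing to seed when a field is missing mirrors the push-back rule in 0c: the run should not start on a partial intent.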
1. Hunt Cycle
Repeat until convergence (1f), andon cord, or hard cap (10 hunt-cycles).
Each cycle has six phases. Keep phase transitions visible in the state doc.
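The termination logic of the loop (two consecutive clean hunt passes, or the hard cap of 10) can be sketched as below. This is a simplified model under stated assumptions: hunt and fix are hypothetical callables standing in for /bug-hunt plus triage and for /implement-batch, and the andon cord path is omitted.

```python
def run_until_converged(hunt, fix, hard_cap=10, clean_needed=2):
    """Loop hunt -> fix until `clean_needed` consecutive clean passes.

    `hunt(cycle)` returns the findings at or above the floor for that
    cycle; `fix(findings)` addresses them. Returns (outcome, cycles_run).
    """
    clean_streak = 0
    for cycle in range(1, hard_cap + 1):
        blocking = hunt(cycle)
        if blocking:
            clean_streak = 0  # any blocking finding resets the streak
            fix(blocking)
        else:
            clean_streak += 1
            if clean_streak >= clean_needed:
                return "converged", cycle
    return "hard_cap", hard_cap


# Scripted hunt results for illustration: bugs in cycles 1 and 2, then clean.
scripted = {1: ["T-1", "T-2"], 2: ["T-3"]}
fixed = []
result = run_until_converged(lambda c: scripted.get(c, []), fixed.extend)
```

In this scripted run, cycles 3 and 4 are both clean, so the loop converges at cycle 4 after fixing T-1, T-2, and T-3. A hunt that always returns a finding would instead stop with the hard_cap outcome after 10 cycles.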
1a. Hunt
Invoke /bug-hunt with auto-answered prompts:
- Scope — answer from commander's intent field 1.
- Areas of concern — on cycle 1, none specified unless the operator listed them in field 1. On subsequent cycles, include any areas the previous cycle's hunters flagged but ran out of time on.
- Exclusions — answer from commander's intent field 1.
When /bug-hunt reaches its ticket-proposal step, auto-approve. Record in the cycle log: ticket IDs proposed, scope of the proposal, and the fact that auto-approval was applied per commander's intent.
/bug-hunt also commits its reproducing tests. Confirm those commits exist in the cycle log.
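A cycle-log record for the hunt phase might carry the fields listed above. The sketch below is one possible shape, not a prescribed format: HuntLogEntry, its field names, the rendered wording, and the sample ticket IDs and commit hash are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HuntLogEntry:
    cycle: int
    ticket_ids: list[str]          # ticket IDs proposed by /bug-hunt
    proposal_scope: str            # scope of the proposal
    auto_approved: bool = True     # auto-approval per commander's intent
    test_commits: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the entry as a single cycle-log line."""
        tickets = ", ".join(self.ticket_ids) or "none"
        commits = ", ".join(self.test_commits) or "NONE RECORDED"
        approval = "auto-approved per commander's intent" if self.auto_approved else "not approved"
        return (
            f"cycle {self.cycle}: proposed {tickets} "
            f"(scope: {self.proposal_scope}); {approval}; "
            f"reproducing-test commits: {commits}"
        )


# Hypothetical values for illustration only.
entry = HuntLogEntry(
    cycle=1,
    ticket_ids=["T-1", "T-2"],
    proposal_scope="src/parser",
    test_commits=["abc1234"],
)
line = entry.render()
```

Rendering "NONE RECORDED" when test_commits is empty makes a missing reproducing-test commit visible in the log instead of silently absent.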