Systematic Debug Fix
Investigate first. Fix second. Lock down with tests third.
Do not propose or implement a fix until there is evidence for the root cause.
Workflow
1. Frame the Work
Start by deciding whether the request is:
- a single issue that still needs root-cause analysis
- multiple known issues that need tracking and cleanup
- a mixed case where some issues are known and some still need diagnosis
If more than one issue exists, create a tracker before editing code. Use one row per issue, not umbrella buckets.
Track at least:
IDSymptomWhereSeverity or dependencyRoot causeStatusVerificationRegression coverage
Use labels such as BUG-1, BUG-2.
Order work by dependency first, then severity.
2. Investigate Root Cause
For the current issue, complete investigation before fixing:
- Read the error, stack trace, failing assertion, or user report carefully.
- Reproduce the issue consistently. If reproduction is unstable, gather more evidence before changing code.
- Check recent changes, config differences, dependency changes, and environment differences.
- Trace the bad value or bad state backward until the original trigger is found.
- Compare the broken path with a working example or reference implementation when one exists.
- Write one explicit hypothesis:
I think X is failing because Y.
When the system crosses component boundaries, add instrumentation first. Log what enters and exits each layer so the failing boundary is visible before choosing a fix.
If the issue appears deep in the stack, fix at the source, not at the point where the symptom explodes.
Use the bundled references when the bug matches these patterns:
- Read
references/root-cause-tracing.mdwhen the symptom appears deep in the stack and the bad input must be traced backward. - Read
references/defense-in-depth.mdwhen the root cause involves invalid data, unsafe operations, or missing guards across layers. - Read
references/condition-based-waiting.mdwhen a flaky test or async flow currently relies on sleeps, timeouts, or guessed timing.
Fixing Rules
- Do not stack multiple speculative fixes.
- Do not say "probably" and edit anyway.
- Do not hide uncertainty. If something is unclear, keep investigating.
- Do not batch unrelated fixes into one pass.
If one attempted fix fails, return to investigation with the new evidence.
If three fix attempts have failed, or each fix reveals a new symptom in a different area, stop and question the architecture before continuing.
3. Create a Reproduction Before the Fix
Create the smallest failing test or stable reproduction that proves the issue exists.
Prefer:
- an automated test in the project's existing framework
- a targeted repro command or script when no test harness exists
- a manual reproduction only when automation is genuinely impractical
The reproduction should fail before the fix and pass after the fix.
4. Fix One Issue at a Time
Apply the smallest change that addresses the confirmed root cause.
For each issue:
- Make one focused fix.
- Run the narrowest relevant reproduction or failing test.
- Run broader validation for the touched area so the fix does not break neighbors.
- Update the tracker with what changed and how it was verified.
- Move to the next issue only after the current one is verified.
If a fix uncovers a separate problem, add a new tracker item instead of folding it into the current one.
Avoid "while I'm here" refactors unless they are required to complete the root-cause fix safely.
5. Lock Down Regressions
After the fixes are verified, add automated regression coverage for every completed issue by default.
For each issue, add:
- one exact reproduction test for the original failure
- boundary variants around the failure edge
- sibling coverage for the same bug family elsewhere in the codebase
Exception path (use only when automated regression coverage is genuinely impractical):
- document why automation is impractical in this environment right now
- preserve the strongest available reproduction evidence for the fixed behavior
- record the manual or command-based verification that was run instead
- list follow-up needed to make the behavior testable later when applicable
Common bug families:
- input validation
- state management
- boundary conditions
- data flow
- API contracts
- error handling
- concurrency
Prefer real code paths over mock-heavy tests. If the project has both unit and integration layers, place regression tests at the highest level that would have caught the original bug.
Use descriptive test names that explain the behavior being protected.
6. Verify the Lock-Down
For each new test:
- confirm it passes with the fix
- explain why it would fail without the fix, or demonstrate that failure if practical
- run the relevant suite to ensure the new coverage does not conflict with existing tests
Finish with a coverage check:
- every tracker item is resolved or explicitly deferred
- every completed fix has direct automated regression coverage, or a justified exception record with durable evidence
- every important bug family has boundary or sibling coverage where justified
Red Flags
Stop and return to investigation if you catch yourself doing any of these:
- proposing a fix before tracing the data flow
- making multiple changes to "see what works"
- adding sleeps or retries without proving timing is the real cause
- skipping reproduction because the fix "looks obvious"
- claiming success without test or command evidence
- continuing past repeated failed fixes without questioning the design
Output Contract
When using this skill, keep the work visible:
- Show the issue tracker when multiple items exist.
- State the current hypothesis before the first fix for an issue.
- Record what verification ran after each fix.
- Summarize added regression coverage by issue and bug family.
- Call out anything still unverified or deferred.
Completion Standard
Do not declare the work done until:
- the root cause is identified or the remaining uncertainty is explicitly documented
- each fix is verified with evidence
- automated regression coverage exists for the completed issues, or a justified exception includes durable verification evidence and follow-up where applicable
- unresolved risks or follow-ups are listed clearly
Integration with Pulse
Can be invoked during pulse:executing for beads that involve bug fixes or defect resolution. When used within Pulse, feed the fix tracker output and regression test results into the bead's verification evidence and learning_refs for pulse:compounding.