
debug-driven

Development

Activates a structured, hypothesis-driven debugging loop. Use this skill whenever the user reports a bug, unexpected behavior, or runtime error and wants help diagnosing the root cause - especially for bugs that are hard to reproduce or where the fix isn't obvious. Trigger on phrases like: "debug this", "figure out why X is happening", "something is wrong with Y", "I have a bug", "debug-driven", o

1 star
Author: rgnicoara · License: MIT

Debug-Driven

A structured, hypothesis-driven debugging loop. Instead of immediately guessing a fix, this skill drives a disciplined hypothesis -> instrument -> reproduce -> analyze -> fix -> verify -> cleanup cycle.


Core Philosophy

Never guess. Instrument, observe, then fix.

The best debuggers do not immediately patch code. They:

  1. Form multiple hypotheses about what could be wrong
  2. Instrument the code to test each hypothesis with real runtime data
  3. Let the human reproduce the bug (the human, not the AI, stays in the loop)
  4. Analyze the evidence and converge on a root cause
  5. Apply a targeted fix and ask for confirmation
  6. Clean up all instrumentation

This skill enforces that workflow.


Phase 0: Bug Intake

Before doing anything else, gather what you know. Extract from the user's message:

  • Symptom: What is actually happening?
  • Expected: What should happen instead?
  • Reproduction steps: How to trigger it?
  • Stack / environment: Language, framework, relevant files?
  • Error output: Any logs, stack traces, error messages already available?

If reproduction steps are missing or unclear, ask. You cannot proceed without them.

Reproduction scripting: If the bug can be triggered by a deterministic sequence (API call, test case, CLI command, browser automation script), write a reproduction script during intake. This script can be re-run by the agent in later cycles without requiring the user to manually reproduce each time. The user should still manually verify the final fix (Phase 6), but intermediate reproduction cycles (Phase 3) can use the script.
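
A reproduction script can be as small as one function that runs the trigger and reports whether the symptom appeared. The sketch below is in Python and assumes a hypothetical `apply_discount` function standing in for the code under investigation; the function, the prices, and the rates are illustrative only and do not come from any real project.

```python
# Hypothetical stand-in for the code under investigation (an assumption for this sketch).
def apply_discount(price: float, discount_rate: float) -> float:
    return round(price * (1 - discount_rate), 2)


def reproduce(discount_rate: float) -> bool:
    """Run the deterministic trigger and return True if the symptom appears."""
    expected = 90.0                                # expected: 10% off 100.00
    actual = apply_discount(100.0, discount_rate)  # the trigger: one fixed call
    print(f"expected={expected} actual={actual} reproduced={actual != expected}")
    return actual != expected


# A stale rate of 0.0 reproduces the symptom; the correct rate of 0.1 does not.
bug_seen = reproduce(discount_rate=0.0)
fixed_seen = reproduce(discount_rate=0.1)
```

Because the script returns a boolean instead of relying on a human reading output, the agent can re-run it unattended in intermediate cycles and leave only the final verification to the user.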


Hypothesis Labeling Rules

Hypothesis labels are session-global and monotonic:

  • Start the first set at H1
  • Never reuse a label within the same debug session
  • After a failed verification, continue from the highest label used so far (H4, H5, ...) rather than restarting at H1
  • If an older hypothesis remains relevant, keep its original label instead of renumbering it

Keep track of the active hypotheses for the current cycle. An active hypothesis is any hypothesis that is still INCONCLUSIVE or newly introduced and not yet ruled out.
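
One way to picture the labeling rules is a small registry that hands out labels and tracks which ones are still active. This is a minimal Python sketch, not part of the skill itself: the class name and the status strings `NEW` and `RULED_OUT` are assumptions made for illustration (only `INCONCLUSIVE` appears in the text above).

```python
from dataclasses import dataclass, field


@dataclass
class HypothesisRegistry:
    """Hands out session-global, monotonic labels and tracks each label's status."""
    _next: int = 1
    status: dict = field(default_factory=dict)  # label -> status string

    def new_label(self) -> str:
        label = f"H{self._next}"
        self._next += 1            # never reset, so a label is never reused
        self.status[label] = "NEW"
        return label

    def set_status(self, label: str, status: str) -> None:
        if label not in self.status:
            raise KeyError(f"{label} was never formally introduced")
        self.status[label] = status

    def active(self) -> list:
        # Active = newly introduced or still INCONCLUSIVE (not yet ruled out).
        return [l for l, s in self.status.items() if s in ("NEW", "INCONCLUSIVE")]


reg = HypothesisRegistry()
first_cycle = [reg.new_label() for _ in range(3)]   # H1, H2, H3
reg.set_status("H1", "RULED_OUT")
reg.set_status("H2", "INCONCLUSIVE")
reg.set_status("H3", "RULED_OUT")
next_label = reg.new_label()                        # continues at H4, not H1
```

After the first cycle, `reg.active()` contains `H2` (kept under its original label) and the newly introduced `H4`.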

New hypothesis labels may ONLY be introduced in two places:

  1. A Phase 1 cycle (initial or after all hypotheses are ruled out)
  2. The "Path Forward" section of the Failed-Verification Recovery Template

In both cases, the full Phase 1 format (label + Mechanism + Confirm + Rule out) is required before the label exists. You cannot introduce a new Hn during Phase 4 analysis, Phase 5 fix, or inline in instrumentation code. If analysis reveals a new theory, note it in prose ("this suggests the issue may be in X") and then formally open a Phase 1 cycle to define it.


Phase 1: Hypothesis Generation

Code reading before hypothesizing

Before generating hypotheses, read enough code to form grounded theories — not just the file mentioned in the bug report:

  • Start at the symptom: read the code where the bug manifests (the reported file/function)
  • Trace one level out: follow the call chain — who calls this function? What does it call? Read those callers/callees
  • Check data flow: if the bug involves wrong values, trace where those values originate (config, DB query, API response, user input)
  • Look for relevant state: if the component has initialization, lifecycle hooks, or caching, read those paths — bugs often hide in setup code, not in the main logic

Stop when you can articulate at least 3 structurally distinct theories about what could be wrong. You do not need to read the entire codebase — just enough that your hypotheses are grounded in actual code paths, not pure speculation.

Generating hypotheses

Generate at least 3 distinct hypotheses about what could be causing the bug. There is no hard upper limit, but each hypothesis must be structurally distinct — do not pad the list with variations of the same theory.

Format each hypothesis using exactly this structure. Do NOT use numbered lists, paragraphs, headings, or any other layout. Use * bullet prefix and indented sub-fields:

* H1: [Short label — max ~10 words, e.g. "Off-by-one in pagination cursor"]
  - Mechanism: [1-2 sentences: how this fault causes the observed symptom]
  - Confirm: [What specific log values or behavior would prove this fault exists]
  - Rule out: [What specific log values or behavior would prove this fault does NOT exist]

Example:

* H1: Discount rate read from stale cache entry
  - Mechanism: The pricing service caches discount rates for 5 minutes. If the rate
    was updated after the cache was populated, the old rate is applied to new orders.
  - Confirm: Log shows discount_rate=0.0 at cart.js:44 despite DB having rate=0.1
  - Rule out: Log shows discount_rate matches the current DB value
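
If hypotheses are tracked programmatically, the format above can be produced mechanically. The following Python sketch assumes a hypothetical `Hypothesis` dataclass (not part of the skill) that refuses to exist without all four fields and renders the required layout.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    label: str      # e.g. "H1"
    title: str      # short label, roughly ten words at most
    mechanism: str
    confirm: str
    rule_out: str

    def __post_init__(self) -> None:
        # The full format is required before the label exists.
        missing = [name for name in ("title", "mechanism", "confirm", "rule_out")
                   if not getattr(self, name).strip()]
        if missing:
            raise ValueError(f"{self.label} is missing: {', '.join(missing)}")

    def render(self) -> str:
        return "\n".join([
            f"* {self.label}: {self.title}",
            f"  - Mechanism: {self.mechanism}",
            f"  - Confirm: {self.confirm}",
            f"  - Rule out: {self.rule_out}",
        ])


h1 = Hypothesis(
    label="H1",
    title="Discount rate read from stale cache entry",
    mechanism="A rate updated after the cache was populated is not applied to new orders.",
    confirm="Log shows discount_rate=0.0 at cart.js:44 despite DB having rate=0.1",
    rule_out="Log shows discount_rate matches the current DB value",
)
rendered = h1.render()
print(rendered)
```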

Writing precise Confirm / Rule-out criteria: Criteria must name specific expected values or value ranges, not ambiguous states. The agent doing the verdict check will compare log output literally against your criteria — if the criteria say "is false" but the log shows undefined, that is not a match. Write criteria that account for every value the code can actually produce.

  • Good: initialDataFetched is falsy (false, undefined, or null) — covers all cases
  • Good: discount_rate === 0.0 — exact value
  • Bad: initialDataFetched is false — what if it's undefined? The verdict becomes ambiguous
  • Bad: the value is wrong — not specific enough to check mechanically

If you are unsure which exact value to expect, use a range or disjunction (X is one of [a, b, c]) rather than guessing a single value.
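
The difference between a narrow and a complete criterion shows up as soon as the check is made mechanical. The Python sketch below is illustrative only: the `verdict` helper, the JSON log line, and the verdict names `CONFIRMED` and `RULED_OUT` are assumptions, and a JavaScript `undefined` is modeled here as a missing or null field.

```python
import json


def verdict(observed, confirm, rule_out) -> str:
    """Compare one observed value against explicit Confirm / Rule-out predicates."""
    if confirm(observed):
        return "CONFIRMED"
    if rule_out(observed):
        return "RULED_OUT"
    return "INCONCLUSIVE"   # the criteria did not cover this value


# Hypothetical log line in which the flag was never set.
entry = json.loads('{"hypothesis": "H2", "initialDataFetched": null}')
value = entry.get("initialDataFetched")   # None for both null and missing

# Bad: "is false" matches only the literal False, so null falls through.
narrow = verdict(value,
                 confirm=lambda v: v is False,
                 rule_out=lambda v: v is True)

# Good: "is falsy (false, undefined, or null)" names every value the code can produce.
broad = verdict(value,
                confirm=lambda v: v is False or v is None,
                rule_out=lambda v: v is True)

print(narrow, broad)
```

With the narrow criterion the same evidence yields an ambiguous verdict; with the disjunction it resolves cleanly.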

Rules:

  • Order by plausibility (most likely first)
  • Cover structurally distinct failure modes - do not list variations of the same root cause
  • At least one hypothesis should be in the reported area, and at least one should cover an upstream or downstream assumption you are NOT sure about
  • Do not propose a fix yet - this is purely diagnostic
  • Every hypothesis must describe a specific fault — something wrong that causes the symptom. "The recount triggers correctly" or "trace the execution flow" are not hypotheses. If it cannot be phrased as "the bug is caused by [specific fault]", rewrite it or drop it.
  • Do NOT offer to skip instrumentation. No matter how confident you are from reading the code, you must instrument and observe runtime evidence before proposing a fix. Code reading produces hypotheses, not conclusions.
  • Do NOT shortcut the logging method. Use the correct method for the project type (file-append for non-browser, fetch-to-ingest-server for browser). Do not substitute console.log to "move faster" or because you are confident - follow the documented flow.

Present the hypotheses to the user as context for what you are about to instrument, then immediately proceed to Phase 2 in the same response. Do not pause, do not ask for feedback, do not ask "what should we do next?", do not offer alternative courses of action. The user will have a chance to intervene during Phase 3 (reproduction) if any hypothesis is wrong.


Phase 2: Instrumentation

Design and inject logging statements that will generate evidence for or against each hypothesis.

Log output method

All debug instrumentation logs to a file (./debug-output.log by default) rather than to stdout or console. This lets the agent read the log file directly after reproduction.

  • Non-browser apps -> file-append using the language's native API. See log-output-recipes.md for one-liners and full examples per language.
  • Browser apps -> fetch() to the HTTP log ingest server bundled with this skill. See log-output-recipes.md for server launch procedure and fetch pattern.
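
For a non-browser project, file-append logging might look like the Python sketch below. This is not the recipe from log-output-recipes.md, which is not reproduced here; the `debug_log` helper, the JSON-lines layout, and the field names are assumptions chosen for illustration. Only the default path ./debug-output.log comes from this document.

```python
import json
import time
from pathlib import Path

LOG_PATH = Path("./debug-output.log")   # default location named in this document


def debug_log(hypothesis: str, location: str, **values) -> None:
    """Append one JSON line of evidence, tagged with the hypothesis it tests."""
    record = {"ts": time.time(), "hypothesis": hypothesis,
              "location": location, **values}
    # Open in append mode on every call so earlier evidence is never overwritten.
    with LOG_PATH.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, default=repr) + "\n")


# Hypothetical call site: record the values that H1's criteria refer to.
debug_log("H1", "pricing.py:44", discount_rate=0.0, db_rate=0.1)
```

Tagging each line with its hypothesis label lets the later analysis step filter the log per hypothesis instead of reading it top to bottom.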

You MUST use the method above for the project type. Do NOT substitute console.log because it seems eas

How to add

/plugin marketplace add rgnicoara/debug-driven

The exact command may vary depending on the repository. Check the README on GitHub.
