Error Detective - Systematic Debugging and Error Resolution
Overview
Error Detective is a comprehensive debugging skill that applies systematic methodologies to identify, analyze, and resolve errors efficiently. Using the TRACE framework and structured analysis techniques, this skill guides you through debugging from initial error discovery to verified resolution.
Core Capabilities
Stack Trace Analysis
- Parse and interpret stack traces across multiple languages
- Identify root cause vs. symptom errors
- Extract relevant file paths and line numbers
- Understand call chains and error propagation
Error Pattern Recognition
- Categorize errors by type (syntax, runtime, logic, integration)
- Identify common error patterns and anti-patterns
- Recognize framework-specific errors
- Map errors to likely root causes
Root Cause Analysis
- Distinguish between symptoms and underlying issues
- Follow error chains to original source
- Identify environmental vs. code issues
- Detect configuration and dependency problems
Debugging Workflow Management
- Structured investigation process
- Hypothesis generation and testing
- Iterative refinement of understanding
- Documentation of findings and solutions
The TRACE Framework
TRACE is a systematic five-step approach to debugging any error:
T - Trace the Error
Objective: Capture complete error information and context
- Collect the full error message
  - Complete stack trace (not just first few lines)
  - Error type and message
  - Timestamp and occurrence frequency
  - Environment where error occurred
- Identify error location
  - Exact file and line number
  - Function or method where error occurred
  - Code context (surrounding lines)
  - Call stack from entry point to error
- Gather reproduction steps
  - Minimal steps to reproduce
  - Input data or parameters used
  - Expected vs. actual behavior
  - Consistency of reproduction (always, intermittent, rare)
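As a rough illustration of the Trace step, the sketch below gathers the error type, message, timestamp, environment, and location into one dictionary using only the standard library. The names capture_error_report and parse_port are hypothetical and exist only for this example; adapt the fields to whatever your project actually records.

```python
import platform
import traceback
from datetime import datetime, timezone


def capture_error_report(exc: BaseException) -> dict:
    """Collect error type, message, location, and environment in one place."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1] if frames else None  # frame where the exception was raised
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python": platform.python_version(),
        "platform": platform.platform(),
        "file": last.filename if last else None,
        "line": last.lineno if last else None,
        "function": last.name if last else None,
        # Full formatted trace, not just the first few lines.
        "stack": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


def parse_port(value):
    # Hypothetical failing function used only to produce an error.
    return int(value)


try:
    parse_port("eighty")
except ValueError as exc:
    report = capture_error_report(exc)
    print(report["type"], "in", report["function"], "at line", report["line"])
```

Saving the report alongside the reproduction steps keeps the original evidence available after the code has changed.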
R - Read the Error Message
Objective: Extract all information from the error itself
- Parse error components
  - Error type/class (TypeError, ValueError, etc.)
  - Error message content
  - Suggested fixes (if provided)
  - Related errors or warnings
- Understand error semantics
  - What the error type means in this language/framework
  - What conditions trigger this error
  - What the error message is specifically telling you
  - Any error codes or status codes
- Identify error category
  - Syntax error (code won't parse)
  - Runtime error (code crashes during execution)
  - Logic error (wrong results, no crash)
  - Integration error (external system failure)
  - Performance error (timeout, resource exhaustion)
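The categories above can be approximated from an exception's class. The following is a heuristic sketch only, and the mapping is an assumption made for illustration: for example, it treats every OSError as an integration problem, which will not suit every codebase. Logic errors raise nothing, so no exception-based check can detect them.

```python
def categorize(exc: BaseException) -> str:
    """Map a Python exception to one of the categories above (heuristic)."""
    if isinstance(exc, SyntaxError):  # includes IndentationError and TabError
        return "syntax"
    # TimeoutError is a subclass of OSError, so it must be tested first.
    if isinstance(exc, (TimeoutError, MemoryError)):
        return "performance"
    if isinstance(exc, OSError):  # includes ConnectionError, FileNotFoundError
        return "integration"
    return "runtime"


for error in (IndentationError("bad indent"), TimeoutError(), KeyError("id")):
    print(type(error).__name__, "->", categorize(error))
```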
A - Analyze the Context
Objective: Understand the broader context around the error
- Code analysis
  - Review the failing line and surrounding code
  - Check recent changes to this code
  - Examine function/method signature and usage
  - Review related code that calls or is called by failing code
- Data analysis
  - Inspect input values at point of failure
  - Check data types and structures
  - Verify data meets expected format/constraints
  - Identify edge cases or unexpected values
- Environment analysis
  - Check dependencies and versions
  - Review configuration files
  - Verify environment variables
  - Confirm required resources are available (files, network, memory)
- State analysis
  - Application state at time of error
  - Previous operations that led to this state
  - Shared state or global variables involved
  - Database or external system state
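For the environment analysis, a small snapshot of interpreter, package versions, and environment variables often settles "works on my machine" questions quickly. The sketch below assumes you already know which packages and variables matter; environment_snapshot and the names passed to it are hypothetical.

```python
import os
import platform
import sys
from importlib import metadata


def environment_snapshot(packages, env_vars):
    """Record interpreter, installed package versions, and env var presence."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "executable": sys.executable,  # reveals which virtual environment is active
        "packages": versions,
        # Report presence only, so secrets never end up in a bug report.
        "env": {
            name: ("set" if name in os.environ else "missing")
            for name in env_vars
        },
    }


snapshot = environment_snapshot(["pip", "requests"], ["DATABASE_URL"])
print(snapshot)
```

Comparing snapshots from a working and a failing machine narrows the search to the entries that differ.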
C - Check for Root Cause
Objective: Identify the underlying issue, not just symptoms
- Follow the error chain
  - Start at the frame where the error was raised: the last frame in a Python traceback, the first frame in a Java or JavaScript stack trace
  - For chained errors, work back through each cause to the original error
  - Distinguish between error origin and error handlers
  - Identify wrapped or re-thrown errors
- Test hypotheses
  - Generate specific, testable hypotheses
  - Isolate variables (change one thing at a time)
  - Use logging/debugging tools to verify assumptions
  - Document which hypotheses are confirmed or rejected
- Common root causes
  - Null/undefined values: Missing initialization or validation
  - Type mismatches: Incorrect data type passed or returned
  - Off-by-one errors: Array/loop boundary issues
  - Race conditions: Timing-dependent failures
  - Resource exhaustion: Memory, disk, connections depleted
  - Configuration errors: Wrong settings or missing config
  - Dependency issues: Version conflicts or missing libraries
  - Permission errors: Insufficient access rights
  - Network errors: Connectivity, timeout, DNS issues
  - Data corruption: Invalid or unexpected data format
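In Python, wrapped or re-raised errors are linked through the __cause__ and __context__ attributes, so following the error chain can be done in code. The sketch below walks from the outermost exception back to the original one; exception_chain and load_config are hypothetical names used only for this example.

```python
def exception_chain(exc):
    """Return [outermost, ..., original] by following cause/context links."""
    chain = []
    seen = set()  # guards against cycles in the links
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        if exc.__cause__ is not None:  # set by "raise ... from ..."
            exc = exc.__cause__
        elif not exc.__suppress_context__:  # set implicitly inside "except"
            exc = exc.__context__
        else:
            exc = None
    return chain


def load_config(raw):
    # Hypothetical function that wraps a low-level error in a higher-level one.
    try:
        return int(raw["port"])
    except KeyError as err:
        raise RuntimeError("config is incomplete") from err


try:
    load_config({})
except RuntimeError as exc:
    chain = exception_chain(exc)
    print(" <- ".join(type(e).__name__ for e in chain))
    # prints "RuntimeError <- KeyError"
```

The last entry in the chain is usually the best starting point for the root cause search, since everything before it is a reaction to it.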
E - Execute the Fix
Objective: Implement and verify the solution
- Design the fix
  - Address root cause, not symptoms
  - Consider side effects and edge cases
  - Plan for backward compatibility if needed
  - Choose most maintainable solution
- Implement carefully
  - Make minimal, targeted changes
  - Add validation and error handling
  - Include logging for future debugging
  - Document the fix and reasoning
- Verify thoroughly
  - Confirm original error is resolved
  - Test with reproduction steps
  - Test edge cases and related functionality
  - Verify no new errors introduced
- Document and prevent
  - Document what caused the error
  - Document the solution and why it works
  - Add tests to prevent regression
  - Update documentation or add warnings if needed
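A regression test turns the reproduction steps into a permanent check. The sketch below assumes a hypothetical average function that used to raise ZeroDivisionError on empty input; the test pins both the fixed case and the behavior that must not change.

```python
import unittest


def average(values):
    """Return the mean of values, or 0.0 for an empty sequence."""
    if not values:  # the fix: empty input previously raised ZeroDivisionError
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_no_longer_crashes(self):
        # Mirrors the original reproduction steps.
        self.assertEqual(average([]), 0.0)

    def test_existing_behavior_unchanged(self):
        # Confirms the fix introduced no new errors for normal input.
        self.assertEqual(average([2, 4]), 3.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageRegressionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Whether an empty input should return 0.0 or raise a clearer error is a design decision; the point is that the chosen behavior is now written down and enforced.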
Debugging Workflow
Initial Assessment (5 minutes)
1. Read complete error message
2. Identify error type and severity
3. Check if error is reproducible
4. Assess impact (blocking, degraded, cosmetic)
5. Decide investigation priority
Deep Investigation (15-30 minutes)
1. Apply TRACE framework systematically
2. Use debugging tools (see scripts/debug_helper.py)
3. Generate and test hypotheses
4. Document findings as you go
5. Narrow down to root cause
Solution Implementation (varies)
1. Design fix addressing root cause
2. Implement with proper error handling
3. Add logging and validation
4. Test thoroughly
5. Document solution
Verification and Prevention (10 minutes)
1. Verify fix with original reproduction steps
2. Test related functionality
3. Add regression tests
4. Update documentation
5. Deploy and monitor
Common Error Patterns by Language
Python
AttributeError: 'NoneType' object has no attribute 'X'
- Root cause: Variable is None when expecting object
- Check: Initialization, function return values, API responses
- Fix: Add null checks, ensure proper initialization
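A minimal sketch of the null-check fix, assuming a hypothetical find_user lookup that returns None when nothing matches:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str


def find_user(users: dict, user_id: int) -> Optional[User]:
    # Hypothetical lookup: dict.get returns None for an unknown id.
    return users.get(user_id)


def display_name(users: dict, user_id: int) -> str:
    user = find_user(users, user_id)
    if user is None:  # without this check, user.name raises AttributeError
        return "unknown user"
    return user.name


users = {1: User("Ada")}
print(display_name(users, 1))
print(display_name(users, 2))
```

Annotating the return type as Optional lets a type checker flag call sites that forget the check.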
KeyError: 'key_name'
- Root cause: Dictionary missing expected key
- Check: Data source, parsing logic, key spelling
- Fix: Use .get() with default, validate data structure
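The two fixes suit different situations: .get() with a default fits keys that are truly optional, while up-front validation fits keys the program cannot run without. In the sketch below, the key names and the validate helper are hypothetical.

```python
REQUIRED_KEYS = {"host", "port"}


def validate(config: dict) -> None:
    """Fail early with a clear message instead of a KeyError deep in the code."""
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")


config = {"host": "localhost", "port": 8080}
validate(config)

# Optional key: fall back to a default rather than raising KeyError.
timeout = config.get("timeout", 30)
print(timeout)  # prints 30
```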
ImportError / ModuleNotFoundError
- Root cause: Module not installed or not in path
- Check: requirements.txt, virtual environment, PYTHONPATH
- Fix: Install missing package, fix import path
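Before reinstalling anything, it helps to confirm which interpreter is running and whether it can see the module at all. The sketch below uses importlib.util.find_spec from the standard library; module_available is a hypothetical helper, and it is only meant for top-level module names.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(name) is not None


# A mismatch here (system Python vs. virtual environment) is a common cause.
print("interpreter:", sys.executable)
print("json available:", module_available("json"))
print("search path entries:", len(sys.path))
```

If the interpreter path is not the one you installed the package into, activate the right environment before changing any imports.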
IndentationError
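- Root cause: Missing or unexpected indentation, or tabs mixed with spaces (reported as TabError, a subclass of IndentationError)
- Check: Editor settings, copied code
- Fix: Standardize on four spaces per level (PEP 8), use a linter or formatter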
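Indentation problems can be detected without running the code, because they surface at compile time. The sketch below, with a hypothetical check_indentation helper, compiles a source string and reports the line the parser complained about.

```python
def check_indentation(source: str):
    """Return a description of the indentation problem, or None if it compiles."""
    try:
        compile(source, "<snippet>", "exec")
    except IndentationError as err:  # also catches TabError
        return f"{type(err).__name__} on line {err.lineno}: {err.msg}"
    return None


clean = "def f():\n    return 1\n"
# Second line is indented with a tab, third line with eight spaces.
mixed = "def f():\n\tx = 1\n        return x\n"

print(check_indentation(clean))
print(check_indentation(mixed))
```

A linter or formatter does the same check across a whole project and is the better long-term fix.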
JavaScript/TypeScript
TypeError: Cannot read property 'X' of undefined (newer V8 versions: Cannot read properties of undefined (reading 'X'))
- Root cause: Accessing property on undefined object
- Check: Object initialization, async timing, API responses
- Fix: Use optional chaining (the ?. operator) or explicit null checks
ReferenceError: X is not defined
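- Root cause: Variable was never declared, is misspelled, or is out of scope
- Check: Variable declaration, spelling, scope, and use of let/const before their declaration (reported as "Cannot access 'X' before initialization")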
- Fix: Declare variable, fix scope, check imports
Promise rejection / Uncaught (in promise)
- Root cause: Async operation failed without catch handler
- Check: API calls, file operations, async/await usage
- Fix: Add .catch(