A3 Problem Analysis
Apply the A3 problem-solving format to document a problem and plan its resolution on a single page.
Description
A structured, one-page analysis format covering Background, Current Condition, Goal, Root Cause Analysis, Countermeasures, Implementation Plan, and Follow-up. Named after the A3 paper size (297 × 420 mm), it emphasizes concise, complete documentation.
Usage
/analyse-problem [problem_description]
Variables
- PROBLEM: Issue to analyze (default: prompt for input)
- OUTPUT_FORMAT: markdown or text (default: markdown)
Steps
- Background: Why this problem matters (context, business impact)
- Current Condition: What's happening now (data, metrics, examples)
- Goal/Target: What success looks like (specific, measurable)
- Root Cause Analysis: Why the problem exists (use 5 Whys or Fishbone)
- Countermeasures: Proposed solutions addressing root causes
- Implementation Plan: Who, what, when, how
- Follow-up: How to verify success and prevent recurrence
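The seven steps above map naturally onto a small record type. The following is a minimal sketch, not part of the command itself: the `A3Report` class, its field names, and the `render_markdown` and `missing_sections` helpers are hypothetical names chosen for illustration, and the heading layout is only an assumption about what a markdown OUTPUT_FORMAT might look like.

```python
from dataclasses import dataclass, fields


@dataclass
class A3Report:
    """Hypothetical container for the seven A3 sections, in order."""
    background: str
    current_condition: str
    goal: str
    root_cause_analysis: str
    countermeasures: str
    implementation_plan: str
    follow_up: str


def render_markdown(title: str, report: A3Report) -> str:
    """Render the report as numbered markdown sections (assumed layout)."""
    lines = [f"# A3: {title}", ""]
    for number, field in enumerate(fields(report), start=1):
        heading = field.name.replace("_", " ").title()
        lines.append(f"## {number}. {heading}")
        lines.append(getattr(report, field.name))
        lines.append("")
    return "\n".join(lines)


def missing_sections(report: A3Report) -> list[str]:
    """Return the names of sections left blank, so gaps are visible before review."""
    return [f.name for f in fields(report) if not getattr(report, f.name).strip()]
```

Keeping the sections as ordered fields means a blank section is easy to detect before the A3 is circulated, which supports the format's emphasis on completeness.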
A3 Template
═══════════════════════════════════════════════════════════════
A3 PROBLEM ANALYSIS
═══════════════════════════════════════════════════════════════
TITLE: [Concise problem statement]
OWNER: [Person responsible]
DATE: [YYYY-MM-DD]
┌─────────────────────────────────────────────────────────────┐
│ 1. BACKGROUND (Why this matters) │
├─────────────────────────────────────────────────────────────┤
│ [Context, impact, urgency, who's affected] │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 2. CURRENT CONDITION (What's happening) │
├─────────────────────────────────────────────────────────────┤
│ [Facts, data, metrics, examples - no opinions] │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 3. GOAL/TARGET (What success looks like) │
├─────────────────────────────────────────────────────────────┤
│ [Specific, measurable, time-bound targets] │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 4. ROOT CAUSE ANALYSIS (Why the problem exists) │
├─────────────────────────────────────────────────────────────┤
│ [5 Whys, Fishbone, data analysis] │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 5. COUNTERMEASURES (Solutions addressing root causes) │
├─────────────────────────────────────────────────────────────┤
│ [Specific actions, not vague intentions] │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 6. IMPLEMENTATION PLAN (Who, What, When) │
├─────────────────────────────────────────────────────────────┤
│ [Timeline, responsibilities, dependencies, milestones] │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 7. FOLLOW-UP (Verification & Prevention) │
├─────────────────────────────────────────────────────────────┤
│ [Success metrics, monitoring plan, review dates] │
└─────────────────────────────────────────────────────────────┘
═══════════════════════════════════════════════════════════════
Examples
Example 1: Database Connection Pool Exhaustion
═══════════════════════════════════════════════════════════════
A3 PROBLEM ANALYSIS
═══════════════════════════════════════════════════════════════
TITLE: API Downtime Due to Connection Pool Exhaustion
OWNER: Backend Team Lead
DATE: 2024-11-14
┌─────────────────────────────────────────────────────────────┐
│ 1. BACKGROUND │
├─────────────────────────────────────────────────────────────┤
│ • API goes down 2-3x per week during peak hours │
│ • Affects 10,000+ users, average 15min downtime │
│ • Revenue impact: ~$5K per incident │
│ • Customer satisfaction score dropped from 4.5 to 3.8 │
│ • Started 3 weeks ago after traffic increased 40% │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 2. CURRENT CONDITION │
├─────────────────────────────────────────────────────────────┤
│ Observations: │
│ • Connection pool size: 10 (unchanged since launch) │
│ • Peak concurrent users: 500 (was 300 three weeks ago) │
│ • Average request time: 200ms (was 150ms) │
│ • Connections leaked: ~2 per hour (never released) │
│ • Error: "Connection pool exhausted" in logs │
│ │
│ Pattern: │
│ • Occurs at 2pm-4pm daily (peak traffic) │
│ • Gradual degradation over 30 minutes │
│ • Recovery requires app restart │
│ • Long-running queries block pool (some 30+ seconds) │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 3. GOAL/TARGET │
├─────────────────────────────────────────────────────────────┤
│ • Zero downtime due to connection exhaustion │
│ • Support 1000 concurrent users (2x current peak) │
│ • All connections released within 5 seconds │
│ • Achieve within 1 week │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 4. ROOT CAUSE ANALYSIS │
├─────────────────────────────────────────────────────────────┤
│ 5 Whys: │
│ Problem: Connection pool exhausted │
│ Why 1: All 10 connections in use, none available │
│ Why 2: Connections not released after requests │
│ Why 3: Error handling doesn't close connections │
│ Why 4: try/catch blocks have no finally (or .finally()) │
│ Why 5: No code review checklist for resource cleanup │
│ │
│ Contributing factors: │
│ • Pool size too small for current load │
│ • No connection timeout configured (hangs forever) │
│ • Slow queries hold connections longer │
│ • No monitoring/alerting on pool metrics │
│ │
│ ROOT CAUSE: Systemic gap in resource cleanup + │
│ insufficient pool sizing │
└─────────────────────────────────────────────────────────────┘
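The cleanup gap identified in Why 3 and Why 4 can be illustrated with a short sketch. It uses Python's `try`/`finally` as the analogue of the `.finally()` pattern this example refers to. The `TinyPool` class and `run_query` function are hypothetical stand-ins, not the team's actual code or any real database driver's API, and the timeout value is arbitrary.

```python
import queue


class TinyPool:
    """Hypothetical fixed-size pool; stands in for a real DB connection pool."""

    def __init__(self, size: int) -> None:
        self._free: "queue.Queue[int]" = queue.Queue()
        for conn_id in range(size):
            self._free.put(conn_id)

    def acquire(self, timeout: float) -> int:
        # Raises queue.Empty if no connection frees up within `timeout` seconds,
        # instead of hanging forever.
        return self._free.get(timeout=timeout)

    def release(self, conn_id: int) -> None:
        self._free.put(conn_id)

    def available(self) -> int:
        return self._free.qsize()


def run_query(pool: TinyPool, fail: bool) -> str:
    """Borrow a connection and always return it, even when the query raises."""
    conn = pool.acquire(timeout=0.1)
    try:
        if fail:
            raise RuntimeError("simulated query error")
        return f"ok on connection {conn}"
    finally:
        # Without this block, every failed request would leak one connection.
        pool.release(conn)
```

With the `finally` block in place, repeated failures leave the pool at full capacity. Removing it would drain a pool of 10 after ten failed requests, which matches the gradual degradation described under Current Condition.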
┌─────────────────────────────────────────────────────────────┐
│ 5. COUNTERMEASURES │
├─────────────────────────────────────────────────────────────┤
│ Immediate (This Week): │
│ 1. Audit all DB code, add .finally() for connection release │
│ 2. Increase pool size: 10 → 30 │
│ 3. Add connection timeout: 10 seconds │
│ 4. Add pool monitoring & alerts (>80% used) │
│ │
│ Short-term (2 Weeks): │
│