Test Review — Comprehensive Test Suite Survey
Five-phase survey: unit coverage gaps, integration coverage gaps, E2E (browser) coverage gaps when applicable, fuzz coverage gaps, then test quality issues. Each phase runs its analysis and contributes findings to a consolidated report; at the end, the skill proposes a ticket structure for the recommended work and creates tickets after operator approval.
Advisory only. The skill produces findings and proposes tickets; it does not implement test changes. The cognitive seam between "find a coverage gap" and "design a test for it" is wide enough that mixing them under one workflow degrades both — test design requires fresh reasoning about edge cases, mocking strategy, and assertion shape, and the discovery agents shouldn't be biased toward gaps whose fixes are easy. Tickets capture findings durably across that seam and compose with /implement and /implement-project for remediation.
The same logic applies in reverse to test-quality findings (DELETE / REWRITE / SIMPLIFY): the operator should approve removing or rewriting an existing test explicitly, via a ticket, rather than have a workflow do it as a side effect of running a review.
Philosophy
Tests are a system, not a checklist. Unit gaps, integration gaps, E2E gaps, fuzz gaps, and bad tests are different facets of the same problem: the test suite isn't doing its job. This workflow surveys all of them in a deliberate order: inside-out by test scope (unit → integration → E2E), then fuzz as an addendum, then a quality audit of every test that exists today.
Workflow Overview
┌──────────────────────────────────────────────────┐
│                   TEST REVIEW                    │
├──────────────────────────────────────────────────┤
│  1. Determine scope                              │
│  2. Phase 1: Unit coverage gaps                  │
│  3. Phase 2: Integration coverage                │
│  4. Phase 3: E2E coverage (webapps only)         │
│  5. Phase 4: Fuzz coverage                       │
│  6. Phase 5: Test quality audit                  │
│  7. Present consolidated findings                │
│  8. Cut tickets (proposed structure, operator-   │
│     approved)                                    │
└──────────────────────────────────────────────────┘
Workflow Details
1. Determine Scope
Ask the user: "What should I review?"
Present these options:
- Entire project: Review all source and test files (default)
- Specific directory: A path like src/, pkg/, lib/
- Specific files: Individual source files
- Recent changes: Files modified on the current branch (via git diff)
Default: Entire project.
If the project is large (many source files), suggest narrowing scope. The user can always re-run on a different scope.
This scope applies to all five phases.
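The "Recent changes" option can be resolved to a file list roughly as follows. This is a minimal sketch, not the skill's actual implementation: the function names are hypothetical, and the base branch `main` and the three-dot `base...HEAD` range are assumptions, since the text only says "via git diff".

```python
import subprocess


def parse_name_only(output: str) -> list[str]:
    """Turn `git diff --name-only` output into a list of paths, skipping blank lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def changed_files(base: str = "main") -> list[str]:
    """List files changed on the current branch relative to `base`.

    `base` is an assumed default; pass the project's real base branch.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails (e.g. not a repository, unknown base)
    )
    return parse_name_only(result.stdout)
```

The resulting list then serves as the scope for every later phase, in the same way as an explicit directory or file list.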
Phase 1: Unit Coverage Gaps
Survey missing unit-level test coverage, prioritized by risk.
1a. Detect/Obtain Coverage Data
Follow this waterfall — stop at the first step that produces a usable report.
Step A: Check for existing coverage artifacts
Search for coverage files in common locations:
| Format | Files to search for |
|---|---|
| Go | coverage.out, cover.out, c.out |
| lcov | lcov.info, coverage/lcov.info |
| Istanbul/nyc | coverage/coverage-summary.json, coverage/coverage-final.json, .nyc_output/ |
| coverage.py | coverage.xml, coverage.json, htmlcov/ |
| JaCoCo | target/site/jacoco/jacoco.xml, build/reports/jacoco/*/jacoco.xml |
| Cobertura | coverage.xml, cobertura.xml |
If a report is found, verify it's reasonably recent (warn if older than the most recent source change). Use the report and proceed.
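Step A can be sketched as a lookup over the table's file names plus a modification-time comparison. This is an illustrative sketch under a few assumptions: the function names are hypothetical, directory-only entries (.nyc_output/, htmlcov/) are omitted for brevity, and because coverage.xml appears under both coverage.py and Cobertura, the first matching row wins.

```python
from pathlib import Path

# Candidate report paths, relative to the project root (glob patterns allowed).
COVERAGE_CANDIDATES = {
    "Go": ["coverage.out", "cover.out", "c.out"],
    "lcov": ["lcov.info", "coverage/lcov.info"],
    "Istanbul/nyc": ["coverage/coverage-summary.json", "coverage/coverage-final.json"],
    "coverage.py": ["coverage.xml", "coverage.json"],
    "JaCoCo": ["target/site/jacoco/jacoco.xml", "build/reports/jacoco/*/jacoco.xml"],
    "Cobertura": ["cobertura.xml"],
}


def find_coverage_report(root: Path):
    """Return (format, path) for the first candidate file that exists, else None."""
    for fmt, patterns in COVERAGE_CANDIDATES.items():
        for pattern in patterns:
            for match in sorted(root.glob(pattern)):
                if match.is_file():
                    return fmt, match
    return None


def is_stale(report: Path, source_files) -> bool:
    """True if any source file was modified after the report was written."""
    report_mtime = report.stat().st_mtime
    return any(Path(src).stat().st_mtime > report_mtime for src in source_files)
```

A stale report would trigger the warning described above rather than a hard failure; the report is still used.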
Step B: Detect coverage command
If no report exists, detect how to generate one:
- Makefile with a cover or coverage target → make cover (or make coverage)
- package.json with a coverage script → npm run coverage
- go.mod present → go test -coverprofile=coverage.out ./...
- pyproject.toml / setup.cfg / pytest.ini with coverage config → pytest --cov --cov-report=json
- Cargo.toml → cargo tarpaulin --out json (or cargo llvm-cov --json)
- build.gradle / build.gradle.kts → gradle jacocoTestReport
Run the command and verify it produces a report. If it fails, ask the user for the correct command.
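The detection rules in Step B amount to an ordered series of marker-file checks. The sketch below is one possible reading, with assumptions called out: `detect_coverage_command` is a hypothetical name, the checks run in the order the rules are listed, and "with coverage config" is approximated by a crude substring test for "cov" in the Python config file.

```python
import json
import re
from pathlib import Path


def detect_coverage_command(root: Path):
    """Return a coverage command guessed from marker files, or None if nothing matches."""
    makefile = root / "Makefile"
    if makefile.is_file():
        text = makefile.read_text()
        for target in ("cover", "coverage"):
            # Match a rule line such as "coverage:" at the start of a line.
            if re.search(rf"^{target}\s*:", text, flags=re.MULTILINE):
                return f"make {target}"

    package_json = root / "package.json"
    if package_json.is_file():
        scripts = json.loads(package_json.read_text()).get("scripts", {})
        if "coverage" in scripts:
            return "npm run coverage"

    if (root / "go.mod").is_file():
        return "go test -coverprofile=coverage.out ./..."

    for name in ("pyproject.toml", "setup.cfg", "pytest.ini"):
        config = root / name
        # Assumed heuristic: any mention of "cov" counts as coverage config.
        if config.is_file() and "cov" in config.read_text().lower():
            return "pytest --cov --cov-report=json"

    if (root / "Cargo.toml").is_file():
        return "cargo tarpaulin --out json"

    if any((root / name).is_file() for name in ("build.gradle", "build.gradle.kts")):
        return "gradle jacocoTestReport"

    return None
```

A `None` result corresponds to falling through to Step C and asking the user.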
Step C: Ask the user
If no coverage tooling is detected: "What command generates a coverage report for this project?"
Step D: Manual analysis fallback
If no coverage tooling is available, proceed with manual analysis. The agent will read source and test files to identify gaps by inspection.
Note: In manual analysis mode, quantitative coverage measurement is unavailable.
Store: the coverage command (if any) and the baseline coverage percentage (not available in manual analysis mode).
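For JSON reports, the baseline percentage can be read directly from the summary totals. The sketch below is a hedged illustration covering only two formats: Istanbul's coverage-summary.json (line coverage under `total.lines.pct`) and coverage.py's JSON report (`totals.percent_covered`). The function name is hypothetical, and other formats would need their own parsers.

```python
import json
from pathlib import Path


def baseline_percent(report: Path):
    """Return the overall line-coverage percentage from a JSON report, or None."""
    data = json.loads(report.read_text())

    # Istanbul/nyc coverage-summary.json: {"total": {"lines": {"pct": ...}}}
    pct = data.get("total", {}).get("lines", {}).get("pct")
    if isinstance(pct, (int, float)):
        return float(pct)

    # coverage.py JSON report: {"totals": {"percent_covered": ...}}
    pct = data.get("totals", {}).get("percent_covered")
    if isinstance(pct, (int, float)):
        return float(pct)

    # Unrecognized shape, or a non-numeric value such as "Unknown".
    return None
```

Returning `None` instead of raising keeps the manual-analysis fallback simple: the stored baseline is just left empty.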
1b. Analyze Coverage Gaps
Assess scope size with Glob.
Small scope (roughly ≤15 source files): Spawn a single qa-test-coverage-reviewer agent with the full scope and coverage data.
Large scope (roughly >15 source files): Partition by directory or module. Spawn multiple qa-test-coverage-reviewer agents in parallel, each with a focused partition and relevant coverage data.
Merge findings into a single list ordered by priority tier (CRITICAL → HIGH → LOW). Collect REFACTOR-FOR-TESTABILITY suggestions separately — these are presented in the consolidated findings, not as ticket candidates by default.
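The partition-and-merge logic can be sketched as below. The finding shape (dicts with `kind` and `tier` keys) and both function names are assumptions made for illustration; the real agent output format is defined elsewhere. The threshold of 15 mirrors the "roughly" stated above and is not a hard rule.

```python
from collections import defaultdict
from pathlib import PurePosixPath

SMALL_SCOPE_MAX = 15  # "roughly <= 15 source files" counts as a small scope
TIER_ORDER = {"CRITICAL": 0, "HIGH": 1, "LOW": 2}


def partition_scope(files):
    """Return one partition for a small scope, else one partition per parent directory."""
    if len(files) <= SMALL_SCOPE_MAX:
        return [sorted(files)]
    by_dir = defaultdict(list)
    for path in files:
        by_dir[str(PurePosixPath(path).parent)].append(path)
    return [sorted(group) for _, group in sorted(by_dir.items())]


def merge_findings(per_agent_findings):
    """Flatten agent results into (gaps sorted by tier, refactor suggestions)."""
    gaps, refactors = [], []
    for findings in per_agent_findings:
        for finding in findings:
            if finding["kind"] == "REFACTOR-FOR-TESTABILITY":
                refactors.append(finding)  # held separately, not a ticket candidate by default
            else:
                gaps.append(finding)
    # list.sort is stable, so findings within a tier keep their per-agent order.
    gaps.sort(key=lambda f: TIER_ORDER[f["tier"]])
    return gaps, refactors
```

Each partition would be handed to its own qa-test-coverage-reviewer agent, and the two returned lists map onto the prioritized findings and the separate informational section.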
Prompt for each agent:
Analyze test coverage gaps.
Scope: [partition or full scope]
Mode: [coverage report / coverage command / manual analysis]
Coverage data: [file path or "manual analysis — no data"]
Identify:
- Untested code paths prioritized by risk (CRITICAL / HIGH / LOW)
- Code that is structurally hard to test (REFACTOR-FOR-TESTABILITY suggestions)
Return structured findings with ADD recommendations and refactoring suggestions.
If no significant gaps found: Record "No significant coverage gaps found" and proceed to Phase 2.
1c. Record Phase 1 Findings
Record findings grouped by priority tier (CRITICAL / HIGH / LOW) for the consolidated report in step 7. Hold the REFACTOR-FOR-TESTABILITY suggestions separately: they appear as an informational section in the final report, and whether they become tickets is decided by the ticket-structure proposal in step 8.
Proceed to Phase 2.
Phase 2: Integration Coverage
Survey integration test coverage: identify gaps in the existing integration tests or, if the project has none, propose a starter strategy.
2a. Analyze Integration Coverage
Spawn a single qa-test-integration-reviewer agent.
Prompt:
Review integration test coverage for this project.
Scope: [full scope from step 1]
Detect:
- Existing integration test infrastructure (frameworks, directories, markers, runners, fixtures, CI)
- Integration seams (databases, queues, external APIs, etc.)
If no integration tests exist (Mode A), recommend a starter strategy with infrastructure
and ~5-8 starter tests. If integration tests exist (Mode B), identify gaps within the
strategy (cap ~10) and missing strategies (cap ~3).
Return findings per the agent's output format, with calibrated confidence.
2b. Record Phase 2 Findings
The agent reports in one of two modes.
Mode A (no integration tests detected): the agent proposes a starter strategy with infrastructure and starter tests. Record the strategy, infrastructure proposal, and starter tests.
Mode B (integration tests detected): the agent reports gaps within the strategy and strategy-expansion opportunities. Record them with their