Test Mutate - Mutation Testing Workflow
Systematically introduces small changes (mutations) to source code, runs the test suite after each, and reports which mutations survive (tests don't catch them). Surviving mutations reveal genuine test coverage gaps that line coverage misses.
Philosophy
Mutation score > line coverage. A test that executes code but doesn't assert on results gives 100% line coverage and 0% mutation score. Mutation testing answers the real question: if a bug were introduced here, would the tests catch it?
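A small sketch of the difference, using invented names (`total`, `weak_test`, `strong_test` are hypothetical and not part of this workflow). Both tests execute every line of `total`, but only the one that asserts on the result notices the mutated operator:

```python
def total(amount, fee):
    """Original code under test."""
    return amount + fee


def total_mutant(amount, fee):
    """The same function with one mutation applied: `+` became `-`."""
    return amount - fee


def weak_test(fn):
    """Executes the code (full line coverage) but asserts nothing about the result."""
    fn(100, 5)
    return True


def strong_test(fn):
    """Checks the returned value, so a changed operator is detected."""
    return fn(100, 5) == 105


# A mutation counts as killed when a test that passes on the original
# fails on the mutant.
weak_kills = weak_test(total) and not weak_test(total_mutant)
strong_kills = strong_test(total) and not strong_test(total_mutant)
```

Here `weak_kills` is false and `strong_kills` is true: the weak test contributes coverage without contributing to the mutation score.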
Multi-session by design. Mutation testing is slow — each mutation requires a full test run. Progress is tracked in .test-mutations.json so you can work through a codebase incrementally across sessions.
Autopilot by default. After initial setup, the workflow runs unattended through all in-scope modules. It addresses all surviving mutations, commits after each module, and moves on. Human intervention is only needed during setup (scope selection, test command verification) and if an unrecoverable error occurs.
Workflow Overview
┌─────────────────────────────────────────────────────┐
│                     TEST MUTATE                     │
├─────────────────────────────────────────────────────┤
│  SETUP (interactive)                                │
│    1. Initialize (load or create tracking file)     │
│    2. Detect test command (first run only)          │
│    3. Determine scope (user selects, default: all)  │
│                                                     │
│  EXECUTION (autopilot, no user interaction)         │
│  For each module in scope:                          │
│    4. Spawn qa-test-mutator agent                   │
│    5. Update tracking file with results             │
│    6. Spawn SME to write tests for ALL survivors    │
│    7. Verify (new tests pass + re-mutate confirms)  │
│    8. Commit changes                                │
│  9. Final summary                                   │
└─────────────────────────────────────────────────────┘
Workflow Details
1. Initialize
Check for .test-mutations.json in the project root.
If the file exists:
- Load tracking data
- Show progress summary: X/Y modules tested, overall mutation score Z%
- List modules by status (completed, in-progress, pending)
- Proceed to step 3 (scope selection)
If the file doesn't exist:
- This is a first run — proceed to step 2 (test command detection)
- After detecting the test command, discover source files (see Module Discovery below)
- Create the tracking file with initial structure
- Ask user: "Should I commit .test-mutations.json to version control, or add it to .gitignore?"
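One possible shape for the initial tracking file, as a sketch only. The names `status`, `mutations_by_type`, `mutation_score`, `global_statistics`, and `last_updated` come from this document; every other key (`test_command`, `modules`, `test_files`, and the keys inside `global_statistics`) is an assumption made for illustration.

```python
import json
from datetime import datetime, timezone


def new_tracking_data(test_command, modules):
    """Build an initial tracking structure.

    `modules` maps each source path to a list of covering test files.
    Keys marked "assumed" are illustrative, not a documented schema.
    """
    return {
        "test_command": test_command,  # assumed key name
        "last_updated": datetime.now(timezone.utc).isoformat(),
        "global_statistics": {  # inner keys are assumed
            "modules_total": len(modules),
            "modules_completed": 0,
            "mutation_score": None,
        },
        "modules": {  # assumed key name
            path: {
                "status": "pending",
                "test_files": test_files,  # assumed key name
                "mutation_score": None,
                "mutations_by_type": {},
            }
            for path, test_files in sorted(modules.items())
        },
    }


data = new_tracking_data("go test ./...", {"src/auth.go": ["src/auth_test.go"]})
text = json.dumps(data, indent=2)  # what would be written to the tracking file
```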
2. Detect Test Command
Try in order:
- Makefile with a test target → make test
- package.json with a test script → npm test
- go.mod present → go test ./...
- pyproject.toml or pytest.ini or setup.cfg with pytest config → pytest
- Cargo.toml → cargo test
- build.gradle or build.gradle.kts → gradle test
If none detected: Ask the user: "What command runs your test suite?"
Verify the command works by running it once. If it fails, report the error and ask the user for the correct command.
Store the test command in the tracking file.
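The detection order above could be sketched as follows. The helper names are hypothetical, and the config checks are deliberately simple: a Makefile line starting with `test:`, a `scripts.test` entry in package.json, and a pytest section marker in pyproject.toml or setup.cfg. Real projects may need more careful parsing.

```python
import json
import re
import tempfile
from pathlib import Path


def _has_make_test_target(path):
    """True if the Makefile has a line starting with `test:` (not `test :=`)."""
    return bool(re.search(r"^test\s*:(?!=)", path.read_text(), re.MULTILINE))


def _has_npm_test_script(path):
    """True if package.json declares a `test` entry under `scripts`."""
    try:
        data = json.loads(path.read_text())
    except json.JSONDecodeError:
        return False
    scripts = data.get("scripts") if isinstance(data, dict) else None
    return isinstance(scripts, dict) and "test" in scripts


def _has_pytest_config(root):
    """True if pytest.ini exists or another config file has a pytest section."""
    if (root / "pytest.ini").is_file():
        return True
    markers = {"pyproject.toml": "[tool.pytest", "setup.cfg": "[tool:pytest]"}
    return any(
        (root / name).is_file() and marker in (root / name).read_text()
        for name, marker in markers.items()
    )


def detect_test_command(root):
    """Return the first matching test command, or None if nothing matches."""
    root = Path(root)
    if (root / "Makefile").is_file() and _has_make_test_target(root / "Makefile"):
        return "make test"
    if (root / "package.json").is_file() and _has_npm_test_script(root / "package.json"):
        return "npm test"
    if (root / "go.mod").is_file():
        return "go test ./..."
    if _has_pytest_config(root):
        return "pytest"
    if (root / "Cargo.toml").is_file():
        return "cargo test"
    if (root / "build.gradle").is_file() or (root / "build.gradle.kts").is_file():
        return "gradle test"
    return None  # caller falls back to asking the user


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    empty_result = detect_test_command(tmp)
    (Path(tmp) / "go.mod").write_text("module example\n")
    go_result = detect_test_command(tmp)
    (Path(tmp) / "package.json").write_text('{"scripts": {"test": "jest"}}')
    npm_result = detect_test_command(tmp)
```

Because package.json is checked before go.mod, the last call returns `npm test` even though go.mod is still present. A `None` result corresponds to the "ask the user" fallback.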
Module Discovery
Use Glob to find source files in the project. Exclude:
- Test files (*_test.go, test_*.py, *.test.js, *.spec.ts, etc.)
- Vendor/dependency directories (vendor/, node_modules/, .venv/, target/)
- Generated files (files with generation markers like // Code generated)
- Configuration files, documentation, assets
For each source file, attempt to identify covering test files using naming conventions:
- auth.go → auth_test.go
- auth.py → test_auth.py or auth_test.py
- Auth.ts → Auth.test.ts or Auth.spec.ts
Store discovered modules in the tracking file with status pending.
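A minimal sketch of the filtering and naming conventions above. The function names are hypothetical, the pattern list extends the "etc." with a few assumed entries (for example `*_test.py` and `*.spec.js`), and the JavaScript mapping is an assumption by analogy with the TypeScript one. Generated-file and config-file detection are left out.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns listed in this document, plus a few assumed extras.
TEST_PATTERNS = (
    "*_test.go", "test_*.py", "*_test.py",
    "*.test.js", "*.spec.js", "*.test.ts", "*.spec.ts",
)
EXCLUDED_DIRS = {"vendor", "node_modules", ".venv", "target"}


def is_source_module(path):
    """True if the path is neither a test file nor inside an excluded directory."""
    p = PurePosixPath(path)
    if EXCLUDED_DIRS.intersection(p.parts[:-1]):
        return False
    return not any(fnmatchcase(p.name, pattern) for pattern in TEST_PATTERNS)


def candidate_test_files(path):
    """Return conventional test file paths next to the given source file."""
    p = PurePosixPath(path)
    stem, suffix = p.stem, p.suffix
    if suffix == ".go":
        names = [f"{stem}_test.go"]
    elif suffix == ".py":
        names = [f"test_{stem}.py", f"{stem}_test.py"]
    elif suffix in (".ts", ".js"):
        names = [f"{stem}.test{suffix}", f"{stem}.spec{suffix}"]
    else:
        names = []  # no convention known for this language
    return [str(p.with_name(name)) for name in names]
```

The candidates are only guesses by name; a caller would still check which of them exist on disk before recording them in the tracking file.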
3. Determine Scope
Present the current state to the user:
## Mutation Testing Progress
Overall: 3/10 modules tested (mutation score: 87%)
### Completed
- src/auth.go — score: 100% (45 mutations)
- src/config.go — score: 92% (24 mutations, 2 survivors)
### In Progress
- src/payment.go — score: 80% (20/50 mutations tested)
### Pending
- src/api/handler.go
- src/models/user.go
- src/utils/parser.go
- ...
Scope? Enter file names/numbers, or press Enter to test all pending modules.
User can:
- Pick specific files by name or number (e.g., "1, 3, 5" or "src/auth.go")
- Resume an in-progress file
- Re-test a completed file (useful after adding tests)
- Press Enter / say "all" to test all pending modules (this is the default)
Default: All pending modules, processed in alphabetical order. If a module is in-progress, it is processed first.
After scope is confirmed, the workflow enters autopilot mode. No further user interaction occurs until the run completes or an unrecoverable error is encountered.
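The selection rules could be resolved roughly like this. It is a sketch under two assumptions that this document does not specify: numbers refer to a 1-based alphabetical listing of all modules, and multiple selections are comma-separated. `resolve_scope` is a hypothetical name.

```python
def resolve_scope(selection, modules):
    """Turn raw user input into an ordered list of module paths.

    `modules` maps path -> status ("pending", "in_progress", "completed").
    """
    listing = sorted(modules)  # assumed numbering: alphabetical, 1-based
    text = selection.strip()

    # Default: in-progress modules first, then all pending, alphabetically.
    if text == "" or text.lower() == "all":
        in_progress = sorted(p for p, s in modules.items() if s == "in_progress")
        pending = sorted(p for p, s in modules.items() if s == "pending")
        return in_progress + pending

    chosen = []
    for token in (part.strip() for part in text.split(",")):
        if not token:
            continue
        if token.isdigit():
            index = int(token)
            if not 1 <= index <= len(listing):
                raise ValueError(f"no module numbered {index}")
            path = listing[index - 1]
        elif token in modules:
            path = token
        else:
            raise ValueError(f"unknown module: {token}")
        if path not in chosen:
            chosen.append(path)
    return chosen


modules = {
    "src/payment.go": "in_progress",
    "src/auth.go": "completed",
    "src/api/handler.go": "pending",
    "src/models/user.go": "pending",
}
default_scope = resolve_scope("", modules)
picked = resolve_scope("1, 3", modules)
```

An explicit selection may name a completed module, which covers the re-test case; the default never includes completed modules.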
Steps 4-8 repeat for each module in scope (autopilot)
4. Spawn Mutator Agent
Spawn a qa-test-mutator agent with the selected source file and test command:
Apply mutation testing to the following source file:
- Source file: [path]
- Test command: [command]
Systematically mutate the source code, run tests after each mutation,
and report which mutations are killed vs survived.
Wait for the agent to complete and collect its results.
5. Update Tracking File
Parse the mutator agent's results and update .test-mutations.json:
- Set module status to completed (or in_progress if the agent reported partial coverage)
- Populate mutations_by_type with results grouped by mutation type
- Store surviving mutation examples in the examples arrays
- Calculate mutation_score for the module
- Update global_statistics (recalculate totals and overall score)
- Set last_updated timestamp
Write the updated tracking file.
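A sketch of the grouping and scoring described above. This document does not spell out the score formula; the sketch assumes the usual definition, killed mutations divided by total mutations, which matches the sample line "24 mutations, 2 survivors" at 92%. The input record shape and the `killed`/`survived` counters are assumptions.

```python
def mutation_score(killed, survived):
    """Percentage of mutations killed, or None when nothing was tested."""
    total = killed + survived
    return None if total == 0 else round(100 * killed / total)


def summarize_module(results):
    """Group mutator results by type and compute the module score.

    `results` is a list of dicts with assumed keys:
    "type" (str), "killed" (bool), "description" (str).
    """
    by_type = {}
    for result in results:
        entry = by_type.setdefault(
            result["type"], {"killed": 0, "survived": 0, "examples": []}
        )
        if result["killed"]:
            entry["killed"] += 1
        else:
            entry["survived"] += 1
            entry["examples"].append(result["description"])  # survivors only

    killed = sum(entry["killed"] for entry in by_type.values())
    survived = sum(entry["survived"] for entry in by_type.values())
    return {
        "mutations_by_type": by_type,
        "mutation_score": mutation_score(killed, survived),
    }


summary = summarize_module([
    {"type": "arithmetic", "killed": True, "description": "Line 10: a * b → a / b"},
    {"type": "arithmetic", "killed": False, "description": "Line 42: amount + fee → amount - fee"},
    {"type": "boundary", "killed": True, "description": "Line 78: price > 0 → price >= 0"},
])
```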
6. Spawn SME to Write Tests
If no survivors (100% mutation score): Log the result and proceed to step 8 (commit).
If survivors exist: Write tests for ALL surviving mutations.
Detect the appropriate SME based on project language:
- Go → swe-sme-golang
- Dockerfile → swe-sme-docker
- Makefile → swe-sme-makefile
- GraphQL → swe-sme-graphql
- Ansible → swe-sme-ansible
- Zig → swe-sme-zig
- TypeScript → swe-sme-typescript
- JavaScript → swe-sme-javascript
- HTML → swe-sme-html
- CSS → swe-sme-css
For languages without a dedicated SME, implement the tests directly as orchestrator.
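The lookup amounts to a table with a fallback. In this sketch the agent names are the ones listed above; the function name and the idea of keying on a lowercased language name are assumptions.

```python
SME_BY_LANGUAGE = {
    "go": "swe-sme-golang",
    "dockerfile": "swe-sme-docker",
    "makefile": "swe-sme-makefile",
    "graphql": "swe-sme-graphql",
    "ansible": "swe-sme-ansible",
    "zig": "swe-sme-zig",
    "typescript": "swe-sme-typescript",
    "javascript": "swe-sme-javascript",
    "html": "swe-sme-html",
    "css": "swe-sme-css",
}


def pick_sme(language):
    """Return the SME agent name, or None when the orchestrator should write the tests."""
    return SME_BY_LANGUAGE.get(language.strip().lower())
```

A `None` result corresponds to the fallback case: no dedicated SME exists, so the tests are implemented directly.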
Prompt the SME with:
Write tests to catch the following surviving mutations in [source file]:
1. Line 42: `amount + fee` → `amount - fee`
The test must verify that the fee is ADDED, not subtracted.
2. Line 78: `price > 0` → `price >= 0`
The test must verify behavior when price is exactly 0.
3. Line 103: `valid && active` → `valid || active`
The test must verify that BOTH conditions are required.
Each test should:
- Target the specific behavior that the mutation would break
- Follow the project's existing test conventions
- Be focused and minimal (one assertion per concern)
- Have a clear name indicating what it verifies
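The numbered survivor list in that prompt could be generated from the tracking data along these lines. The record keys (`line`, `original`, `mutated`, `requirement`) and the function name are hypothetical; the sketch renders only the list, not the closing checklist.

```python
def build_sme_prompt(source_file, survivors):
    """Render the survivor list portion of the SME prompt."""
    lines = [
        f"Write tests to catch the following surviving mutations in {source_file}:"
    ]
    for number, survivor in enumerate(survivors, start=1):
        lines.append(
            f"{number}. Line {survivor['line']}: "
            f"`{survivor['original']}` → `{survivor['mutated']}`"
        )
        lines.append(f"   {survivor['requirement']}")
    return "\n".join(lines)


prompt = build_sme_prompt(
    "src/payment.go",
    [
        {
            "line": 42,
            "original": "amount + fee",
            "mutated": "amount - fee",
            "requirement": "The test must verify that the fee is ADDED, not subtracted.",
        },
        {
            "line": 78,
            "original": "price > 0",
            "mutated": "price >= 0",
            "requirement": "The test must verify behavior when price is exactly 0.",
        },
    ],
)
```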
7. Verify
Run the test suite to confirm new tests pass with the correct (unmutated) code.
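If new tests fail: Give the SME (or yourself) one chance to fix. If still failing, log the failure, revert the failing tests (git restore for modified test files; delete any newly created, untracked test files, which git restore does not touch), and continue with the next module.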
Re-mutate to confirm kills: For each addressed surviving mutation:
- Re-apply the specific mutation (Edit)
- Run the test command
- Verify tests now FAIL (mutation is killed)
- Revert (git restore)
If a re-applied mutation still survives: Log it as an unresolved survivor and continue. Do not retry or prompt the user.
Update the tracking file: mark confirmed kills, update mutation score.
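The re-mutate loop for a single survivor might look like the sketch below. It deliberately differs from the workflow in one respect: it restores the file from an in-memory copy rather than calling git restore, so the example stays self-contained. `confirm_kill` and the tiny `calc.py` module are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def confirm_kill(source_path, original, mutated, test_command, cwd):
    """Re-apply one mutation, run the tests, and always restore the file.

    Returns True when the tests fail on the mutated code (mutation killed).
    """
    path = Path(source_path)
    pristine = path.read_text()
    if pristine.count(original) != 1:
        raise ValueError("expected exactly one occurrence of the original snippet")
    try:
        path.write_text(pristine.replace(original, mutated))
        result = subprocess.run(test_command, cwd=cwd, capture_output=True)
        return result.returncode != 0
    finally:
        path.write_text(pristine)  # stand-in for `git restore`


# Demonstration with a throwaway module and a one-line "test suite".
with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "calc.py"
    module.write_text("def total(amount, fee):\n    return amount + fee\n")
    # -B avoids cached bytecode, so the mutated source is always recompiled.
    command = [sys.executable, "-B", "-c",
               "import calc; assert calc.total(100, 5) == 105"]
    killed = confirm_kill(module, "amount + fee", "amount - fee", command, cwd=tmp)
    restored = module.read_text()
```

The `finally` block matters: the source file has to return to its unmutated state even if the test command errors out, otherwise the next mutation or the commit step would run against corrupted code.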
8. Commit
Automatically commit the changes for this module:
git add [test files] .tes