Ship: Implement
HOST IMPLEMENTS. PEER CROSS-VALIDATES.
EVERY FINDING NEEDS FILE:LINE + EVIDENCE.
Runtime Resolution
See ../.shared/runtime-resolution.md for the host/peer concept and
dispatch commands. In /ship:dev, the host is the primary implementer
and the peer is the independent reviewer. Prefer a non-host provider
for cross-model validation; if unavailable, use a fresh same-provider
session and record the weaker independence in the report.
Two wave shapes plus fix mode, each with its own dispatch pattern:
| Wave shape | Implementer | Reviewer | Fix-round owner |
|---|---|---|---|
| Single-story (most common) | Host (you), on current branch | Peer agent | Host — you apply fixes directly |
| Multi-story parallel | Fresh Agent subagents per story, all on the current branch (dependency analysis guarantees their file scopes don't overlap — no worktrees needed) | Peer per story | Fresh Agent subagent dispatch — whoever implemented a story is who fixes it |
| Fix mode (/ship:auto review_fix/qa_fix/e2e_fix dispatch) | Host — you | (next phase re-runs its own verification) | Host — you apply fixes directly |
The independence contract — reviewer MUST differ from implementer — is strongest when it uses a different provider and a different session. If only same-provider dispatch is available, use a fresh session and make the limitation explicit.
The fix-routing rule — whoever implemented, fixes — keeps context tight. The implementer knows what they built and why; asking someone else to fix their code loses that context.
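The reviewer preference above (non-host provider first, fresh same-provider session as a fallback, with the weaker independence recorded) can be sketched as a small selection function. This is an illustration only: `ReviewerChoice`, `pick_reviewer`, the provider names, and the note wording are hypothetical and not part of the skill or of any dispatch API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ReviewerChoice:
    provider: str       # provider that will run the review
    independence: str   # "cross-provider" or "same-provider"
    report_note: str    # text to carry into the report; empty when nothing to flag


def pick_reviewer(host_provider: str, available: Sequence[str]) -> ReviewerChoice:
    """Prefer a non-host provider; otherwise fall back to a fresh host-provider session."""
    for provider in available:
        if provider != host_provider:
            return ReviewerChoice(provider, "cross-provider", "")
    # Fallback: same provider, fresh session. The limitation must be made explicit.
    return ReviewerChoice(
        host_provider,
        "same-provider",
        "Reviewer used the implementer's provider in a fresh session; independence is weaker.",
    )


# Hypothetical provider names, for illustration only.
choice = pick_reviewer("alpha", ["alpha", "beta"])
fallback = pick_reviewer("alpha", ["alpha"])
```

In both cases the review runs in a fresh session; only the provider and the note that goes into the report differ.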
Roles
| Role | Who |
|---|---|
| Orchestrator + primary implementer | You (host agent) — implement directly in single-story waves and fix mode |
| Parallel implementer | Fresh Agent subagent — only in multi-story parallel waves, all on current branch (dependency analysis prevents file overlap) |
| Reviewer | Peer agent — fresh dispatch per story |
| Multi-story fixer | Fresh Agent subagent — dispatched when a sub-agent-implemented story needs a fix; "whoever implemented, fixes" |
Quality Gates
| Gate | Condition | Fail action |
|---|---|---|
| Spec + plan read | Acceptance criteria extracted, TEST_CMD found | AskUserQuestion |
| Implement → Review | Story produced at least one commit (from subagent report, or HEAD moved since WAVE_BASE_SHA for single-story waves) | BLOCKED |
| Review → Next story | Verdict is PASS or PASS_WITH_CONCERNS | Targeted fix (max 2) |
| All stories → Done | Full test suite passes | Targeted fix for regression |
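Two of the gates in the table can be read as small decision functions. The sketch below is a rough illustration under stated assumptions: the function names and the returned labels (`REVIEW`, `NEXT_STORY`, `TARGETED_FIX`) are hypothetical, and the `ESCALATE` outcome after the second failed fix round is an assumption, since the table only says "max 2".

```python
PASSING_VERDICTS = {"PASS", "PASS_WITH_CONCERNS"}
MAX_FIX_ROUNDS = 2  # from the "Targeted fix (max 2)" fail action


def implement_gate(wave_base_sha: str, head_sha: str, reported_commits: int = 0) -> str:
    """Implement -> Review: the story must have produced at least one commit.

    head_sha would typically come from `git rev-parse HEAD`. Comparing it with
    WAVE_BASE_SHA is a simplification of "HEAD moved" for single-story waves.
    """
    if reported_commits > 0 or head_sha != wave_base_sha:
        return "REVIEW"
    return "BLOCKED"


def review_gate(verdict: str, fix_rounds_used: int) -> str:
    """Review -> Next story: advance only on a passing verdict."""
    if verdict in PASSING_VERDICTS:
        return "NEXT_STORY"
    if fix_rounds_used < MAX_FIX_ROUNDS:
        return "TARGETED_FIX"
    # Assumption: what happens after two failed fix rounds is not stated here.
    return "ESCALATE"
```

For example, `review_gate("FAIL", 0)` asks for a targeted fix, while `implement_gate(sha, sha)` with no reported commits yields `BLOCKED`.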
Red Flags
Never:
- Skip the peer review — every story goes through peer review (or fallback) before the wave merges. This is the only cross-validation in the pipeline until /ship:review runs.
- Parallelize stories that share files without dependency analysis
- Re-implement a full story on FAIL — make targeted surgical fixes
- Advance to next story without getting a reviewer verdict
- Soften a test assertion to make it pass instead of fixing the code
- In multi-story waves: omit prior stories' context from each dispatched implementer prompt
- Reuse a reviewer dispatch across stories — fresh peer call each time
- Let the peer reviewer become your coder — if the reviewer suggests a fix, YOU apply it; don't ask the reviewer to write patches
Progress Tracking
Use TodoWrite to track your own progress through implementation.
Build the todo list after Phase 1 (setup), once you know the actual
wave/story structure. The items should reflect the real work — don't
use a canned template.
Principle: one todo per wave (not per story) to keep the list short.
Use activeForm to show which story within a wave is active.
Always end with a regression test item when there are multiple stories.
Example (3-wave normal run):
    TodoWrite([
      { content: "Wave 1: \"Add User model\", \"Add Product model\"",
        status: "in_progress", activeForm: "Implementing Story 1" },
      { content: "Wave 2: \"User API\", \"Product API\"",
        status: "pending", activeForm: "Implementing Wave 2" },
      { content: "Wave 3: \"Auth middleware\"",
        status: "pending", activeForm: "Implementing Wave 3" },
      { content: "Cross-story regression test",
        status: "pending", activeForm: "Running regression test" }
    ])
Adaptations (not exhaustive — use judgment):
- Single-story task → one item for the story + one for regression, no wave labels
- Fix mode (invoked with findings) → single item: "Fix <review/QA> findings"
- Targeted fix within a wave → update that wave's activeForm: "Fixing Story N (round R/2)"
- All stories in one wave (no parallelism) → list stories individually instead of grouping by wave
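The "one todo per wave" principle can be sketched as a helper that turns a wave plan into todo items. This is a minimal sketch for the normal multi-wave case only: `build_todos` is a hypothetical name, the item fields are copied from the example above rather than from any documented TodoWrite schema, and the adaptations still call for judgment.

```python
from typing import Dict, List, Sequence


def build_todos(waves: Sequence[Sequence[str]]) -> List[Dict[str, str]]:
    """Build one todo item per wave, followed by a closing regression item."""
    todos: List[Dict[str, str]] = []
    for index, stories in enumerate(waves, start=1):
        titles = ", ".join(f'"{title}"' for title in stories)
        todos.append({
            "content": f"Wave {index}: {titles}",
            # Only the first wave starts active; update status as waves complete.
            "status": "in_progress" if index == 1 else "pending",
            "activeForm": f"Implementing Wave {index}",
        })
    todos.append({
        "content": "Cross-story regression test",
        "status": "pending",
        "activeForm": "Running regression test",
    })
    return todos


todos = build_todos([
    ["Add User model", "Add Product model"],
    ["User API", "Product API"],
    ["Auth middleware"],
])
```

While a wave is running, its `activeForm` would be overwritten with the story currently in progress, for instance "Implementing Story 1".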
Phase 1: Setup
- Read acceptance criteria (from spec file, or derived from user request).
- Read implementation stories (from plan file, or single story for small tasks). Accept any heading format: ## Story N, ## Step N, ## N. Title, or numbered/bulleted lists. Normalize as ordered stories.
- Detect the repo's test command by inspecting project root (Makefile, package.json, pyproject.toml, go.mod, Cargo.toml, CI configs, CLAUDE.md/AGENTS.md). If none found, AskUserQuestion. Record as TEST_CMD.
- Extract code conduct from CLAUDE.md, AGENTS.md, lint/formatter configs, and existing code patterns. Record as CODE_CONDUCT.
- Build pattern references. For each story, find the closest analogous implementation before anyone writes code:
  - Search adjacent directories, feature folders, test folders, and shared component/module areas for similar files. Read the full files, not just matching snippets.
  - Record 1-3 references in <task_dir>/dev-context.md with: file path, why it is analogous, patterns to mirror, and intentional deviations.
  - Patterns to capture include import/export shape, file organization, naming, test setup, fixture style, error handling, logging, styling, theme usage, and framework-specific conventions.
  - For frontend/UI work, if DESIGN.md exists at project root, read it and include the relevant design rules. If not, read theme/config files plus representative components before writing styles.
  - If no analogous file exists, record the searches performed and "none found"; this is allowed, but silent skipping is not.

  Pattern references are evidence, not copy-paste licenses. Mirror the local structure and conventions, but do not clone product-specific logic, stale bugs, or unrelated behavior.
- Build story dependency graph. For each story, identify:
  - Files/modules it will create or modify (from plan text)
  - Explicit dependencies (e.g., "uses the model from story 1")
  - Shared resources (e.g., two stories both modify the same config file)

  A story depends on another if it reads/imports what the other creates, or both modify the same file. Build a DAG and topologically sort into waves: groups of stories with no dependencies between them.

  Example: 5 stories

      Story 1: add User model        → no deps
      Story 2: add Product model     → no deps
      Story 3: add API for User      → depends on 1
      Story 4: add API for Product   → depends on 2
      Story 5: add auth middleware   → depends on 3, 4

      Waves:
        Wave 1: [Story 1, Story 2]   ← parallel
        Wave 2: [Story 3, Story 4]   ← parallel
        Wave 3: [Story 5]            ← sequential

  If the plan does not provide enough information to determine file overlap, default to sequential (single story per wave). Do not guess: false parallelism causes merge conflicts.
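The wave grouping described in the dependency-graph step can be sketched as a level-by-level topological sort. This is a minimal sketch, not the skill's actual implementation: `plan_waves` and `with_file_overlap_deps` are hypothetical names, stories are identified by their plan order number, and making the later story depend on the earlier one when two stories touch the same file is an assumption about how overlap is resolved.

```python
from typing import Dict, List, Set


def with_file_overlap_deps(deps: Dict[int, Set[int]],
                           files: Dict[int, Set[str]]) -> Dict[int, Set[int]]:
    """Return a copy of deps where stories sharing a file are serialized.

    Assumption: the later story (higher number) depends on the earlier one.
    """
    merged = {story: set(d) for story, d in deps.items()}
    ordered = sorted(files)
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            if files[earlier] & files[later]:
                merged.setdefault(later, set()).add(earlier)
                merged.setdefault(earlier, set())
    return merged


def plan_waves(deps: Dict[int, Set[int]]) -> List[List[int]]:
    """Group stories into waves; each wave only depends on earlier waves."""
    remaining = {story: set(d) for story, d in deps.items()}
    unknown = {d for ds in remaining.values() for d in ds} - set(remaining)
    if unknown:
        raise ValueError(f"dependencies on unknown stories: {sorted(unknown)}")

    waves: List[List[int]] = []
    done: Set[int] = set()
    while remaining:
        # A story is ready once every story it depends on has been scheduled.
        ready = sorted(s for s, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError(f"dependency cycle among stories: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for story in ready:
            del remaining[story]
    return waves


# The 5-story example from the text.
deps = {1: set(), 2: set(), 3: {1}, 4: {2}, 5: {3, 4}}
print(plan_waves(deps))  # [[1, 2], [3, 4], [5]]
```

When file scopes are unknown, the safe default from the text still applies: skip this grouping and run one story per wave.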
dev-context.md format
Write <task_dir>/dev-context.md during setup and update it if fix mode
adds new pattern evidence:
# Dev Context
## Test Command
<TEST_CMD>
## Code Conduct
<CODE_CONDUCT>
## Pattern References
### Story <i>: <title>
- Reference: