Product Delivery System
"The goal isn't shipping. The goal is learning whether your bet was right."
This skill covers the Delivery System: how we ship, measure, and learn. It runs discovery and delivery in parallel (dual-track), ships through staged rollouts, measures against a clear metrics hierarchy, closes every bet with a retrospective, and treats go-to-market (GTM) execution as part of delivery rather than a handoff.
Part of: Modern Product Operating Model — a collection of composable product skills.
Related skills: product-strategy, product-discovery, product-architecture, ai-native-product, product-leadership
When to Use This Skill
Use this skill when:
- Planning how to roll out a new feature or product
- Designing a metrics hierarchy for a bet or product
- Running bet retrospectives after shipping
- Executing GTM launches
- Setting up dual-track development rhythm
- Deciding when to scale, iterate, or kill a bet
Cadence: Continuous | Owner: Product Trio + GTM Team
The Problem This Solves
Most teams fall into at least one of these traps:
- Ship features and never measure impact
- Measure vanity metrics that don't connect to outcomes
- Do "big bang" launches that create risk
- Never officially conclude bets, so zombie bets live on indefinitely
- Treat GTM as marketing's problem after PM ships
The Delivery System ensures shipping is the beginning of learning, not the end.
Philosophy
Core Beliefs
- Discovery and delivery run in parallel — Don't pause discovery to deliver
- Staged rollouts are the default — Ship to 10% before 100%
- Metrics exist in hierarchy — Leading → Core → Lagging
- Every bet gets a retrospective — Explicit scale/iterate/kill decision
- GTM is a product responsibility — PM owns adoption, not just availability
What This Framework Rejects
- Ship and forget (no measurement)
- Big bang launches (maximum risk)
- Vanity metrics (activity without outcome)
- Zombie bets (never concluded, never killed)
- Throwing features over the wall to marketing
Framework Components
1. Dual-Track Development
The Core Idea: Discovery and delivery happen simultaneously. While one bet is being built, the next bet is being shaped.
Week 1 Week 2 Week 3 Week 4 Week 5 Week 6
─────────────────────────────────────────────────────────
Discovery: [ Discover Bet B ][ Shape Bet B ][ Discover Bet C ]
Delivery:  [ Build Bet A ][ Build Bet B ]
Ships:     [ Ship A ] [ Ship B ]
How It Works:
| Track | Activities | Who |
|---|---|---|
| Discovery Track | Interviews, opportunity solution tree (OST) updates, solution exploration, assumption tests | Full trio (PM heavy) |
| Delivery Track | Building, testing, shipping, measuring | Full trio (Eng heavy) |
Time Allocation (Example):
| Role | Discovery | Delivery |
|---|---|---|
| PM | 60% | 40% |
| Designer | 50% | 50% |
| Tech Lead | 30% | 70% |
Coordination Points:
- Weekly sync: What's in flight on each track
- Handoff moment: When a bet moves from "shaped" to "building"
- Learning moment: When shipped bet results inform discovery
0→1 Mode: Tracks may blur. Everyone does everything. Speed > separation.
Scaling Mode: Clear separation. Dedicated discovery time. Research ops support.
2. Staged Rollout
The Core Idea: Never ship to everyone at once. Start small, learn, expand.
Default Rollout Stages:
| Stage | Audience | Duration | Purpose |
|---|---|---|---|
| Stage 0: Internal | Team dogfooding | 1-3 days | Find obvious bugs |
| Stage 1: Alpha | 5-10 friendly customers | 1 week | Qualitative feedback |
| Stage 2: Beta | 10% of users | 1-2 weeks | Quantitative signal |
| Stage 3: GA | 100% of users | Ongoing | Full measurement |
Progression Criteria:
| Transition | Readiness bar | Criteria |
|---|---|---|
| Internal → Alpha | Ready for external | No P0 bugs, core flow works |
| Alpha → Beta | Validated experience | Positive qualitative feedback, no major usability issues |
| Beta → GA | Metrics acceptable | Leading metrics trending in the right direction, no guardrail breaches |
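The progression criteria above can be read as a small gate check: a bet only advances when every criterion for its current stage is met. The following is a minimal sketch under that reading, not a prescribed implementation. The `Stage` enum, the `GATES` mapping, and the check names (such as `no_p0_bugs`) are hypothetical labels chosen to mirror the table.

```python
from enum import Enum
from typing import Mapping


class Stage(Enum):
    INTERNAL = 0
    ALPHA = 1
    BETA = 2
    GA = 3


# Hypothetical check names, one entry per row of the progression table.
GATES = {
    Stage.INTERNAL: ("no_p0_bugs", "core_flow_works"),
    Stage.ALPHA: ("positive_qualitative_feedback", "no_major_usability_issues"),
    Stage.BETA: ("leading_metrics_trending_right", "no_guardrail_breaches"),
}


def next_stage(current: Stage, checks: Mapping[str, bool]) -> Stage:
    """Advance one stage only if every criterion for the current stage passes."""
    if current is Stage.GA:
        return current  # GA is the final stage
    required = GATES[current]
    # A missing check counts as not met, so a gate is never passed by omission.
    if all(checks.get(name, False) for name in required):
        return Stage(current.value + 1)
    return current
```

For example, `next_stage(Stage.INTERNAL, {"no_p0_bugs": True})` stays at `Stage.INTERNAL` because the core-flow check has not been recorded.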
Feature Flags:
- Every significant feature ships behind a flag
- Flags enable instant rollback
- Flags enable % rollout control
- Flags are cleaned up after GA (don't accumulate debt)
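One common way to get percentage rollout and instant rollback from a flag is deterministic hash bucketing. The sketch below assumes that approach; it is not tied to any particular feature-flag product. The function names, the `kill_switch` parameter, and the flag name used in the usage note are all hypothetical.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int,
               kill_switch: bool = False) -> bool:
    """Return True if the user falls inside the current rollout percentage."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    if kill_switch:
        return False  # instant rollback: the flag is off for everyone
    return rollout_bucket(flag_name, user_id) < rollout_percent
```

Because the bucket depends only on the flag name and user ID, a user who sees the feature at 10% keeps seeing it when the rollout widens to 50% or 100%. Calling `is_enabled("new-onboarding", "user-42", 10)` returns the same answer on every request.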
Rollback Triggers:
- Guardrail metric breached
- Error rate exceeds the agreed threshold
- Customer-reported critical issue
- Leading metrics trending in the wrong direction
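The rollback triggers above can be expressed as a single health check that returns every reason to roll back, so the decision is auditable. This is a sketch only. The `RolloutHealth` fields and the default `max_error_rate` of 1% are placeholder assumptions; each team sets its own thresholds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RolloutHealth:
    guardrail_breached: bool
    error_rate: float               # fraction of requests failing, e.g. 0.004
    critical_issue_reported: bool   # customer-reported critical issue
    leading_metrics_on_track: bool


def rollback_reasons(health: RolloutHealth, max_error_rate: float = 0.01) -> List[str]:
    """Return every triggered rollback reason; an empty list means keep going."""
    reasons = []
    if health.guardrail_breached:
        reasons.append("guardrail metric breached")
    if health.error_rate > max_error_rate:
        reasons.append(
            f"error rate {health.error_rate:.2%} above threshold {max_error_rate:.2%}"
        )
    if health.critical_issue_reported:
        reasons.append("customer-reported critical issue")
    if not health.leading_metrics_on_track:
        reasons.append("leading metrics trending in the wrong direction")
    return reasons


def should_roll_back(health: RolloutHealth, max_error_rate: float = 0.01) -> bool:
    """Any single trigger is enough to roll back."""
    return bool(rollback_reasons(health, max_error_rate))
```

Returning the reasons rather than a bare boolean makes it easy to attach them to the alert or the later bet retrospective.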
0→1 Mode: Stages can be compressed. Alpha might be 3 customers for 2 days.
Scaling Mode: Formal stage gates. Release management. Beta programs.
3. Metrics Hierarchy
The Three-Tier Model:
┌─────────────────────────────────────────────────────┐
│                   LAGGING METRICS                   │
│              (Revenue, Retention, NPS)              │
│           Move slowly, hard to attribute            │
├─────────────────────────────────────────────────────┤
│                    CORE METRICS                     │
│        (Activation, Engagement, Conversion)         │
│            The outcomes your bets target            │
├─────────────────────────────────────────────────────┤
│                   LEADING METRICS                   │
│         (Feature adoption, Task completion)         │
│               Move fast, early signal               │
└─────────────────────────────────────────────────────┘
Metric Types:
| Type | Definition | Example | Use For |
|---|---|---|---|
| Leading | Early signal, fast-moving, directly influenced by feature | Feature adoption rate, task completion rate | Weekly decisions, rollout gates |
| Core | Primary outcome you're targeting | Activation rate, conversion rate, engagement score | Bet success criteria |
| Lagging | Business results, slow-moving, influenced by many factors | Revenue, retention, NPS | Quarterly/annual planning |
| Guardrail | Metrics you won't let degrade | Performance, error rate, support tickets | Rollout gates, rollback triggers |
Hierarchy Example (Activation Bet):
Lagging: Revenue growth (quarterly)
↑
Core: Activation rate (weekly)
↑
Leading: Onboarding completion (daily), first value action (daily)
↑
Guardrail: Support tickets (daily), error rate (real-time)
Metric Selection Criteria:
| Criterion | Question |
|---|---|
| Measurable | Can we actually track this? |
| Actionable | Can we influence it with our work? |
| Attributable | Can we connect changes to our bet? |
| Timely | Will we see signal fast enough to decide? |
Dashboard Design:
- Leading metrics: Real-time or daily
- Core metrics: Weekly view with trend
- Lagging metrics: Monthly/quarterly view
- Guardrails: Alerting, not just reporting
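The hierarchy and dashboard rules above can be captured as data, which makes it possible to check a bet's metric plan before it ships. The sketch below encodes the activation bet example. The `Tier`, `Metric`, and `validate` names are hypothetical, and the alert thresholds are placeholder values, not recommendations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Sequence


class Tier(Enum):
    LEADING = "leading"
    CORE = "core"
    LAGGING = "lagging"
    GUARDRAIL = "guardrail"


@dataclass(frozen=True)
class Metric:
    name: str
    tier: Tier
    review_cadence: str                      # how often the dashboard view refreshes
    alert_threshold: Optional[float] = None  # placeholder values, set per team


ACTIVATION_BET = [
    Metric("revenue_growth", Tier.LAGGING, "quarterly"),
    Metric("activation_rate", Tier.CORE, "weekly"),
    Metric("onboarding_completion", Tier.LEADING, "daily"),
    Metric("first_value_action", Tier.LEADING, "daily"),
    Metric("support_tickets", Tier.GUARDRAIL, "daily", alert_threshold=50.0),
    Metric("error_rate", Tier.GUARDRAIL, "real-time", alert_threshold=0.01),
]


def validate(metrics: Sequence[Metric]) -> List[str]:
    """List gaps in a metric plan: missing tiers and guardrails without alerts."""
    problems = []
    present = {m.tier for m in metrics}
    for tier in Tier:
        if tier not in present:
            problems.append(f"no {tier.value} metric defined")
    for m in metrics:
        # Guardrails need alerting, not just reporting.
        if m.tier is Tier.GUARDRAIL and m.alert_threshold is None:
            problems.append(f"guardrail '{m.name}' has no alert threshold")
    return problems
```

Running `validate(ACTIVATION_BET)` returns an empty list. Dropping a tier or leaving a guardrail without a threshold produces a readable list of gaps to fix before rollout.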
4. Bet Retrospectives
The Core Idea: Every bet concludes with an explicit decision: Scale, Iterate, or Kill.
Retrospective Format:
BET RETROSPECTIVE: [Name]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Timebox: [Duration] | Shipped: [Date]
HYPOTHESIS REVIEW
Original: "We believed [X] would result in [Y]"
Result: [ ] Confirmed [ ] Disproved [ ] Inconclusive
METRICS REVIEW
| Metric | Target | Actual | Verdict |
|-----------|--------|--------|---------|
| Primary | [X] | [Y] | ✅ / ❌ |
| Secondary | [X] | [Y] | ✅ / ❌ |
| Guardrail | [X] | [Y] | ✅ / ❌ |
KEY LEARNINGS
• [Learning 1]
• [Learning 2]
• [Learning 3]
DECISION: [ ] SCALE [ ] ITERATE [ ] KILL
NEXT STEPS
• [Action 1]
• [Action 2]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Decision Framework:
| Outcome | Criteria | Action |
|---|---|---|
| SCALE | Primary metric hit, no guardrail issues | Expand rollout, invest more |
| ITERATE | Signal positive but not at target | Refine and re-test (one more cycle) |
| KILL | Hypothesis d |