Simmer

Iterative refinement loop — take an artifact (single file or workspace) and hone it repeatedly against user-defined criteria until it's as good as it can get.

Related skills (test-kitchen family):

test-kitchen:omakase-off — don't know what you want → parallel designs → react → pick
test-kitchen:cookoff — know what you want, it's code → parallel implementations → fixed criteria → steal the best
simmer — know what you want, it's anything → user-defined criteria → iterate until good

Flow

"Simmer this" / "Refine this" / "Optimize this pipeline"
    ↓
┌─────────────────────────────────────┐
│  SETUP (identify + criteria)        │
│  Load simmer-setup subskill         │
│                                     │
│  Output: artifact, rubric, N iters, │
│  evaluator (optional),              │
│  background (optional)              │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  LOOP (default 3 iterations)        │
│                                     │
│  Each iteration:                    │
│  1. Dispatch generator subagent     │
│  2. Run evaluator (if present)      │
│  3. Dispatch judge subagent         │
│  4. Load reflect subskill           │
│                                     │
│  Generator gets: candidate + ASI    │
│           + background              │
│  Judge gets: candidate + rubric     │
│       + evaluator output (if any)   │
│  Reflect gets: full score history   │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  OUTPUT                             │
│  Best candidate → result file       │
│  Score trajectory displayed         │
└─────────────────────────────────────┘
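The flow above can be sketched as a small driver loop. This is a minimal illustration, not the skill's actual implementation: `simmer_loop` and its `generate`, `judge`, and `evaluate` callables are hypothetical stand-ins for the subagent dispatches, and the reflect step is reduced to tracking the best score and the score trajectory.

```python
from typing import Callable, List, Optional, Tuple

def simmer_loop(
    seed: str,
    iterations: int,
    generate: Callable[[str, str], str],  # (best candidate, ASI) -> new candidate
    judge: Callable[[str, Optional[str]], Tuple[float, str]],  # -> (score, ASI)
    evaluate: Optional[Callable[[str], str]] = None,  # optional evaluator
) -> Tuple[str, List[float]]:
    """Return the best candidate seen and the full score trajectory."""
    # Iteration 0: judge the seed without running the generator.
    eval_out = evaluate(seed) if evaluate else None
    best_score, asi = judge(seed, eval_out)
    best = seed
    trajectory = [best_score]

    for _ in range(iterations):
        candidate = generate(best, asi)                        # step 1
        eval_out = evaluate(candidate) if evaluate else None   # step 2
        score, asi = judge(candidate, eval_out)                # step 3
        trajectory.append(score)                               # step 4 (simplified reflect)
        if score > best_score:
            best, best_score = candidate, score
    return best, trajectory
```

Passing the best candidate (rather than the most recent one) to the generator follows the "current best candidate" wording in the generator prompt; how regressions are handled in detail is left to the reflect subskill.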

When to Use

Trigger when the user wants iterative refinement of any kind:

"Simmer this", "refine this", "hone this", "iterate on this"
"Make this better", "improve this over a few rounds"
"Polish this", "tighten this up"
"Optimize this pipeline", "find the best model for this task"
"Tune this configuration", "improve these prompts against this test suite"
Any request to iteratively improve an artifact or workspace

Judge mode is auto-selected by setup based on problem complexity:

Condition	JUDGE_MODE
text/creative, ≤2 criteria, short artifact (email, tweet, tagline)	`single`
text/creative, ≥3 criteria or long/complex artifact	`board`
code/testable (any)	`board`
pipeline/engineering (any)	`board`
User says "with a single judge"	`single` (override)
User says "with a judge board" or "with a panel"	`board` (override)
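The selection rules in the table can be read as a short decision function. The sketch below assumes setup has already classified the problem; `select_judge_mode` and its parameter names are hypothetical, and what counts as a "short" artifact is left to the caller.

```python
from typing import Optional

def select_judge_mode(
    problem_class: str,              # "text/creative", "code/testable", "pipeline/engineering"
    num_criteria: int,
    short_artifact: bool,            # e.g. email, tweet, tagline
    override: Optional[str] = None,  # "single" or "board" if the user asked for one
) -> str:
    # An explicit user request always wins.
    if override in ("single", "board"):
        return override
    # Only short text/creative artifacts with at most two criteria get a single judge.
    if problem_class == "text/creative" and num_criteria <= 2 and short_artifact:
        return "single"
    # Everything else (code, pipelines, longer or multi-criteria text) gets the board.
    return "board"
```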

Plateau upgrade: If the loop started with a single judge and detects a plateau (3 iterations without improvement), offer: "Scores have plateaued. Switch to judge board for deeper diagnosis?" If the user accepts, switch to JUDGE_MODE: board for remaining iterations.
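One way to detect the plateau is to compare the most recent scores against the best score seen before them. This is a sketch under an assumption the text does not spell out: "improvement" is taken to mean a score strictly above the previous best. The `has_plateaued` name and `window` parameter are hypothetical.

```python
from typing import List

def has_plateaued(scores: List[float], window: int = 3) -> bool:
    """True if none of the last `window` scores beat the best earlier score."""
    # Not enough history yet to say anything.
    if len(scores) <= window:
        return False
    best_before = max(scores[:-window])
    return max(scores[-window:]) <= best_before
```

For example, the trajectory `[5, 6, 6, 6, 5]` has plateaued, while `[5, 6, 7, 6, 8]` has not.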

Not simmer: If the artifact is code and the user wants parallel implementations, use cookoff instead.

Orchestration

Announce: "I'm using the simmer skill to set up iterative refinement."

Track progress (TodoWrite if available, otherwise inline):

Setup — identify artifact, elicit criteria, determine evaluation method
Refinement loop (N iterations)
Output best version with score trajectory

Phase 1: Setup

Invoke simmer:simmer-setup.

Do not attempt to identify the artifact or ask about criteria yourself — that is the setup subskill's job.

Shortcut: If the user (or calling system) has already provided artifact, criteria (each with at least one sentence describing what a high score looks like), iteration count, mode, and optionally evaluator/background, skip the setup subskill entirely. Construct the setup brief directly and proceed to Phase 2.

Setup returns a brief:

ARTIFACT: [content, file path, or directory path]
ARTIFACT_TYPE: [single-file | workspace]
CRITERIA:
  - [criterion 1]: [what better looks like]
  - [criterion 2]: [what better looks like]
  - [criterion 3]: [what better looks like]
PRIMARY: [criterion name — omit if equally weighted]
EVALUATOR: [command to run — omit for judge-only mode]
BACKGROUND: [constraints, available resources, domain knowledge — omit if not needed]
OUTPUT_CONTRACT: [valid output format description — omit for text/creative]
VALIDATION_COMMAND: [quick check command — omit if no cheap validation exists]
SEARCH_SPACE: [what's in scope to explore — omit if unconstrained]
JUDGE_MODE: [single | board — auto-selected by setup based on complexity. User can override]
JUDGE_PANEL: [optional custom judge definitions — omit to use defaults for problem class]
ITERATIONS: [N]
MODE: [seedless | from-file | from-paste | from-workspace]
OUTPUT_DIR: [path, default: docs/simmer]
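The shortcut condition can be checked mechanically before deciding whether to invoke the setup subskill. The following is a sketch only: it assumes the brief is held as a dictionary keyed by the field names above, with CRITERIA as a mapping from criterion name to its description. `can_skip_setup` is a hypothetical helper, not part of the skill.

```python
from typing import Any, Dict

# Fields the shortcut requires; evaluator and background stay optional.
REQUIRED_FIELDS = ("ARTIFACT", "CRITERIA", "ITERATIONS", "MODE")

def can_skip_setup(brief: Dict[str, Any]) -> bool:
    """True if the caller already supplied everything the setup subskill would elicit."""
    for key in REQUIRED_FIELDS:
        if brief.get(key) in (None, "", {}, []):
            return False
    criteria = brief["CRITERIA"]
    # Every criterion needs a non-empty description of what a high score looks like.
    return isinstance(criteria, dict) and all(
        isinstance(desc, str) and desc.strip() for desc in criteria.values()
    )
```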

Phase 2: Refinement Loop

For single-file mode:

mkdir -p {OUTPUT_DIR}

For workspace mode:

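# Create initial commit to snapshot the seed state
cd {ARTIFACT}
# Initialize a repository if the workspace is not already under git
git rev-parse --is-inside-work-tree >/dev/null 2>&1 || git init
# --allow-empty keeps the snapshot commit from failing on a clean tree
git add -A && git commit --allow-empty -m "simmer: iteration 0 — seed state"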

Iteration counting:

"N iterations" means N generate-judge-reflect cycles AFTER the initial seed judgment. The seed judgment is iteration 0 (not counted toward N). So ITERATIONS: 3 means:

Iteration 0: Judge the seed (no generator)
Iteration 1: Generate → Judge → Reflect
Iteration 2: Generate → Judge → Reflect
Iteration 3: Generate → Judge → Reflect
Total: 3 generation passes + 1 seed judgment = 4 judge rounds

For seedless mode: iteration 1 generates the initial candidate AND judges it. ITERATIONS: 3 means 3 generation passes total.
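The counting rule is easy to get off by one, so here is the arithmetic as a small function. `round_counts` is a hypothetical helper; the seedless judge-round count is inferred from the statement that iteration 1 both generates and judges.

```python
from typing import Dict

def round_counts(iterations: int, seedless: bool) -> Dict[str, int]:
    """Number of generation passes and judge rounds for a given ITERATIONS value."""
    generation_passes = iterations
    # A seeded run adds one extra judge round for iteration 0 (the seed judgment).
    judge_rounds = iterations if seedless else iterations + 1
    return {"generation_passes": generation_passes, "judge_rounds": judge_rounds}
```

With `ITERATIONS: 3`, a seeded run gives 3 generation passes and 4 judge rounds; a seedless run gives 3 of each.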

Iteration 0 (seed):

Single-file mode:

Write the seed artifact to {OUTPUT_DIR}/iteration-0-candidate.md
If seedless: there is no seed to write or judge; dispatch generator subagent to produce initial candidate from description + criteria, then judge it (this counts as iteration 1)
If from-file or from-paste: the seed IS the starting artifact — judge it directly (no generator)

Workspace mode:

The seed is the current state of the workspace directory
If from-workspace: judge the current state directly (no generator)
If seedless: dispatch generator to scaffold the initial workspace, then judge it (this counts as iteration 1)

Each iteration:

Step 1: Generator (subagent)

Invoke simmer:simmer-generator as a subagent.

Single-file subagent prompt:

You are the generator in a simmer refinement loop.

Invoke the skill: simmer:simmer-generator

ITERATION: [N]
ARTIFACT_TYPE: single-file
CRITERIA:
[rubric from setup]

CURRENT CANDIDATE:
[full text of current best candidate]

JUDGE FEEDBACK (ASI from previous round):
[ASI text, or "First iteration — generate initial candidate" if seedless iteration 1]

Write your improved candidate to: {OUTPUT_DIR}/iteration-[N]-candidate.md
(or appropriate extension matching artifact type)

Report: what specifically changed and why (2-3 sentences).

Workspace subagent prompt:

You are the generator in a simmer refinement loop.

Invoke the skill: simmer:simmer-generator

ITERATION: [N]
ARTIFACT_TYPE: workspace
WORKSPACE: [directory path]
CRITERIA:
[rubric from setup]

BACKGROUND:
[constraints, available resources, domain knowledge from setup]

OUTPUT_CONTRACT:
[valid output format — omit if not specified in setup]

VALIDATION_COMMAND:
[quick check command — omit if not specified in setup]

SEARCH_SPACE:
[what's in scope to explore — omit if not specified in setup]

JUDGE FEEDBACK (ASI from previous round):
[ASI text — may describe coordinated changes across multiple files]

EXPLORATION STATUS:
[from reflect: what's been tried vs untried — omit on iteration 1 or if no search space]

Make your changes directly in the workspace directory.
You may edit multiple files in a single iteration when the ASI calls for coordinated changes.
If making infrastructure changes, run VALIDATION_COMMAND (if available) before reporting success.

Report: what specifically changed and why (2-3 sentences).
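Assembling the workspace prompt mostly comes down to dropping the optional sections that setup did not provide. The sketch below assumes the setup brief is a dictionary keyed by field name; `build_workspace_prompt` is a hypothetical helper, and the closing instructions of the template are left out for brevity.

```python
from typing import Dict, Optional

# Sections that are omitted when setup did not specify them.
OPTIONAL_SECTIONS = ("BACKGROUND", "OUTPUT_CONTRACT", "VALIDATION_COMMAND", "SEARCH_SPACE")

def build_workspace_prompt(
    iteration: int,
    workspace: str,
    criteria: str,
    asi: str,
    brief: Dict[str, str],
    exploration_status: Optional[str] = None,
) -> str:
    parts = [
        "You are the generator in a simmer refinement loop.",
        "Invoke the skill: simmer:simmer-generator",
        f"ITERATION: {iteration}\nARTIFACT_TYPE: workspace\n"
        f"WORKSPACE: {workspace}\nCRITERIA:\n{criteria}",
    ]
    for name in OPTIONAL_SECTIONS:
        if brief.get(name):
            parts.append(f"{name}:\n{brief[name]}")
    parts.append(f"JUDGE FEEDBACK (ASI from previous round):\n{asi}")
    # Exploration status is omitted on iteration 1 or when reflect produced none.
    if exploration_status and iteration > 1:
        parts.append(f"EXPLORATION STATUS:\n{exploration_status}")
    return "\n\n".join(parts)
```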

Step 2: Run Evaluator (if present)

If the setup brief includes an EVALUATOR command, run it against the current candidate and pass its output to the judge in Step 3.
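Running the evaluator could look like the sketch below, which captures the exit code and both output streams so they can be handed to the judge. `run_evaluator`, the ten-minute default timeout, and the output layout are assumptions for illustration; the skill does not prescribe them.

```python
import subprocess

def run_evaluator(command: str, cwd: str = ".", timeout: float = 600) -> str:
    """Run the EVALUATOR command and return its output as text for the judge."""
    # shell=True because the brief stores the evaluator as a single command string.
    result = subprocess.run(
        command,
        shell=True,
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # A non-zero exit code is reported rather than raised: a failing
    # evaluator run is still useful evidence for the judge.
    return (
        f"exit code: {result.returncode}\n"
        f"stdout:\n{result.stdout}\n"
        f"stderr:\n{result.stderr}"
    )
```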
