Wizard: scans codebase, proposes metric/guard/agent config, writes program.md run spec. Also runs cProfile on file path to surface bottlenecks before prompting for optimization goal.
NOT for: running experiments (use /research:run); methodology validation (use /research:judge); full pipeline from goal to result (use /research:sweep).
Agent Resolution
_RESEARCH_SHARED=$(python "${CLAUDE_PLUGIN_ROOT:-plugins/research}/bin/resolve_shared.py" 2>/dev/null) # timeout: 5000
Read $_RESEARCH_SHARED/agent-resolution.md. Contains: foundry check + fallback table. If foundry not installed: use table to substitute each foundry:X with general-purpose. Agents this skill uses: foundry:solution-architect, foundry:perf-optimizer.
Plan Mode (Steps P-P0–P-P4)
<!-- P-P prefix = Plan-mode steps; R-prefix = Run-mode steps; these labels appear in task-tracking instructions -->Triggered by plan <goal|file>. Wizard configures run.
Task tracking: create tasks for P-P0, P-P1, P-P2, P-P2b, P-P3 at start; add P-P4 only if --team detected in arguments.
Unsupported flag check — strip --team from $ARGUMENTS, scan remaining tokens for any --<token>. If found: print ! Unknown flag(s): \--<token>`. Supported: `--team`.then invokeAskUserQuestion` — (a) Abort (stop, re-invoke with correct flags) · (b) Continue ignoring (skip unknown flags, proceed). On Abort: stop.
Step P-P0: Detect input type
Parse <input> from arguments. Determine: file path or goal string:
First, extract first positional token (strip all --<flag> tokens from $ARGUMENTS, take first remaining token as FILE_ARG). Then:
Disambiguation guard — only treat FILE_ARG as a file path if it actually exists on disk. Multi-token strings ($ARGUMENTS containing spaces beyond FILE_ARG) are always goal text — never run test -f on the first token of a multi-token goal:
# Count tokens in $ARGUMENTS (excluding flags). If >1, FILE_ARG is part of a goal string.
NONFLAG_TOKEN_COUNT=$(echo "$ARGUMENTS" | tr ' ' '\n' | grep -v '^--' | grep -v '^$' | wc -l | tr -d ' ') # timeout: 5000
NONFLAG_TOKEN_COUNT == 1ANDtest -f "$FILE_ARG"succeeds → file path.FILE_ARGis the script to profile. Enter profiling flow.- Otherwise (multi-token, or single token not on disk) → goal string. Use full
$ARGUMENTS(minus flags) as<goal>. Skip to Step P-P1.
Profiling flow (file path detected):
Run baseline profiling using FILE_ARG only — never use raw $ARGUMENTS in cProfile command:
CPROFILE_OUT=$(mktemp -t research-plan-XXXX) # timeout: 3000
python -m cProfile -s cumtime "$FILE_ARG" > "$CPROFILE_OUT" 2>&1 # timeout: 600000
PROFILE_EXIT=$?
if [ $PROFILE_EXIT -ne 0 ]; then
echo "⚠ cProfile failed (exit $PROFILE_EXIT) — continuing without profile data."
echo " Wizard will prompt for an optimization goal from the goal-string path instead of bottleneck selection."
PROFILE_AVAILABLE=false
else
PROFILE_AVAILABLE=true
head -40 "$CPROFILE_OUT" # timeout: 5000
time python "$FILE_ARG" # timeout: 600000
fi
Fallback path — if PROFILE_AVAILABLE=false: skip the bottleneck selection menu; invoke AskUserQuestion with a single open-ended option: "Describe the optimization goal for <file> (cProfile data unavailable — provide goal string directly)". Use the user's response as <goal> and proceed to P-P1. Never hard-exit on cProfile failure — degraded mode (goal-string only) is always available.
Profile-available path — when PROFILE_AVAILABLE=true, present top 5 bottleneck functions:
Top bottleneck functions:
1. <function> — <cumtime>s (<percentage>%)
2. <function> — <cumtime>s (<percentage>%)
...
Invoke AskUserQuestion — "What would you like to optimize?", options: (a) Overall execution time · (b) Memory usage · (c) Specific function: <top function name> (currently <time>s) · (d) Custom goal (describe).
Construct goal string from selection:
- (a) →
"Reduce wall-clock execution time of <file>" - (b) →
"Reduce peak memory usage of <file>" - (c) →
"Optimize <function> in <file> (currently <time>s)" - (d) → user's text
Set as <goal>, proceed to P-P1.
Step P-P1: Parse and scan
Scope guard (first action): Before scanning, check <goal> is optimization goal. Input clearly not optimization goal (code question, regex/algo explanation, debug question, any prompt without measurable improvement target) → invoke AskUserQuestion:
- question: "This input does not look like an optimization goal (
/research:planexpects 'Reduce X' / 'Increase Y' / 'Improve Z metric'). How to proceed?" - (a) label:
rephrase as optimization goal— description: provide revised goal with measurable improvement target - (b) label:
abort— description: stop; use/researchfor explanatory questions
Stop if user selects (b). Do not proceed to P-P2 or P-P3 without valid optimization goal.
Parse <goal>. Scan codebase to detect:
- Language and framework (Python, PyTorch, pytest, etc.)
- Available test runners or benchmark scripts
- Candidate metric commands (pytest coverage, benchmark scripts, eval scripts)
- Candidate guard commands (test suite, lint, type check)
- Files relevant to goal (scope files)
Step P-P2: Present proposed config
Present config as code block for review. Include:
metric_cmd: [command that prints a single numeric result]
metric_direction: higher | lower
guard_cmd: [command that must pass (exit 0) on every kept commit]
max_iterations: [default 20]
agent_strategy: [auto | perf | code | ml | arch]
scope_files: [files the ideation agent may modify]
compute: local | colab | docker
Dry-run both commands before presenting (add # timeout: 60000 to timed bash calls — user commands may run for minutes). Failure → flag error, propose corrections. Do not proceed to P-P3 until user confirms or edits.
Step P-P2b: Agent validation (pre-write)
After user confirms, run expert agent review before writing program.md. Dispatch conditional on goal type — run whichever apply in parallel.
Foundry availability check — before dispatching any foundry:* agent:
_FOUNDRY_AVAILABLE=$(find ~/.claude/plugins/cache -path "*/foundry*" -name "solution-architect.md" 2>/dev/null | head -1) # timeout: 5000
If _FOUNDRY_AVAILABLE empty: skip architecture and perf reviews entirely; print ⚠ foundry plugin not installed — skipping foundry:solution-architect and foundry:perf-optimizer reviews. Continuing without architecture/perf advisory.; record gap in advisory block as architect: skipped (foundry absent). Proceed to P-P3 with available advisor output (scientist only if ML keywords matched).
Pre-spawn — create plan run dir (review files share single timestamped dir):
PLAN_RUN_DIR=$(python "${CLAUDE_PLUGIN_ROOT:-plugins/research}/bin/make_run_dir.py" "plan" ".experiments" 2>/dev/null) # timeout: 5000
Health monitoring (CLAUDE.md §6) — create one checkpoint per parallel agent so individual stalls are detectable (ADV-H16). Without per-agent checkpoints a single live agent masks two stalled ones:
# Plan-mode health constants — ADV-L19 (constants YAML not auto-exported to bash)
PLAN_TIMEOUT_SEC="${PLAN_TIMEOUT_SEC:-600}"
PLAN_MONITOR_INTERVAL="${PLAN_MONITOR_INTERVAL:-300}"
PLAN_HARD_CUTOFF="${PLAN_HARD_CUTOFF:-900}"
# Per-agent checkpoints — TMPDIR-relative, timestamp-suffixed to avoid collisions
_TS=$(date +%s)
ARCH_CHECK="${TMPDIR:-/tmp}/research-plan-arch-${_TS}"
SCI_CHECK="${TMPDIR:-/tmp}/research-plan-sci-${_TS}"
PERF_CHECK="${TMPDIR:-/tmp}/research-plan-perf-${_TS}"
touch "$ARCH_CHECK" "$SCI_CHECK" "$PERF_CHECK" # timeout: 3000
# Helper checkpoints from health_monitor_start.py — retained for LAUNCH_AT bookkeeping
_HM=$(python "${CLAUDE_P