REPL Scratchpad
A persistent Python REPL session that acts as a scratchpad for coding agents. Compose multi-step
operations in code. Variables persist across turns. Only print() output enters context.
Based on the Recursive Language Models (RLM) approach by Zhang, Kraska, and Khattab — extended with cross-turn persistence via tmux.
The Problem
Coding agents waste context on raw data. Every tool call result — file contents, API responses, query results — lands in the conversation and stays there forever. By turn 30, the model has forgotten why it started.
The REPL scratchpad fixes this: the agent writes code that processes data inside the REPL and
only print()s what matters. The raw data never enters the conversation.
When to Use
- Multi-step data exploration (query -> filter -> analyze -> summarize)
- Tasks requiring 3+ sequential tool calls that could be composed in one code block
- Working with structured data (JSON, CSV, API responses, database results)
- Accumulating state across turns (storing intermediate results for later use)
- Any task where raw tool output would bloat context unnecessarily
When NOT to Use
- Simple single-file reads or edits (use Read/Edit tools directly)
- Git operations (use Bash directly)
- Tasks with no intermediate data to process
Prerequisites
- tmux installed and available in PATH
- python3 installed and available in PATH
Setup
Start the scratchpad session by running:
bash <skill-path>/scripts/setup.sh
Where <skill-path> is the directory where you cloned this skill (e.g., ~/.claude/skills/repl-scratchpad).
This creates a tmux session named scratchpad with a persistent Python interpreter.
Core Workflow
Sending code to the scratchpad
- Write the code to a temp file
- Tell the scratchpad to execute it
- Read only the output
# Step 1: Write code to temp file
cat > /tmp/scratchpad_cmd.py << 'PYEOF'
import json
data = json.loads(open("/path/to/file.json").read())
filtered = [x for x in data if x["status"] == "error"]
print(f"{len(filtered)} errors found")
for item in filtered[:5]:
    print(f" - {item['name']}: {item['message']}")
PYEOF
# Step 2: Execute in persistent session
tmux send-keys -t scratchpad "_scratchpad_exec('/tmp/scratchpad_cmd.py')" Enter
# Step 3: Wait and read output file (each execution overwrites cleanly)
sleep 1
cat /tmp/_scratchpad_output.txt
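The fixed `sleep 1` in step 3 can return stale output when a command runs longer than a second. One alternative, sketched here, is to poll the output file until its modification time changes. This assumes the helper rewrites /tmp/_scratchpad_output.txt on every execution, as the step 3 comment suggests. `wait_for_output` and its parameters are hypothetical names, not part of the skill.

```python
import os
import time


def wait_for_output(path, previous_mtime, timeout=30.0, interval=0.1):
    """Return the file's text once its mtime differs from previous_mtime.

    Hypothetical helper: assumes each execution rewrites `path`.
    Raises TimeoutError if nothing changes within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            mtime = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            mtime = None  # output file not created yet
        if mtime is not None and mtime != previous_mtime:
            with open(path) as f:
                return f.read()
        time.sleep(interval)
    raise TimeoutError(f"no new output in {path} after {timeout}s")
```

Record the file's mtime (or None if it does not exist yet) before sending the command, then pass it as `previous_mtime`. A changed mtime only shows that writing has started, so for long-running commands a sentinel line at the end of the output would be a more robust completion signal.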
The Print Contract
CRITICAL PRINCIPLE: Only print() output should enter the conversation context.
- DO: print(f"{len(results)} items found") so that only the summary enters context
- DO: print(json.dumps(summary, indent=2)) so that only the structured summary enters context
- DO NOT: return raw query results, file contents, or API responses into context
- DO NOT: use the Bash tool to cat large files; read them inside the scratchpad instead
The scratchpad processes everything. Context sees only what was explicitly printed.
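The setup script that defines `_scratchpad_exec` is not reproduced here. As a rough sketch of how such a helper could enforce the contract, the version below runs a file inside one long-lived namespace and writes only captured stdout to the output file. The name `scratchpad_exec_sketch`, the `_NAMESPACE` dict, and the traceback handling are assumptions for illustration, not the skill's actual implementation.

```python
import contextlib
import io
import traceback

# Shared across calls, so variables assigned by one script are visible to the next.
_NAMESPACE = {"__name__": "__scratchpad__"}


def scratchpad_exec_sketch(code_path, output_path="/tmp/_scratchpad_output.txt"):
    """Run code_path in the shared namespace; write only printed text to output_path."""
    buffer = io.StringIO()
    with open(code_path) as f:
        source = f.read()
    with contextlib.redirect_stdout(buffer):
        try:
            exec(compile(source, code_path, "exec"), _NAMESPACE)
        except Exception:
            # Report the failure through the same channel instead of killing the session.
            print(traceback.format_exc())
    # Opening with "w" truncates, so each execution overwrites the previous output.
    with open(output_path, "w") as out:
        out.write(buffer.getvalue())
```

Values that are computed but never printed stay in `_NAMESPACE` and never reach the output file, which is the behavior the print contract relies on.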
Variable Persistence
Variables assigned in the scratchpad persist across turns:
# Turn 1
services = load_services()
print(f"Loaded {len(services)} services")
# Turn 2 — 'services' is still here
degraded = [s for s in services if s["error_rate"] > 0.05]
print(f"{len(degraded)} degraded")
# Turn 3 — both 'services' and 'degraded' are still here
for s in degraded:
    print(f" {s['name']}: {s['error_rate']:.1%}")
Use this to build up working state incrementally. Store intermediate results in variables instead of dumping them into the conversation.
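Because later turns run in the same namespace, an expensive load can be guarded so that it happens only once. A minimal sketch, assuming the code runs at the top level of the persistent session; `load_services` is a hypothetical stand-in returning made-up sample data.

```python
def load_services():
    # Hypothetical stand-in for an expensive query or API call (sample data only).
    return [
        {"name": "auth", "error_rate": 0.08},
        {"name": "billing", "error_rate": 0.01},
    ]


# Load once; later executions in the same session reuse the existing variable.
if "services" not in globals():
    services = load_services()

print(f"{len(services)} services available")
```

Re-sending the same block in a later turn skips the load and prints the count from the variable already in memory.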
Composition Pattern
Instead of making individual tool calls:
BAD (3 turns, all output in context):
Turn 1: Bash -> curl API -> full JSON response in context
Turn 2: Bash -> jq filter -> filtered output in context
Turn 3: Bash -> analyze -> analysis in context
Compose in one scratchpad execution:
# GOOD (1 turn, only summary in context):
import urllib.request, json
resp = json.loads(urllib.request.urlopen("http://api/services").read())
broken = [s for s in resp if s["error_rate"] > 0.05]
deps = {s["id"]: get_deps(s["id"]) for s in broken}
print(f"{len(broken)} services degraded:")
for s in broken:
    print(f" {s['name']} -> {', '.join(deps[s['id']])}")
Advanced Patterns
Map-Reduce: Fan-Out Processing
Use Python's built-in concurrency to process many files/items in parallel inside the REPL. All processing stays in the scratchpad — only the summary enters context.
import concurrent.futures, os, glob
def analyze_file(path):
    with open(path) as f:
        lines = f.readlines()
    # Count top-level import statements and class definitions.
    imports = [l for l in lines if l.startswith(("import ", "from "))]
    classes = [l for l in lines if l.strip().startswith("class ")]
    return {"path": path, "lines": len(lines), "imports": len(imports), "classes": len(classes)}

files = glob.glob("src/**/*.py", recursive=True)
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(analyze_file, files))
# 200 files processed, zero context used. Only summary prints:
print(f"{len(results)} files: {sum(r['lines'] for r in results):,} lines")
big = sorted(results, key=lambda r: -r['lines'])[:5]
for r in big:
    print(f" {r['lines']:>5} lines {r['path']}")
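`pool.map` re-raises the first exception when its results are consumed, so a single unreadable file would abort the whole batch above. A possible variation, sketched below, wraps each task so that failures come back as data and can be counted in the summary. `safe_call` and `map_with_errors` are hypothetical helper names, and `count_lines` is a simplified stand-in for `analyze_file`.

```python
import concurrent.futures


def safe_call(fn, item):
    """Run fn(item); return (item, result, None) or (item, None, error message)."""
    try:
        return item, fn(item), None
    except Exception as exc:
        return item, None, f"{type(exc).__name__}: {exc}"


def count_lines(path):
    # Simplified stand-in for analyze_file above.
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def map_with_errors(fn, items, max_workers=10):
    """Fan out fn over items in threads; split outcomes into successes and failures."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(lambda item: safe_call(fn, item), items))
    ok = {item: result for item, result, err in outcomes if err is None}
    failed = {item: err for item, _, err in outcomes if err is not None}
    return ok, failed
```

Calling `ok, failed = map_with_errors(count_lines, files)` and printing only `len(ok)` and `len(failed)` keeps the summary to a single line while the per-file errors stay in the scratchpad for later inspection.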
Recursive Drill-Down
Process data in a loop, drilling deeper on each iteration. The REPL accumulates state without re-querying.
# Turn 1: broad scan
import os, glob
all_files = glob.glob("**/*.ts", recursive=True)
by_dir = {}
for f in all_files:
    d = os.path.dirname(f)
    by_dir.setdefault(d, []).append(f)
print(f"{len(all_files)} files across {len(by_dir)} dirs")
for d in sorted(by_dir, key=lambda d: -len(by_dir[d]))[:5]:
    print(f" {len(by_dir[d]):>3} files {d}")
# Turn 2: drill into the largest directory (by_dir still in memory)
target = sorted(by_dir, key=lambda d: -len(by_dir[d]))[0]
details = []
for f in by_dir[target]:
    with open(f) as fh:
        content = fh.read()
    exports = [l for l in content.split('\n') if 'export' in l]
    details.append({"file": os.path.basename(f), "lines": content.count('\n'), "exports": len(exports)})
print(f"\n{target}/ deep dive:")
for d in sorted(details, key=lambda x: -x['lines']):
    print(f" {d['lines']:>4} lines {d['exports']:>2} exports {d['file']}")
Spawning Subagents for Heavy Lifting
When the REPL hits limits (needs LLM reasoning, must read hundreds of files, or requires framework-specific tools), delegate to subagents. The REPL orchestrates the work and collects results.
Claude Code
Claude Code subagents are spawned via the Agent tool from the main conversation. The scratchpad prepares the work, the main agent fans out subagents, and results flow back.
Pattern: use the REPL to identify what needs processing, then ask the main agent to spawn subagents for each item.
# Step 1: REPL identifies the targets
import glob, os
files = glob.glob("src/**/*.ts", recursive=True)
large = [f for f in files if os.path.getsize(f) > 10000]
print("Files needing deep review:")
for f in large:
    print(f" {f} ({os.path.getsize(f)//1000}KB)")
Then tell the main agent:
Spawn parallel subagents to review each of these files:
- src/agent/turn-loop.ts
- src/store/index.ts
- src/server/routes.ts
Each subagent should analyze imports, exports, and complexity.
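The instruction for the main agent can itself be generated in the REPL, so the file list never has to be retyped by hand. A small sketch follows; `build_review_prompt` and the wording of the prompt are illustrative assumptions, not a format that Claude Code requires.

```python
def build_review_prompt(paths, task="analyze imports, exports, and complexity"):
    """Format a delegation request listing one target file per line."""
    lines = ["Spawn parallel subagents to review each of these files:"]
    lines += [f"- {p}" for p in paths]
    lines.append(f"Each subagent should {task}.")
    return "\n".join(lines)


print(build_review_prompt(["src/agent/turn-loop.ts", "src/store/index.ts"]))
```

In practice the list passed in would be the `large` variable computed in step 1, which is still in memory.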
Claude Code spawns subagents using the Agent tool with these key parameters:
- subagent_type: built-in types (Explore, Plan, general-purpose) or custom agents
- run_in_background: set true for parallel execution
- model: haiku for fast/cheap tasks, sonnet/opus for complex analysis
- isolation: "worktree" for subagents that modify files
Custom subagents are defined as markdown files in .claude/agents/ or ~/.claude/agents/.
See Claude Code subagent docs for full reference.
OpenAI Codex
Codex supports multi-agent workflows (ex