Claw Compactor — OpenClaw Skill Reference
Overview
Claw Compactor reduces token usage across the full OpenClaw workspace using 6 compression layers:
| Layer | Name | Cost | Notes |
|---|---|---|---|
| 1 | Rule Engine | Free | Dedup, strip filler, merge sections |
| 2 | Dictionary Encoding | Free | Auto-codebook, $XX substitution |
| 3 | Observation Compression | Free | Session JSONL → structured summaries |
| 4 | RLE Patterns | Free | Path/IP/enum shorthand |
| 5 | Compressed Context Protocol | Free | Format abbreviations |
| 6 | Engram | LLM API | Real-time Observational Memory |
Skill location: skills/claw-compactor/ (relative scripts/ paths in this reference are resolved from this directory)
Entry point: scripts/mem_compress.py
Engram CLI: scripts/engram_cli.py
Auto Mode (Recommended — Run at Session Start)
python3 skills/claw-compactor/scripts/mem_compress.py <workspace> auto
Automatically compresses all workspace files, tracks token counts between runs, and reports savings. Run this at the start of every session.
Core Commands
Full Pipeline (All Layers)
python3 scripts/mem_compress.py <workspace> full
Runs the 5 deterministic layers (1-5) in optimal order; Layer 6 (Engram) needs an LLM API and is invoked separately. Typical combined savings: 50% or more.
Benchmark (Non-Destructive)
python3 scripts/mem_compress.py <workspace> benchmark
# JSON output:
python3 scripts/mem_compress.py <workspace> benchmark --json
Dry-run report showing potential savings without writing any files.
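To use the benchmark from another script, the JSON mode can be wrapped in a small helper. The following is a minimal sketch, not part of the skill: `build_benchmark_command` and `run_benchmark` are hypothetical names, and the sketch assumes that `--json` prints exactly one JSON document to stdout. The structure of that document is not specified here, so the result is returned unparsed beyond `json.loads`.

```python
import json
import subprocess
import sys
from pathlib import Path


def build_benchmark_command(workspace: str, script: str = "scripts/mem_compress.py") -> list[str]:
    """Return the argv for the documented `benchmark --json` invocation."""
    return [sys.executable, script, str(Path(workspace).expanduser()), "benchmark", "--json"]


def run_benchmark(workspace: str) -> object:
    """Run the non-destructive benchmark and parse its stdout as JSON.

    Assumption: with --json the script writes a single JSON document to stdout.
    """
    result = subprocess.run(
        build_benchmark_command(workspace),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Because the benchmark writes no files, a wrapper like this should be safe to call repeatedly, for example before deciding whether a `full` run is worthwhile.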
Individual Layers
# Layer 1: Rule-based compression
python3 scripts/mem_compress.py <workspace> compress
# Layer 2: Dictionary encoding
python3 scripts/mem_compress.py <workspace> dict
# Layer 3: Observation compression (session JSONL → summaries)
python3 scripts/mem_compress.py <workspace> observe
# Layer 4: RLE pattern encoding (runs inside `compress`)
# Layer 5: Compressed Context Protocol (format abbreviations, tokenizer optimization)
python3 scripts/mem_compress.py <workspace> optimize
# Tiered summaries (L0/L1/L2)
python3 scripts/mem_compress.py <workspace> tiers
# Cross-file deduplication
python3 scripts/mem_compress.py <workspace> dedup
# Token count report
python3 scripts/mem_compress.py <workspace> estimate
# Workspace health check
python3 scripts/mem_compress.py <workspace> audit
Global Options
--json          Machine-readable JSON output
--dry-run       Preview without writing files
--since DATE    Filter sessions by date (YYYY-MM-DD)
--auto-merge    Auto-merge duplicates (dedup command)
Engram — Layer 6: Real-Time Observational Memory
Engram is the flagship layer. It operates as a live engine alongside conversations, automatically compressing messages into structured, priority-annotated knowledge.
Prerequisites
Configure via engram.yaml (recommended) or environment variables:
# engram.yaml — place in claw-compactor root
llm:
  provider: openai-compatible
  base_url: http://localhost:8403
  model: claude-code/sonnet
  max_tokens: 4096
threads:
  default:
    observer_threshold: 30000   # pending tokens before Observer fires
    reflector_threshold: 40000  # observation tokens before Reflector fires
concurrency:
  max_workers: 4                # parallel thread workers
# Alternative: environment variables
export ANTHROPIC_API_KEY=sk-ant-... # Preferred
# or
export OPENAI_API_KEY=sk-... # OpenAI-compatible fallback
export OPENAI_BASE_URL=https://... # Optional: custom endpoint (local LLM, etc.)
Engram Auto-Mode (Recommended for Production)
Auto-detects all active threads and processes them concurrently (4 workers, matching max_workers in the config above):
# Single run — auto-detects all threads
python3 scripts/engram_auto.py --workspace ~/.openclaw/workspace
# Via shell wrapper
bash scripts/engram-auto.sh
# Via CLI
python3 scripts/engram_cli.py <workspace> auto --config engram.yaml
python3 scripts/engram_cli.py <workspace> status --thread openclaw-main
python3 scripts/engram_cli.py <workspace> observe --thread openclaw-main
python3 scripts/engram_cli.py <workspace> reflect --thread openclaw-main
Retry: LLM calls retry on 429/5xx with exponential backoff (2s→4s→8s, max 3 attempts). No retry on 400/401/403 (fail fast on config errors).
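The retry policy described above can be sketched as follows. This is an illustration under stated assumptions, not Engram's actual code: `LLMHTTPError` and `call_with_retry` are hypothetical names, and "max 3 attempts" is read as three retries after the initial call, which is what the three listed delays (2s, 4s, 8s) suggest.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# 429 (rate limited) and any 5xx are treated as transient.
RETRYABLE_STATUS = {429} | set(range(500, 600))


class LLMHTTPError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed LLM call."""

    def __init__(self, status: int):
        super().__init__(f"LLM call failed with HTTP {status}")
        self.status = status


def call_with_retry(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying transient failures with delays of 2s, 4s, 8s."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except LLMHTTPError as err:
            # 400/401/403 and other non-transient statuses fail fast.
            if err.status not in RETRYABLE_STATUS:
                raise
            # Retries exhausted: surface the last error.
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; production callers would leave the default.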
Engram via Unified Entry Point
# Check all thread statuses
python3 scripts/mem_compress.py <workspace> engram status
# Force Observer for a thread
python3 scripts/mem_compress.py <workspace> engram observe --thread <thread-id>
# Force Reflector for a thread
python3 scripts/mem_compress.py <workspace> engram reflect --thread <thread-id>
# Print injectable context
python3 scripts/mem_compress.py <workspace> engram context --thread <thread-id>
Engram via Dedicated CLI
# Status: all threads
python3 scripts/engram_cli.py <workspace> status
# Status: single thread
python3 scripts/engram_cli.py <workspace> status --thread <thread-id>
# Force observe
python3 scripts/engram_cli.py <workspace> observe --thread <thread-id>
# Force reflect
python3 scripts/engram_cli.py <workspace> reflect --thread <thread-id>
# Import conversation from file (JSON array or JSONL)
python3 scripts/engram_cli.py <workspace> ingest \
    --thread <thread-id> --input /path/to/conversation.jsonl
# Get injectable context string (ready for system prompt)
python3 scripts/engram_cli.py <workspace> context --thread <thread-id>
# JSON output for any command
python3 scripts/engram_cli.py <workspace> status --json
python3 scripts/engram_cli.py <workspace> context --thread <id> --json
Engram Daemon Mode (Real-Time Streaming)
# Start daemon, pipe JSONL messages via stdin
python3 scripts/engram_cli.py <workspace> daemon --thread <thread-id>
# Pipe a message:
echo '{"role":"user","content":"Hello!","timestamp":"12:00"}' | \
    python3 scripts/engram_cli.py <workspace> daemon --thread <thread-id>
# Control commands (send each as a JSONL line on the daemon's stdin, e.g. by piping as above):
echo '{"__cmd":"observe"}' # force observe now
echo '{"__cmd":"reflect"}' # force reflect now
echo '{"__cmd":"status"}' # print thread status JSON
echo '{"__cmd":"quit"}' # exit daemon
# Quiet mode (suppress startup messages on stderr)
python3 scripts/engram_cli.py <workspace> daemon --thread <id> --quiet
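Since the daemon reads one JSON object per line, a whole batch of messages and control commands can be generated by a small script and piped in at once. The helper below is a minimal sketch: `to_jsonl` is a hypothetical name, and the message fields simply mirror the example above (`role`, `content`, `timestamp`).

```python
import json
from typing import Iterable, Iterator


def to_jsonl(messages: Iterable[dict], quit_when_done: bool = True) -> Iterator[str]:
    """Yield one compact JSON object per line, suitable for the daemon's stdin.

    When quit_when_done is true, the documented {"__cmd": "quit"} control
    command is appended so the daemon exits after the batch.
    """
    for msg in messages:
        # json.dumps escapes embedded newlines, so each object stays on one line.
        yield json.dumps(msg, ensure_ascii=False) + "\n"
    if quit_when_done:
        yield json.dumps({"__cmd": "quit"}) + "\n"
```

A script that writes `to_jsonl(...)` to stdout with `sys.stdout.writelines` could then be piped into `python3 scripts/engram_cli.py <workspace> daemon --thread <thread-id>`, with control commands such as `{"__cmd": "observe"}` interleaved as ordinary items.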
Engram Python API
from scripts.lib.engram import EngramEngine
engine = EngramEngine(
    workspace_path="/path/to/workspace",
    observer_threshold=30_000,       # tokens before auto-observe
    reflector_threshold=40_000,      # tokens before auto-reflect
    anthropic_api_key="sk-ant-...",  # or set ANTHROPIC_API_KEY env
)
# Add a message — auto-triggers observe/reflect when thresholds exceeded
status = engine.add_message("thread-id", role="user", content="Hello!")
# Returns: {"observed": bool, "reflected": bool, "pending_tokens": int, ...}
# Manual trigger regardless of thresholds
obs_text = engine.observe("thread-id") # returns None if no pending msgs
ref_text = engine.reflect("thread-id") # returns None if no observations
# Get full context dict
ctx = engine.get_context("thread-id")
# Returns: {"thread_id", "observations", "reflection", "recent_messages", "stats", "meta"}
# Build injectable system context string
ctx_str = engine.build_system_context("thread-id")
# Ready to prepend to system prompt
Engram Configuration Variables
| Variable | Default | Description |
|---|---|---|
| ANTHROPIC_API_KEY | — | Anthropic API key (preferred) |
| OPENAI_API_KEY | — | OpenAI-compatible API key |
| OPENAI_BASE_URL | https://api.openai.com | Custom endpoint for local LLMs |
| OM_OBSERVER_THRESHOLD | 30000 | Pending tokens before auto-observe |
| OM_REFLECTOR_THRESHOLD | 40000 | Observation tokens before auto-reflect |
| OM_MODEL | claude-opus-4-5 | LLM model override |
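The variables in the table can be resolved with plain environment lookups. The sketch below only illustrates the documented defaults and the documented key preference (Anthropic first, OpenAI-compatible as fallback); `read_thresholds` and `pick_api_key` are hypothetical helpers, not functions exposed by Engram.

```python
import os
from typing import Mapping, Optional

DEFAULT_OBSERVER_THRESHOLD = 30_000
DEFAULT_REFLECTOR_THRESHOLD = 40_000


def read_thresholds(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read OM_* threshold overrides, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "observer_threshold": int(env.get("OM_OBSERVER_THRESHOLD", DEFAULT_OBSERVER_THRESHOLD)),
        "reflector_threshold": int(env.get("OM_REFLECTOR_THRESHOLD", DEFAULT_REFLECTOR_THRESHOLD)),
    }


def pick_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[tuple[str, str]]:
    """Return (variable name, value) for the first configured key, or None.

    ANTHROPIC_API_KEY is checked first because the table marks it as preferred.
    """
    env = os.environ if env is None else env
    for name in ("ANTHROPIC_API_KEY", "OPENAI_API_KEY"):
        value = env.get(name)
        if value:
            return name, value
    return None
```

Accepting an explicit mapping instead of always reading `os.environ` makes the lookup easy to exercise in tests without touching the real environment.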
Threshold Tuning Quick Reference
Each Observer call ≈ 2K output tokens (Sonnet). Daily volume at default 30K threshold:
| Channel | Daily Tokens | @30K threshold | @10K threshold |
|---|---|---|---|