AgentOps Operating Model
AgentOps is the operational layer for coding agents.
Publicly, it gives you four things:
- Bookkeeping — captured learnings, findings, and reusable context
- Validation — plan and code review before work ships
- Primitives — single skills, hooks, and CLI surfaces
- Flows — named compositions like /research, /validation, and /rpi
Technically, AgentOps acts as a context compiler: raw session signal becomes reusable knowledge, compiled prevention, and better next work.
Core Flow: RPI
Research → Plan → Implement → Validate
    ↑                            │
    └──── Knowledge Flywheel ────┘
Research Phase
/research <topic> # Deep codebase exploration
ao search "<query>" # Search existing knowledge
ao search "<query>" --cite retrieved # Record adoption when a search result is reused
ao lookup <id> # Pull full content of specific learning
ao lookup --query "x" # Search knowledge by relevance
Output: .agents/research/<topic>.md
Plan Phase
/pre-mortem <spec> # Simulate failures (error/rescue map, scope modes, prediction tracking)
/plan <goal> # Decompose into trackable issues
Output: Beads issues with dependencies
Implement Phase
/implement <issue> # Single issue execution
/crank <epic> # Autonomous epic loop (uses swarm for waves)
/swarm # Parallel execution (fresh context per agent)
Output: Code changes, tests, documentation
Validate Phase
/vibe [target] # Code validation (finding classification + suppression + domain checklists)
/post-mortem # Validation + streak tracking + prediction accuracy + retro history
/retro # Quick-capture a single learning
Output: .agents/learnings/, .agents/patterns/
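The four phases above, with their commands and outputs, can be summarized as a small data structure. The following is an illustrative Python sketch, not part of AgentOps: the `Phase` class, the `RPI_PHASES` tuple, and the `next_phase` helper are hypothetical names, and the wrap-around from Validate back to Research is one reading of the Knowledge Flywheel arrow in the diagram.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One RPI phase as listed above. Hypothetical model, not an AgentOps API."""
    name: str
    commands: tuple[str, ...]
    output: str


# Commands and outputs are copied from the phase listings above.
RPI_PHASES = (
    Phase("Research", ("/research", "ao search", "ao lookup"),
          ".agents/research/<topic>.md"),
    Phase("Plan", ("/pre-mortem", "/plan"),
          "Beads issues with dependencies"),
    Phase("Implement", ("/implement", "/crank", "/swarm"),
          "Code changes, tests, documentation"),
    Phase("Validate", ("/vibe", "/post-mortem", "/retro"),
          ".agents/learnings/, .agents/patterns/"),
)


def next_phase(name: str) -> Phase:
    """Return the phase that follows `name`.

    Assumption: Validate wraps around to Research, mirroring the
    Knowledge Flywheel arrow in the diagram.
    """
    names = [p.name for p in RPI_PHASES]
    index = names.index(name)  # raises ValueError for an unknown phase
    return RPI_PHASES[(index + 1) % len(RPI_PHASES)]


if __name__ == "__main__":
    for phase in RPI_PHASES:
        print(f"{phase.name}: {', '.join(phase.commands)} -> {phase.output}")
    print("After Validate:", next_phase("Validate").name)
```

Modeling the loop as a wrap-around index keeps the sketch small. It says nothing about how AgentOps actually carries learnings from one cycle into the next.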
Phase-to-Skill Mapping
| Phase | Primary Skill | Supporting Skills |
|---|---|---|
| Discovery | /discovery | /brainstorm, /research, /plan, /pre-mortem |
| Implement | /crank | /implement (single issue), /swarm (parallel execution) |
| Validate | /validation | /vibe, /post-mortem, /retro, /forge |
Choosing the skill:
- Use /implement for single issue execution. Now defaults to TDD-first — writes failing tests before implementing. Skip with --no-tdd.
- Use /crank for autonomous epic execution (loops waves via swarm until done). Auto-generates file-ownership maps to prevent worker conflicts.
- Use /discovery for the discovery phase only (brainstorm → search → research → plan → pre-mortem).
- Use /validation for the validation phase only (vibe → post-mortem → retro → forge).
- Use /rpi for full lifecycle — delegates to /discovery → /crank → /validation.
- Use /ratchet to gate/record progress through RPI.
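The mapping table and the selection rules above amount to a lookup from scope of work to a skill. As a rough illustration only, the routing could be written in Python as below. The `PHASE_SKILLS` dictionary, the scope labels in `SCOPE_TO_SKILL`, and the functions `choose_skill` and `rpi_delegation` are hypothetical names invented for this sketch. Nothing in this document indicates that AgentOps exposes such functions.

```python
from __future__ import annotations

# Phase-to-skill mapping, transcribed from the table above.
PHASE_SKILLS = {
    "Discovery": {
        "primary": "/discovery",
        "supporting": ["/brainstorm", "/research", "/plan", "/pre-mortem"],
    },
    "Implement": {
        "primary": "/crank",
        "supporting": ["/implement", "/swarm"],
    },
    "Validate": {
        "primary": "/validation",
        "supporting": ["/vibe", "/post-mortem", "/retro", "/forge"],
    },
}

# Scope labels are hypothetical; the skills they map to follow the rules above.
SCOPE_TO_SKILL = {
    "single-issue": "/implement",
    "epic": "/crank",
    "discovery-only": "/discovery",
    "validation-only": "/validation",
    "full-lifecycle": "/rpi",
    "gate-progress": "/ratchet",
}


def choose_skill(scope: str) -> str:
    """Return the skill suggested for a scope of work."""
    try:
        return SCOPE_TO_SKILL[scope]
    except KeyError:
        known = ", ".join(sorted(SCOPE_TO_SKILL))
        raise ValueError(
            f"unknown scope {scope!r}; expected one of: {known}"
        ) from None


def rpi_delegation() -> list[str]:
    """Primary skills in the order /rpi delegates to them."""
    order = ("Discovery", "Implement", "Validate")
    return [PHASE_SKILLS[phase]["primary"] for phase in order]
```

For example, `choose_skill("epic")` returns `/crank`, and `rpi_delegation()` returns the three primary skills in the order that /rpi hands off to them.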
Start Here (12 starters)
These are the skills every user needs first. Everything else is available when you need it.
| Skill | Purpose |
|---|---|
| /quickstart | Guided onboarding — run this first |
| /bootstrap | One-command full AgentOps setup — fills gaps only |
| /research | Deep codebase exploration |
| /council | Multi-model consensus review + finding auto-extraction |
| /validate | Canonical PASS/WARN/FAIL verdict over an artifact, plan, code change, PR, or gate |
| /vibe | Code validation (classification + suppression + domain checklists) |
| /rpi | Full RPI lifecycle orchestrator (/discovery → /crank → /validation) |
| /implement | Execute single issue |
| /retro --quick | Quick-capture a single learning into the flywheel |
| /status | Single-screen dashboard of current work and suggested next action |
| /goals | Maintain GOALS.yaml fitness specification |
| /push | Atomic test-commit-push workflow |
Advanced Skills (when you need them)
| Skill | Purpose |
|---|---|
| /compile, /flywheel | Active knowledge intelligence and flywheel health — Mine → Grow → Defrag cycle |
| /curate | Canonical miner role for transcripts, .agents/, bd, git, skill diffs, and rare wiki entries |
| /llm-wiki | External reading wiki proposal — raw sources to compiled wiki |
| /harvest | Cross-rig knowledge consolidation — sweep, dedup, promote to global hub |
| /knowledge-activation | Operationalize a mature .agents corpus into beliefs, playbooks, briefings, and gap surfaces |
| /brainstorm | Structured idea exploration before planning |
| /discovery | Full discovery phase orchestrator (brainstorm → search → research → plan → pre-mortem) |
| /plan | Epic decomposition into issues |
| /design | Product validation gate — goal alignment, persona fit, competitive differentiation |
| /pre-mortem | Failure simulation (error/rescue, scope modes, temporal, predictions) |
| /post-mortem | Validation + streak tracking + prediction accuracy + retro history |
| /bug-hunt | Root cause analysis |
| /release | Pre-flight, changelog, version bumps, tag |
| /crank | Autonomous epic loop (uses swarm for each wave) |
| /swarm | Fresh-context parallel execution (Ralph pattern) |
| /evolve | Goal-driven fitness-scored improvement loop |
| /burndown | Bounded epic-completion loop — drive a finite target to all-merged, then stop |
| /eval-outcomes | Grade via Outcomes as a holdout-safe projection of the locked eval substrate — one bar, many runtimes |
| /operating-loop-workflow | Install + run the operating-loop multi-agent Workflow (seven-move loop) |
| /autodev | PROGRAM.md autonomous development contract setup and validation |
| /dream | Interactive Dream operator surface for setup, bedtime runs, and morning reports |
| /doc | Documentation generation — repo docs (default), gold-standard README (--mode=readme), OSS doc packs (--mode=oss) |
| /retro | Quick-capture a learning (full retro → /post-mortem) |
| /validation | Full validation phase orchestrator (vibe → post-mortem → retro → forge) |
| /ratchet | Brownian Ratchet progress gates for RPI workflow |
| /forge | Mine transcripts for knowledge — decisions, learnings, patterns |
| /security | Continuous repository security scanning and release gating |
| /security-suite | Binary and prompt-surface security suite — static analysis, dynamic tracing, offline redteam, policy gating |
| /test | Test generation, coverage analysis, and TDD workflow |
| /hooks-authoring | Author and validate AgentOps runtime hooks |
| /red-team | Persona-based adversarial validation — probe docs and skills from constrained user perspectives |
| /review | Review incoming PRs, agent output, or diffs — SCORED checklist |
| /refactor | Safe, verified refactoring with regression testing at each step |
| /deps | Dependency audit, update, vulnerability scanning, and license compliance |
| /perf | Performance profiling, benchmarking, regression detection, and optimization |
| /system-tuning | Restore system responsiveness via safe, ordered process cleanup and agent-swarm hygiene |
| /scaffold | Project scaffolding, component generation, and boilerplate setup |
| /scenario | Author and manage holdout scenarios for behavioral validation |
| /skill-auditor | Two-pass audit of an existing SKILL.md against the unified template (15 checks) |
| /skill-builder | Scaffold or absorb new SKILL.md files against the unified template |
| /automation-shape-routing | Front door for building agent automation — decide the SHAPE (Workflow vs NTM swarm vs plain skill), then hand off to the right builder |
| /workflow-builder | Scaffold a new Claude Workflow script (.claude/workflows/*.js) — deterministic multi-agent orchestration |
| /agent-native | Make out-of-session agents AgentOps-native via skills + ao CLI + CI, not hooks |
Expert Skills (specialized workflows)
| Skill | Purpose |
|---|---|
| /grafana-platform-dashboard | Build Grafana platf |