Handoff Trigger Check
Evaluate whether work-in-design is ready to hand off to autonomous execution. Runs a 7-condition gate, applies the active profile's pass threshold, and returns a structured verdict. The single most important check before letting an agent runtime — Claude Code, an n8n workflow, an OpenClaw instance, or any other autonomous executor — take over from human-led design.
The premise: the costliest mistake in human/agent collaboration is holding work in human hands after it's ready to be executed. A documented production case shipped 27 similar instances in under 24 hours of autonomous Claude Code time after a week of human-led ground-truth setup on the first instance; the week was necessary, but the cost was holding instances 2-27 in human hands for any time at all after the engine passed verification on instance 1. This Skill exists to make that handoff moment a measurable trigger, not a vibes call.
Important
Profile-agnostic by design. The Skill takes a profile config as input. The profile sets the pass threshold per condition (e.g., a "sleeping" profile may require 95% confidence on all 7 conditions because human-in-the-loop is unavailable; a "desk" profile may accept 60% because corrections happen in real time). No human's name, schedule, or specific availability appears in the Skill itself. Profiles are configuration data shipped separately.
Token budget is one of the seven conditions, not an afterthought. Autonomous runs that exceed budget mid-execution leave half-finished work and degrade trust. Budget headroom is checked as a precondition. Where token telemetry is unavailable, the profile may declare an estimated token budget that the requestor must input.
Output is structured JSON, not prose. The Skill emits a parseable verdict that an orchestrator can act on without an LLM round-trip. Human-readable summary is provided alongside but is secondary.
The Skill answers a binary question, not a graduated one. Either the work is ready to hand off or it is not. When not, the verdict names the specific blocker(s) so the design conversation can address exactly those gaps and re-check.
Deployment target. This Skill runs on the chat-Project (CP) side of the design→execution boundary — typically inside a Claude Project where design conversations happen. It is the literal mechanism that decides when work moves from CP to autonomous execution; placing it on the CC side defeats its purpose.
Invocation Modes
This Skill supports three modes of invocation. All three produce the same structured JSON verdict against the same 7 conditions. They differ only in how work_context is gathered and how the result is delivered.
Mode 1 — Deliberate Gate
Trigger: Caller provides a fully-formed work_context payload — typically because an orchestrator, scheduled job, or deliberate user action invoked the Skill with structured input ready for evaluation.
Behavior: Mechanical. Skill receives profile + work_context, evaluates the 7 conditions against provided evidence, applies the profile's threshold rule, returns JSON verdict. No conversational scaffolding.
Output: Structured JSON only (or JSON + a brief human-readable summary when verbose: true).
When this mode fits: Orchestrator-driven invocation; CI/CD or scheduled workflow; user has done the prep work and just wants the verdict; programmatic callers that will parse the JSON and act on it.
This is the v1.0 behavior and remains byte-identical for backward compatibility. Existing v1.0 callers do not need to change anything.
Mode 2 — Proactive Sensing
Trigger: Claude is participating in a design conversation (typically a chat Project, not an autonomous Claude Code session) and detects signals that the work being discussed is approaching handoff readiness. See "Proactive Sensing Triggers" below for the signal list. The Skill is not invoked by name — Claude initiates the offer when signals warrant.
Behavior: Claude offers the check rather than running it silently:
"I'm noticing [specific signal X, specific signal Y]. The handoff-gate is the formal mechanism for evaluating whether this work is ready to move to autonomous execution. Want me to run it? I can either gather the evidence from our conversation or walk through the 7 conditions with you one at a time."
If the user accepts, Claude either:
- (a) gathers
work_contextfrom the existing conversation history and runs Mode 1 mechanics — recommended when the conversation already contains the evidence, or - (b) shifts into Mode 3 (conversational walkthrough) — recommended when the user prefers to talk through it.
If the user declines, Claude notes the offer was declined and continues the design conversation. Do not re-offer in the same turn; re-offer only on a new triggering signal cluster. See references/sensing-triggers-detailed.md for the full re-offer discipline.
Output: Same structured JSON verdict as Mode 1, plus a brief conversational framing explaining why the offer was made and how the verdict reads against the work just discussed.
When this mode fits: Chat-interface design conversations where the human is not yet thinking about handoff but the work is ready. The Skill's job here is to surface the readiness moment — not to wait to be asked.
Mode 3 — Conversational Walkthrough
Trigger: User explicitly requests a walkthrough ("walk me through readiness," "talk me through the handoff check," "let's evaluate whether we're ready"), OR Mode 2 was accepted with a preference to discuss rather than auto-evaluate, OR a user new to the Skill wants to understand each condition.
Behavior: Claude walks the 7 conditions as discussion topics, one at a time, in order:
- Frame the condition (one sentence — what it's checking, why it matters).
- Ask what evidence supports it for this work.
- Capture the evidence (or note its absence).
- Mark pass/fail and move to the next condition.
After all 7 are walked, produce both:
- The structured JSON verdict (so the result is parseable and loggable).
- A conversational summary (3-6 sentences) naming the verdict, the most important blockers if any, and the recommended next steps.
Output: Structured JSON + conversational summary (always — verbose: true is implicit in this mode).
When this mode fits: Pedagogically valuable — first time a user is using the Skill on a given class of work, novel work where evidence is uncertain, or any case where the user wants to understand why the verdict is what it is rather than just receive it. Also the right mode when the design conversation hasn't yet produced clean evidence and a walk-through is what generates it.
Proactive Sensing Triggers
Mode 2 fires when Claude detects two or more signals from distinct categories in a design conversation. A single weak signal is not sufficient; the cluster-of-two rule prevents false positives from casual mentions or vocabulary use.
| Signal | Pattern in conversation |
|---|---|
| Decomposition language | "There are N more like this," "this pattern repeats," "now we just need to do it for each X" |
| Pump-primer language | "We did the first one manually," "instance #1 shipped," "the first run is done" |
| Copy-paste loops emerging | User describes manually doing the same operation across multiple files / instances / records |
| Multi-file / multi-instance edits being discussed | "Do this for all 27 of them," "across the whole batch," "for each vendor code" |
| Repetition signal | "I keep doing this," "I do this every week," "this is the third time" |
| Explicit CC mentions | User mentions Claude Code, autonomous run, overnight execution, agentic execution, or unattended work |
| Extended design mode with concrete spec | Long design conversation has produced a stable spec; verification surface is implicit or named |
| Impatience signals | "Let's just run it," "I want to set this and walk away," "can |