Paths: File paths (references/, ../ln-*) are relative to this skill directory.

Type: L2 Coordinator
Category: 3XX Planning

Multi-Agent Validator

Evaluation-platform coordinator for:

mode=story
mode=plan_review

This skill uses the evaluation platform for:

mandatory official-doc, MCP Ref, Context7, and current-web research
parallel read-only evidence lanes
sequential documentation, repair, merge, refinement, and approval
runtime-backed worker plans, worker summaries, agent sync, and cleanup verification

Inputs

Input	Required	Source	Description
`storyId`	`mode=story`	args, git branch, kanban, user	Story to validate
`plan {file}`	`mode=plan_review`	args or auto	Plan file to validate

Mode detection:

plan or plan {file} -> mode=plan_review
otherwise -> mode=story
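
The mode rule above can be sketched as a small function. This is only an illustration: the name `detect_mode` is hypothetical, and it assumes the arguments arrive as one raw string that is split on whitespace.

```python
def detect_mode(args: str) -> str:
    """Return "plan_review" for `plan` or `plan {file}`, otherwise "story".

    Assumption: args is the raw argument string passed to the skill.
    """
    tokens = args.strip().split()
    # Only an exact leading `plan` token selects plan review;
    # anything else (including empty args) falls back to story mode.
    if tokens and tokens[0] == "plan":
        return "plan_review"
    return "story"
```

Under these assumptions, `detect_mode("plan docs/plan.md")` selects plan review, while a story ID or empty arguments fall through to story mode.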

Mandatory Read

MANDATORY READ: Load references/environment_state_contract.md, references/storage_mode_detection.md, references/input_resolution_pattern.md
MANDATORY READ: Load references/evaluation_coordinator_runtime_contract.md, references/evaluation_summary_contract.md, references/evaluation_parallelism_policy.md, references/evaluation_research_contract.md
MANDATORY READ: Load references/agent_delegation_pattern.md
MANDATORY READ: Load references/penalty_points.md
MANDATORY READ: Load references/researchgraph_mcp_usage.md when researchgraph files changed or the target claims hypothesis, goal, benchmark, or proposal readiness.
Conditional read: load references/phase2_research_audit.md only when the coordinator performs inline criteria mapping instead of consuming ln-312 findings summaries.

Agent review policy: run health check, record skipped reason when no advisor is available, verify every advisor claim before merge, and treat transport/auth/tool failures as operator evidence rather than domain findings. Load references/agent_review_workflow.md only when debugging lifecycle/liveness details outside the evaluation runtime.

Worker Set

The coordinator uses these evaluation workers:

ln-311-review-research-worker
ln-312-review-findings-worker
ln-313-review-docs-worker
ln-314-review-repair-worker
ln-315-review-merge-worker
ln-316-review-refinement-worker

Worker Invocation (MANDATORY)

Host Skill Invocation: Skill(skill: "...", args: "...") is mandatory delegation.

Claude: call the Skill tool exactly as shown.
Codex: if no Skill tool exists, locate the named skill in available skills, read its SKILL.md, treat args as $ARGUMENTS, execute that skill workflow, then return here with its result/artifact.
Do not inline worker logic or mark the worker complete without executing the target skill.

Use the Skill tool for delegated workers. Do not inline worker logic inside the coordinator.

TodoWrite format (mandatory):

Resolve target and build runtime manifest
Load target artifacts and metadata
Launch external agents and verify health
Run research and findings workers in parallel
Generate documentation updates
Apply accepted low-risk repairs
Sync agents and merge all evidence
Run refinement (MANDATORY in ALL modes when advisor available — do NOT skip)
Compute verdict and write review output
Verify runtime cleanup and self-check

Representative invocations:

Skill(skill: "ln-311-review-research-worker", args: "{identifier} research")
Skill(skill: "ln-312-review-findings-worker", args: "{identifier} findings")
Skill(skill: "ln-313-review-docs-worker", args: "{identifier} docs")
Skill(skill: "ln-314-review-repair-worker", args: "{identifier} repair")
Skill(skill: "ln-315-review-merge-worker", args: "{identifier} merge")
Skill(skill: "ln-316-review-refinement-worker", args: "{identifier} refinement")
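
The invocations above share one pattern: the args string is the identifier followed by a lane keyword. A minimal sketch of that pattern, assuming a hypothetical helper `skill_call` that only builds the call payload and does not execute anything:

```python
# Lane keyword per worker, copied from the representative invocations.
WORKER_LANES = {
    "ln-311-review-research-worker": "research",
    "ln-312-review-findings-worker": "findings",
    "ln-313-review-docs-worker": "docs",
    "ln-314-review-repair-worker": "repair",
    "ln-315-review-merge-worker": "merge",
    "ln-316-review-refinement-worker": "refinement",
}


def skill_call(worker: str, identifier: str) -> dict:
    """Build the skill/args pair for one delegated worker (hypothetical helper)."""
    lane = WORKER_LANES[worker]  # KeyError for a worker outside the worker set
    return {"skill": worker, "args": f"{identifier} {lane}"}
```

The returned pair still has to be passed to the Skill tool; building it is not a substitute for executing the worker.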

Runtime Contract

MANDATORY READ: Load references/loop_health_contract.md

Runtime family:

evaluation-runtime

Identifier:

story-{storyId} for story mode
plan-{slug} for plan review
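
As an illustration, the two identifier forms can be produced by one helper. The function name is hypothetical, and the sketch assumes the plan slug has already been derived elsewhere, since the slug rule is not specified here.

```python
def build_identifier(mode: str, target: str) -> str:
    """Return the runtime identifier for a resolved target.

    Assumption: target is the storyId in story mode and a ready-made
    slug in plan review mode.
    """
    if mode == "story":
        return f"story-{target}"
    if mode == "plan_review":
        return f"plan-{target}"
    raise ValueError(f"unknown mode: {mode}")
```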

Phase order:

PHASE_0_CONFIG
PHASE_1_DISCOVERY
PHASE_2_AGENT_LAUNCH
PHASE_3_EVIDENCE_LANES
PHASE_4_DOCS
PHASE_5_REPAIR
PHASE_6_MERGE
PHASE_7_REFINEMENT
PHASE_8_APPROVAL
PHASE_9_SELF_CHECK

Phase policy:

delegate_phases = [PHASE_3_EVIDENCE_LANES, PHASE_4_DOCS, PHASE_5_REPAIR, PHASE_6_MERGE, PHASE_7_REFINEMENT]
aggregate_phase = PHASE_6_MERGE
report_phase = PHASE_8_APPROVAL
cleanup_phase = PHASE_9_SELF_CHECK
self_check_phase = PHASE_9_SELF_CHECK
agent_resolve_before = [PHASE_6_MERGE]
required_phases_when_advisor_available = [PHASE_7_REFINEMENT]
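
One way to keep the phase order and phase policy consistent is to hold both as data and check that every phase named in the policy exists in the order. The sketch below is an assumption about representation only; the real manifest schema is defined by the runtime contract, and `policy_errors` is a hypothetical helper.

```python
PHASE_ORDER = [
    "PHASE_0_CONFIG", "PHASE_1_DISCOVERY", "PHASE_2_AGENT_LAUNCH",
    "PHASE_3_EVIDENCE_LANES", "PHASE_4_DOCS", "PHASE_5_REPAIR",
    "PHASE_6_MERGE", "PHASE_7_REFINEMENT", "PHASE_8_APPROVAL",
    "PHASE_9_SELF_CHECK",
]

PHASE_POLICY = {
    "delegate_phases": [
        "PHASE_3_EVIDENCE_LANES", "PHASE_4_DOCS", "PHASE_5_REPAIR",
        "PHASE_6_MERGE", "PHASE_7_REFINEMENT",
    ],
    "aggregate_phase": "PHASE_6_MERGE",
    "report_phase": "PHASE_8_APPROVAL",
    "cleanup_phase": "PHASE_9_SELF_CHECK",
    "self_check_phase": "PHASE_9_SELF_CHECK",
    "agent_resolve_before": ["PHASE_6_MERGE"],
    "required_phases_when_advisor_available": ["PHASE_7_REFINEMENT"],
}


def policy_errors(order: list, policy: dict) -> list:
    """List every policy entry that names a phase missing from the order."""
    known = set(order)
    errors = []
    for key, value in policy.items():
        # Policy values are either one phase name or a list of phase names.
        names = value if isinstance(value, list) else [value]
        for name in names:
            if name not in known:
                errors.append(f"{key}: unknown phase {name}")
    return errors
```

An empty result from `policy_errors(PHASE_ORDER, PHASE_POLICY)` means the policy only refers to phases that exist in the order.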

Parallelism Rules

Allowed overlap:

external agents
ln-311
ln-312
local repo inspection and evidence gathering

Sequential only:

ln-313
ln-314
ln-315
ln-316
approval and status mutation
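
The split between overlapping and sequential workers can be sketched with a thread pool. This is a simplified model, not the coordinator's implementation: `run_worker` is a hypothetical stand-in for a Skill invocation, and external agents and local inspection, which may also overlap, are left out.

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_WORKERS = ["ln-311", "ln-312"]
SEQUENTIAL_WORKERS = ["ln-313", "ln-314", "ln-315", "ln-316"]


def run_workers(run_worker):
    """Run the evidence lanes together, then the remaining workers in order.

    run_worker(name) -> summary is a hypothetical callable.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=len(PARALLEL_WORKERS)) as pool:
        futures = {name: pool.submit(run_worker, name) for name in PARALLEL_WORKERS}
        # Barrier: both lanes must finish before any sequential worker starts.
        for name, future in futures.items():
            results[name] = future.result()
    for name in SEQUENTIAL_WORKERS:
        results[name] = run_worker(name)
    return results
```

Calling `future.result()` on every lane before the sequential loop is what enforces the barrier; an exception in either lane propagates instead of being silently dropped.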

Workflow

Phase 0: Config

Resolve mode, identifier, and storage mode.
Resolve story or plan target.
Build evaluation runtime manifest with:
- expected_agents
- required_research=true
- exact phase_order
- phase_policy
- report path
Start runtime:

node references/scripts/evaluation-runtime/cli.mjs start \
  --skill ln-310 \
  --identifier {identifier} \
  --manifest-file .hex-skills/evaluation/{identifier}_manifest.json

Checkpoint Phase 0.
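
A manifest with the fields listed in Phase 0 might be assembled as follows. The key names (in particular `report_path`) and both helper names are assumptions for illustration; the authoritative schema lives in the runtime contract references. Only the manifest file location is taken from the start command.

```python
import json
from pathlib import Path


def build_manifest(expected_agents, phase_order, phase_policy, report_path):
    """Assemble the Phase 0 manifest fields (key names are assumed)."""
    return {
        "expected_agents": list(expected_agents),
        "required_research": True,  # research is mandatory in every mode
        "phase_order": list(phase_order),
        "phase_policy": dict(phase_policy),
        "report_path": str(report_path),
    }


def manifest_path(root: Path, identifier: str) -> Path:
    """Location passed to --manifest-file, relative to root."""
    return root / ".hex-skills" / "evaluation" / f"{identifier}_manifest.json"


def write_manifest(root: Path, identifier: str, manifest: dict) -> Path:
    """Write the manifest as JSON and return the file path."""
    path = manifest_path(root, identifier)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return path
```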

Phase 1: Discovery

Materialize the exact target artifact.
Load only the metadata needed for the current mode.
In mode=story, resolve Story and child tasks.
In mode=plan_review, resolve the plan file.
If researchgraph files changed or the target cites H##, G##, run IDs, benchmark manifests, or readiness claims, run read-only researchgraph verification/audits and attach the result as validation evidence.
Checkpoint Phase 1 with resolved refs.

Phase 2: Agent Launch

Run agent health check.
Exclude disabled agents from .hex-skills/environment_state.json.
If no agents are available:
- record agents_skipped_reason
- checkpoint Phase 2
- continue
Otherwise:
- build per-agent prompts
- launch each available agent
- register each launched agent:

node references/scripts/evaluation-runtime/cli.mjs register-agent \
  --skill ln-310 \
  --identifier {identifier} \
  --agent {name} \
  --prompt-file {promptPath} \
  --result-file {resultPath} \
  --metadata-file {metadataPath}

Checkpoint Phase 2 with health_check_done, agents_available, agents_required, and optional agents_skipped_reason.
Classify each external agent result before reaching a domain verdict:
- rate_limited, tool_missing, auth_missing, permission_denial, and asked_question are transport/operator states.
- Do not convert them into NO-GO without domain evidence from artifacts or findings.
- Record loop health for repeated advisor/session failures and pause when retry usefulness is exhausted.
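
The classification rule can be sketched as a partition of agent results into operator evidence and candidates for domain review. The agent names and the `completed` status below are hypothetical; only the five transport/operator states come from the list above.

```python
# States that describe the transport or operator setup, not the target.
TRANSPORT_STATES = {
    "rate_limited",
    "tool_missing",
    "auth_missing",
    "permission_denial",
    "asked_question",
}


def partition_results(results: dict) -> tuple:
    """Split {agent: status} into (operator_evidence, domain_candidates)."""
    operator, domain = {}, {}
    for agent, status in results.items():
        bucket = operator if status in TRANSPORT_STATES else domain
        bucket[agent] = status
    return operator, domain
```

Entries in the operator bucket never feed a NO-GO on their own; they are recorded as operator evidence, and any verdict still needs domain evidence from artifacts or findings.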

Phase 3: Evidence Lanes

This phase is the mandatory parallel evidence barrier.

Build worker_plan with:
- ln-311 lane research (mandatory)
- ln-312 lane findings (mandatory)
Launch all planned workers in parallel.
While those workers run, continue local repo inspection and collect additional evidence.
Sync agents opportunistically, but do not block on them until merge.
Record each worker summary with:

node references/scripts/evaluation-runtime/cli.mjs record-worker-result \
  --skill ln-310 \
  --identifier {identifier} \
  --payload-file {childSummaryArtifactPath}

Research is mandatory in every mode:

official documentation or standards
MCP Ref
Context7 when a library or framework is involved
current web best-practice research
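
The research checklist has one conditional entry, which a short sketch makes explicit. The function name and the boolean flag are hypothetical; how the coordinator decides that a library or framework is involved is not specified here.

```python
def required_research_sources(involves_library: bool) -> list:
    """Return the research sources that must be covered for a target."""
    sources = ["official documentation or standards", "MCP Ref"]
    if involves_library:
        # Context7 applies only when a library or framework is involved.
        sources.append("Context7")
    sources.append("current web best-practice research")
    return sources
```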

For mode=story, findings must still produce penalty-point evidence and coverage analysis.

Phase 4: Docs

In mode=story, run ln-313-review-docs-worker when documentatio
