Host: Codex CLI — This skill was designed for Claude Code and adapted for Codex. Cross-reference commands use installed skill names in Codex rather than
/octo:*slash commands. Use the active Codex shell and subagent tools. Do not claim a provider, model, or host subagent is available until the current session exposes it. For host tool equivalents, seeskills/blocks/codex-host-adapter.md.
Execution Contract (MANDATORY - CANNOT SKIP)
This generated Codex skill preserves an enforced workflow contract from the source skill.
PROHIBITED:
- Do not summarize, simulate, or skip the referenced workflow command when this skill requires execution.
- Do not claim provider output or validation artifacts exist without checking the actual files or command output.
- Do not continue silently when a required provider, command, or host capability is unavailable; report the unavailable dependency and use a supported fallback.
Two-Stage Review Pipeline
Separates spec compliance (did you build the right thing?) from code quality (did you build it right?). Stage 1 must pass before Stage 2 runs.
Stage 1: Spec Compliance
Validates the implementation against the intent contract.
Step 1: Load Intent Contract
INTENT_FILE=".claude/session-intent.md"
if [[ -f "$INTENT_FILE" ]]; then
echo "Intent contract found: $INTENT_FILE"
cat "$INTENT_FILE"
else
echo "WARNING: No intent contract found at $INTENT_FILE"
echo "Skipping Stage 1 — proceeding to Stage 2 (code quality) only."
fi
If no intent contract exists: Warn the user and skip to Stage 2. Do NOT fabricate success criteria — the contract must exist from a prior workflow.
Step 2: Validate Success Criteria
For each success criterion in the intent contract:
- Read the criterion from the
## Success Criteriasection - Find evidence in the codebase that the criterion is met
- Mark status:
[PASS]— Evidence confirms criterion is met[FAIL]— Evidence shows criterion is NOT met[PARTIAL]— Partially met, gaps identified
Present results:
## Stage 1: Spec Compliance
### Success Criteria Check
#### Good Enough Criteria
- [PASS] Criterion 1: <how it was met>
- [FAIL] Criterion 2: <why not met, what's missing>
#### Exceptional Criteria
- [PARTIAL] Criterion 1: <what's done, what's remaining>
Step 3: Validate Boundaries
For each boundary in the intent contract:
- Read the boundary from the
## Boundariessection - Check for violations in the implementation
- Mark status:
[RESPECTED]— No violations found[VIOLATED]— Implementation crosses the boundary
### Boundary Check
- [RESPECTED] Boundary 1: <confirmation>
- [VIOLATED] Boundary 2: <what violated it>
Step 4: Stage 1 Gate
| Result | Action |
|---|---|
| All criteria PASS + all boundaries RESPECTED | Proceed to Stage 2 |
| Any criterion FAIL | Report failures. Ask user: fix now or proceed anyway? |
| Any boundary VIOLATED | Report violations. Ask user: fix now or proceed anyway? |
If user chooses to fix: Stop review, list specific fixes needed. If user chooses to proceed: Note the overrides and continue to Stage 2.
Stage 2: Code Quality
Runs stub detection and full code quality review.
Step 1: Stub Detection
Run 5 checks on all changed files:
# Get changed files
if git diff --cached --name-only 2>/dev/null | head -1 > /dev/null; then
changed_files=$(git diff --cached --name-only)
elif git diff --name-only HEAD~1..HEAD 2>/dev/null | head -1 > /dev/null; then
changed_files=$(git diff --name-only HEAD~1..HEAD)
else
changed_files=$(git diff --name-only)
fi
# Filter source files
source_files=$(echo "$changed_files" | grep -E "\.(ts|tsx|js|jsx|py|go|rs|sh)$" || true)
STUB_ISSUES=0
for file in $source_files; do
[[ -f "$file" ]] || continue
# Check 1: TODO/FIXME/PLACEHOLDER markers
todo_count=$(grep -cE "(TODO|FIXME|PLACEHOLDER|XXX)" "$file" 2>/dev/null || echo "0")
if [[ "$todo_count" -gt 0 ]]; then
echo "WARNING: $file has $todo_count TODO/FIXME markers"
STUB_ISSUES=$((STUB_ISSUES + 1))
fi
# Check 2: Empty function bodies
empty_fn=$(grep -cE "function.*\{\s*\}|=>\s*\{\s*\}" "$file" 2>/dev/null || echo "0")
if [[ "$empty_fn" -gt 0 ]]; then
echo "ERROR: $file has $empty_fn empty functions"
STUB_ISSUES=$((STUB_ISSUES + 1))
fi
# Check 3: Suspicious null/undefined returns
null_ret=$(grep -cE "return (null|undefined);" "$file" 2>/dev/null || echo "0")
if [[ "$null_ret" -gt 0 ]]; then
echo "WARNING: $file has $null_ret null/undefined returns — verify intentional"
STUB_ISSUES=$((STUB_ISSUES + 1))
fi
# Check 4: Substantive line count
subst_lines=$(grep -cvE "^\s*(//|/\*|\*|#|import|export|$)" "$file" 2>/dev/null || echo "0")
if [[ "$subst_lines" -lt 5 ]]; then
echo "WARNING: $file has only $subst_lines substantive lines"
STUB_ISSUES=$((STUB_ISSUES + 1))
fi
# Check 5: Mock/test data in production code
mock_count=$(grep -cE "const.*(mock|test|dummy|fake).*=" "$file" 2>/dev/null || echo "0")
if [[ "$mock_count" -gt 0 ]]; then
echo "WARNING: $file has $mock_count mock/test data references"
STUB_ISSUES=$((STUB_ISSUES + 1))
fi
done
echo "Stub detection complete: $STUB_ISSUES issues found"
Step 2: Multi-LLM Quality Review (RECOMMENDED)
After stub detection, dispatch code to multiple providers for parallel quality review. A Claude-only review pipeline misses what external models catch — Codex excels at logic errors and correctness, Gemini excels at security and edge case analysis. Using both produces higher-confidence findings.
Check provider availability and dispatch in parallel:
# Get the diff for review
DIFF_CONTENT=$(git diff --cached 2>/dev/null || git diff HEAD~1..HEAD 2>/dev/null || git diff)
If external providers are available — dispatch focused reviews through Octopus routing:
providers=()
command -v codex >/dev/null 2>&1 && providers+=(codex)
command -v agy >/dev/null 2>&1 && providers+=(agy)
command -v gemini >/dev/null 2>&1 && providers+=(gemini)
for provider in "${providers[@]}"; do
safe_provider=$(printf '%s' "$provider" | tr -c '[:alnum:]_-' '_')
"${HOME}/.claude-octopus/plugin/scripts/orchestrate.sh" spawn "$provider" \
"Review this code diff for LOGIC, CORRECTNESS, and SECURITY issues. Focus on:
1. Logic bugs and off-by-one errors
2. Unhandled error paths
3. Race conditions or concurrency issues
4. Incorrect type handling or implicit coercions
5. Security issues at trust boundaries
Report ONLY high-confidence issues. Do NOT flag style preferences.
DIFF:
${DIFF_CONTENT}" > "/tmp/octopus-review-${safe_provider}.md" 2>/dev/null &
done
Wait for external reviews to complete, then synthesize all findings from Claude plus available external providers into a unified quality assessment. If external providers are unavailable, fall back to the Claude-only review below.
Claude (you) performs the full quality review regardless:
- Architecture alignment — Does the code follow project patterns?
- Error handling — Are errors caught and handled appropriately?
- Security — Any OWASP Top 10 issues?
- Performance — Any obvious bottlenecks or N+1 queries?
- Readability — Clear naming, reasonable complexity?
- Test coverage — Are new behaviors tested?
Synthesize external findings: If external providers returned results, merge their findings with yours. Where providers disagree on severity, note the divergence. Where multiple providers flag the same issue, mark it as high-confidence.
Step 3: Present Stage 2 Results
## Stage 2: Code Quality
### Stub Detection
- Files scanned: N
- Issues found: N
- [Details of each issue]
### Quality Review
- Architecture: [PASS/WARN/FAIL]
- Error Handling: [PASS/WARN/FAIL]
- Security: [PASS/WARN/FAIL]
- Performance: [PASS/WARN/FA