CI/CD Quality & Debugging Loop (Loop 3)
Purpose: Continuous integration with automated failure recovery and authentic quality validation.
SOP Workflow: Specification → Research → Planning → Execution → Knowledge
Output: 100% test success rate with authentic quality improvements and failure pattern analysis
Integration: This is Loop 3 of 3. Receives from parallel-swarm-implementation (Loop 2), feeds failure data back to research-driven-planning (Loop 1).
Version: 2.0.0
Optimization: Evidence-based prompting with explicit agent SOPs
When to Use This Skill
Activate this skill when you:
- Have a complete implementation from Loop 2 (parallel-swarm-implementation)
- Need CI/CD pipeline automation with intelligent recovery
- Require root cause analysis for test failures
- Want automated repair with connascence-aware fixes
- Need validation of authentic quality (no theater)
- Are generating failure patterns for Loop 1 feedback
DO NOT use this skill for:
- Initial development (use Loop 2 first)
- Manual debugging without CI/CD integration
- Quality checks during development (use Loop 2 theater detection)
Input/Output Contracts
Input Requirements
input:
  loop2_delivery_package:
    location: .claude/.artifacts/loop2-delivery-package.json
    schema:
      implementation: object (complete codebase)
      tests: object (test suite)
      theater_baseline: object (theater metrics from Loop 2)
      integration_points: array[string]
    validation:
      - Must exist and be valid JSON
      - Must include theater_baseline for differential analysis
  ci_cd_failures:
    source: GitHub Actions workflow runs
    format: JSON array of failure objects
    required_fields: [file, line, column, testName, errorMessage, runId]
  github_credentials:
    required: gh CLI authenticated
    check: gh auth status
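The validation rules for the Loop 2 package can be checked programmatically before any other step runs. The following is a minimal sketch, not part of the skill itself: the function names `validateDeliveryPackage` and `loadDeliveryPackage` are hypothetical, and the checks only cover the four schema keys listed in the contract above.

```javascript
// Hypothetical helper: checks a parsed Loop 2 delivery package against
// the input contract. Returns a list of problems (empty means valid).
function validateDeliveryPackage(pkg) {
  if (pkg === null || typeof pkg !== 'object' || Array.isArray(pkg)) {
    return ['package must be a JSON object'];
  }
  const problems = [];
  // These three keys are declared as objects in the contract's schema.
  for (const key of ['implementation', 'tests', 'theater_baseline']) {
    const value = pkg[key];
    if (value === null || typeof value !== 'object' || Array.isArray(value)) {
      problems.push(`${key} must be an object`);
    }
  }
  const points = pkg.integration_points;
  if (!Array.isArray(points) || !points.every(p => typeof p === 'string')) {
    problems.push('integration_points must be an array of strings');
  }
  return problems;
}

// Reads the package from disk; JSON.parse throws if the file is not valid JSON.
function loadDeliveryPackage(path) {
  const fs = require('fs');
  const pkg = JSON.parse(fs.readFileSync(path, 'utf8'));
  const problems = validateDeliveryPackage(pkg);
  if (problems.length > 0) {
    throw new Error(`Invalid Loop 2 package: ${problems.join('; ')}`);
  }
  return pkg;
}
```

A caller would typically run `loadDeliveryPackage('.claude/.artifacts/loop2-delivery-package.json')` once and abort Loop 3 if it throws, since differential analysis cannot proceed without `theater_baseline`.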
Output Guarantees
output:
  test_success_rate: 100% (guaranteed)
  quality_validation:
    theater_audit: PASSED (no false improvements)
    sandbox_validation: 100% test pass
    differential_analysis: improvement metrics
  failure_patterns:
    location: .claude/.artifacts/loop3-failure-patterns.json
    feeds_to: Loop 1 (next iteration)
    schema:
      patterns: array[failure_pattern]
      recommendations: object (planning/architecture/testing)
  delivery_package:
    location: .claude/.artifacts/loop3-delivery-package.json
    contains:
      - quality metrics (test success, failures fixed)
      - analysis data (root causes, connascence context)
      - validation results (theater, sandbox, differential)
      - feedback for Loop 1
Prerequisites
Before starting Loop 3, ensure Loop 2 completion:
# Verify Loop 2 delivery package exists
test -f .claude/.artifacts/loop2-delivery-package.json && echo "✅ Ready" || echo "❌ Run parallel-swarm-implementation first"
# Load implementation data
npx claude-flow@alpha memory query "loop2_complete" --namespace "integration/loop2-to-loop3"
# Verify GitHub CLI authenticated
gh auth status || gh auth login
8-Step CI/CD Process Overview
Step 1: GitHub Hook Integration (Download CI/CD failure reports)
↓
Step 2: AI-Powered Analysis (Gemini + 7-agent synthesis with Byzantine consensus)
↓
Step 3: Root Cause Detection (Graph analysis + Raft consensus)
↓
Step 4: Intelligent Fixes (Program-of-thought: Plan → Execute → Validate → Approve)
↓
Step 5: Theater Detection Audit (6-agent Byzantine consensus validation)
↓
Step 6: Sandbox Validation (Isolated production-like testing)
↓
Step 7: Differential Analysis (Compare to baseline with metrics)
↓
Step 8: GitHub Feedback (Automated reporting and loop closure)
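The eight steps form a strictly sequential pipeline in which each step consumes the previous step's output. One way to picture that control flow is the sketch below. The step names come from the overview; the `runPipeline` function and the handler array are hypothetical and stand in for whatever actually performs each step.

```javascript
// Step names from the overview above; the handlers passed in are hypothetical.
const STEP_NAMES = [
  'GitHub Hook Integration',
  'AI-Powered Analysis',
  'Root Cause Detection',
  'Intelligent Fixes',
  'Theater Detection Audit',
  'Sandbox Validation',
  'Differential Analysis',
  'GitHub Feedback',
];

// Runs one handler per step, in order, passing each result to the next step.
// Stops at the first handler that throws and reports which step failed.
async function runPipeline(handlers, initialInput) {
  let data = initialInput;
  const completed = [];
  for (let i = 0; i < STEP_NAMES.length; i++) {
    const name = STEP_NAMES[i];
    try {
      data = await handlers[i](data);
      completed.push(name);
    } catch (err) {
      return { ok: false, failedStep: name, completed, error: err.message };
    }
  }
  return { ok: true, completed, result: data };
}
```

Returning the list of completed steps alongside the failed one makes it possible to resume or report without re-running earlier steps.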
Step 1: GitHub Hook Integration
Objective: Download and process CI/CD pipeline failure reports from GitHub Actions.
Agent Coordination: Single orchestrator agent manages data collection.
Configure GitHub Hooks
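# Install GitHub CLI if needed (Homebrew shown; use your platform's package manager)
command -v gh >/dev/null || brew install gh
# Authenticate
gh auth login
# Register a repository webhook. GitHub must be able to reach config[url],
# so a localhost address only works behind a tunnel or similar forwarding setup.
gh api repos/{owner}/{repo}/hooks \
  -X POST \
  -f name='web' \
  -F active=true \
  -f 'events[]=check_run' \
  -f 'events[]=workflow_run' \
  -f 'config[url]=http://localhost:3000/hooks/github' \
  -f 'config[content_type]=json'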
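The webhook above delivers events to `/hooks/github` on port 3000, but the listener itself is not shown. The sketch below is one possible receiver, under stated assumptions: `failedRunId` and `createHookServer` are hypothetical names, the `onFailure` callback is a placeholder, and signature verification is omitted because no webhook secret is configured above (a real deployment should set one and verify the `X-Hub-Signature-256` header).

```javascript
const http = require('http');

// Returns the run ID when the delivery is a completed, failed workflow run;
// otherwise null. `eventName` is the value of the X-GitHub-Event header.
function failedRunId(eventName, payload) {
  if (eventName !== 'workflow_run') return null;
  if (!payload || payload.action !== 'completed') return null;
  const run = payload.workflow_run;
  if (!run || run.conclusion !== 'failure') return null;
  return run.id;
}

// Builds (but does not start) a server that accepts POST /hooks/github.
// `onFailure` is a hypothetical callback that receives the failed run ID.
function createHookServer(onFailure) {
  return http.createServer((req, res) => {
    if (req.method !== 'POST' || req.url !== '/hooks/github') {
      res.statusCode = 404;
      res.end();
      return;
    }
    let body = '';
    req.on('data', chunk => { body += chunk; });
    req.on('end', () => {
      let payload;
      try {
        payload = JSON.parse(body);
      } catch (err) {
        res.statusCode = 400; // body was not valid JSON
        res.end();
        return;
      }
      const id = failedRunId(req.headers['x-github-event'], payload);
      if (id !== null) onFailure(id);
      res.statusCode = 204;
      res.end();
    });
  });
}
```

Starting it would look like `createHookServer(id => console.log('failed run', id)).listen(3000)`. Keeping the filter in a separate pure function makes it easy to test without opening a socket.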
Download Failure Reports
# Get recent failed workflow runs (written as one JSON object per line)
gh run list --repo {owner}/{repo} --limit 10 --json conclusion,databaseId \
  | jq -c '.[] | select(.conclusion == "failure")' \
  > .claude/.artifacts/failed-runs.json
# Download logs for each failure
jq -r '.databaseId' .claude/.artifacts/failed-runs.json | while read -r RUN_ID; do
  gh run view "$RUN_ID" --repo {owner}/{repo} --log \
    > ".claude/.artifacts/failure-logs-$RUN_ID.txt"
done
Parse Failure Data
node <<'EOF'
const fs = require('fs');
const failures = [];
// Parse all failure logs
const logFiles = fs.readdirSync('.claude/.artifacts')
  .filter(f => f.startsWith('failure-logs-'));
logFiles.forEach(file => {
  const log = fs.readFileSync(`.claude/.artifacts/${file}`, 'utf8');
  // The run ID is encoded in the file name: failure-logs-<runId>.txt
  const runId = file.match(/failure-logs-(\d+)/)[1];
  // Extract structured failure data: a "FAIL file:line:column" line,
  // followed by the test name and then the error message
  const failureMatches = log.matchAll(/FAIL (.+?):(\d+):(\d+)\n(.+?)\n(.+)/g);
  for (const match of failureMatches) {
    failures.push({
      file: match[1],
      line: parseInt(match[2], 10),
      column: parseInt(match[3], 10),
      testName: match[4],
      errorMessage: match[5],
      runId
    });
  }
});
fs.writeFileSync(
  '.claude/.artifacts/parsed-failures.json',
  JSON.stringify(failures, null, 2)
);
console.log(`✅ Parsed ${failures.length} failures`);
EOF
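To make the log shape that the parser expects concrete, the sketch below applies the same regular expression to a made-up excerpt. The sample log, the file names in it, and the `parseFailures` wrapper are illustrative assumptions. Real `gh run view --log` output usually carries job, step, and timestamp prefixes on each line, so the pattern may need adjusting for a given test runner.

```javascript
// Same pattern as the parser above: a "FAIL file:line:column" line,
// then the test name on the next line, then the error message.
const FAILURE_PATTERN = /FAIL (.+?):(\d+):(\d+)\n(.+?)\n(.+)/g;

// Hypothetical wrapper that turns one log string into failure records.
function parseFailures(log, runId) {
  return Array.from(log.matchAll(FAILURE_PATTERN), m => ({
    file: m[1],
    line: parseInt(m[2], 10),
    column: parseInt(m[3], 10),
    testName: m[4],
    errorMessage: m[5],
    runId,
  }));
}

// Made-up log excerpt for illustration only.
const sampleLog = [
  'FAIL src/auth.test.js:12:5',
  'rejects expired tokens',
  'Expected 401 but received 200',
  'PASS src/util.test.js',
].join('\n');

console.log(parseFailures(sampleLog, '1001'));
```

Only the three-line `FAIL` group produces a record; the `PASS` line is ignored because it does not match the pattern.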
Validation Checkpoint:
- ✅ Failure data parsed and structured
- ✅ All required fields present (file, line, column, testName, errorMessage, runId)
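The checkpoint can be automated by comparing each parsed record against the `required_fields` list from the input contract. This is a minimal sketch; `findIncompleteFailures` is a hypothetical name, and treating empty strings and `NaN` as missing is an assumption about what counts as "present".

```javascript
// Field list taken from required_fields in the input contract.
const REQUIRED_FIELDS = ['file', 'line', 'column', 'testName', 'errorMessage', 'runId'];

// Returns one entry per incomplete record: its index and the fields it lacks.
function findIncompleteFailures(failures) {
  const incomplete = [];
  failures.forEach((failure, index) => {
    const missing = REQUIRED_FIELDS.filter(field => {
      const value = failure[field];
      // Assumption: undefined, null, '' and NaN all count as "not present".
      return value === undefined || value === null || value === '' ||
        (typeof value === 'number' && Number.isNaN(value));
    });
    if (missing.length > 0) incomplete.push({ index, missing });
  });
  return incomplete;
}
```

An empty result means the checkpoint passes; otherwise the returned indexes point at the records in `parsed-failures.json` that need attention.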
Step 2: AI-Powered Analysis
Objective: Use Gemini large-context analysis + 7 research agents with Byzantine consensus to examine each failure deeply.
Evidence-Based Techniques: Self-consistency, Byzantine consensus, program-of-thought
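The document does not specify how the 7-agent consensus is computed. As an illustration only, the sketch below applies the classic Byzantine fault tolerance bound (n ≥ 3f + 1) as a simple vote threshold: with 7 agents, up to 2 may be faulty and a conclusion needs 5 matching votes. The function names and the idea of one conclusion string per agent are assumptions, and this is a threshold check, not a full consensus protocol with message rounds.

```javascript
// Largest number of faulty agents tolerated under the n >= 3f + 1 bound.
function maxFaulty(agentCount) {
  return Math.max(0, Math.floor((agentCount - 1) / 3));
}

// Smallest vote count such that any two quorums overlap in at least
// f + 1 agents. Equals 2f + 1 when agentCount is exactly 3f + 1.
function quorumSize(agentCount) {
  return Math.ceil((agentCount + maxFaulty(agentCount) + 1) / 2);
}

// Returns the conclusion backed by a quorum of agents, or null if none is.
// `votes` is a hypothetical array holding one conclusion string per agent.
function quorumDecision(votes) {
  if (votes.length === 0) return null;
  const quorum = quorumSize(votes.length);
  const counts = new Map();
  for (const vote of votes) {
    counts.set(vote, (counts.get(vote) || 0) + 1);
  }
  for (const [conclusion, count] of counts) {
    if (count >= quorum) return conclusion;
  }
  return null;
}
```

Because a quorum is always more than half of the agents, at most one conclusion can reach it, so the result does not depend on iteration order.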
Phase 1: Gemini Large-Context Analysis
Leverage Gemini's 2M token window for full codebase analysis
# Analyze failures with full codebase context
/gemini:impact "Analyze CI/CD test failures:
FAILURE DATA:
$(cat .claude/.artifacts/parsed-failures.json)
CODEBASE CONTEXT:
Full repository (all files)
LOOP 2 IMPLEMENTATION:
$(cat .claude/.artifacts/loop2-delivery-package.json)
ANALYSIS OBJECTIVES:
1. Identify cross-file dependencies related to failures
2. Detect failure cascade patterns (root → secondary → tertiary)
3. Analyze what changed between working and failing states
4. Assess system-level architectural impact
5. Identify connascence patterns in failing code
OUTPUT FORMAT:
{
dependency_graph: { nodes: [files], edges: [dependencies] },
cascade_map: { root_failures: [], cascaded_failures: [] },
change_analysis: { changed_files: [], change_impact: [] },
architectural_impact: { affected_systems: [], coupling_issues: [] }
}"
# Store Gemini analysis (assumes the command above saved its response to gemini-response.json)
cp .claude/.artifacts/gemini-response.json \
  .claude/.artifacts/gemini-analysis.json
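The requested `cascade_map` separates root failures from cascaded ones. One way to derive that split from the `dependency_graph` is sketched below. It rests on assumptions the output format does not state: each edge is taken to be `{ from, to }`, meaning `from` depends on `to`, and a failing file counts as cascaded when any of its transitive dependencies is also failing. The name `classifyFailures` is hypothetical.

```javascript
// Splits failing files into root and cascaded failures.
// Assumption: each edge is { from, to }, meaning `from` depends on `to`.
function classifyFailures(edges, failingFiles) {
  const deps = new Map();
  for (const { from, to } of edges) {
    if (!deps.has(from)) deps.set(from, []);
    deps.get(from).push(to);
  }
  const failing = new Set(failingFiles);

  // True if any transitive dependency of `file` is itself failing.
  function dependsOnFailure(file) {
    const seen = new Set([file]);
    const stack = [...(deps.get(file) || [])];
    while (stack.length > 0) {
      const current = stack.pop();
      if (seen.has(current)) continue; // already visited; also guards cycles
      seen.add(current);
      if (failing.has(current)) return true;
      stack.push(...(deps.get(current) || []));
    }
    return false;
  }

  const result = { root_failures: [], cascaded_failures: [] };
  for (const file of failing) {
    const bucket = dependsOnFailure(file) ? 'cascaded_failures' : 'root_failures';
    result[bucket].push(file);
  }
  return result;
}
```

Under this rule, failing files that sit in a dependency cycle with each other are all reported as cascaded, so cycles may need separate handling.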
Phase 2: Parallel Multi-Agent Deep Dive (Self-Consistency)
7 parallel agents for cross-validation and consensus
// PARALLEL ANALYSIS AGENTS - Evidence-Based Self-Consistency
[Single Message - Spawn All 7 Analysis Agents]:
// Failure Pattern Research (Dual agents for cross-validation)
Task("Failure Pattern Researcher 1",
`Research similar failures in external sources:
- GitHub issues for libraries we use
- Stack Overflow questions with similar error messages
- Documentation of known issues
Failures to research: $(cat .claude/.artifacts/parsed-failures.json | jq -r '.[].errorMessage')
For each failure:
1. Find similar reported issues
2. Document known solutions with evidence (links, code examples)
3. Note confidence level (high/medium/low)
Store findings: .cla