Review Security — White-Box Security Audit
Orchestrates a comprehensive security assessment of the project's source code using both defensive and offensive analysis. A blue-teamer and a lead red-teamer run in parallel, in isolation — neither sees the other's output during the first pass. The orchestrator then synthesizes their territories into a unified target list with four prescriptive categories, surfacing what each team alone would have missed. Focused red-teamers deep-dive each target. Findings are synthesized, exploit chains are explored, and the process iterates until no new chains emerge.
This is deliberately heavy. Thoroughness is the priority, not speed. A complete audit may spawn many agents and take significant time. That's the point — shallow security reviews miss the vulnerabilities that matter.
Advisory only. The skill produces findings and proposes tickets; it does not implement remediation. The cognitive seam between "find vulnerability" and "fix vulnerability" is wide enough that mixing them under one workflow degrades both — security findings require fresh threat-model reasoning to remediate correctly, and the discovery agents shouldn't be biased toward findings they could easily fix. Tickets capture findings durably across that seam and compose with /implement and /implement-project for remediation.
The parallel-isolated first pass is the load-bearing discipline of this skill. The previous design ran blue first, then red informed by blue. That design embedded an anchoring failure mode: whatever the blue team flagged as "the defensive territory" became the salient territory for the red team. Real attackers don't get a defensive briefing — they look at the system fresh and find what defenders missed. Independent reconnaissance surfaces the territory the old design suppressed.
Workflow Overview
┌──────────────────────────────────────────────────────────┐
│                      AUDIT WORKFLOW                      │
├──────────────────────────────────────────────────────────┤
│  1. Determine scope                                      │
│  2. Independent first pass (parallel, isolated)          │
│     ├─ Blue-teamer (defense evaluation)                  │
│     │    └─ Output: control inventory + gaps + depth     │
│     └─ Lead red-teamer (reconnaissance)                  │
│          ├─ Input: scope only (no blue-team output)      │
│          └─ Output: attack surface + target list         │
│  3. Reconnaissance synthesis                             │
│     ├─ Categorize: anchoring-suppressed,                 │
│     │    convergent, blue-flagged-unverified,            │
│     │    divergent                                       │
│     └─ Output: unified target list (≤25)                 │
│  4. For each target on unified list:                     │
│     └─ Spawn focused red-teamer (deep investigation)     │
│          └─ Includes blue-team context iff target        │
│             origin includes blue-team data               │
│  5. Findings synthesis + chain analysis                  │
│     ├─ If exploit chains found → goto 4 (new vector)     │
│     └─ If no new chains → proceed                        │
│  6. Present consolidated findings to user                │
│  7. Cut tickets (proposed structure, operator-approved)  │
└──────────────────────────────────────────────────────────┘
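Steps 4 and 5 of the workflow above form a loop that ends when chain analysis proposes no new vectors. The following is a minimal sketch of that control flow, not the skill's actual implementation. The `Target` type, the `investigate` and `find_chains` callables, and the `max_rounds` safety cap are all hypothetical names introduced for illustration; only the "blue-team context iff origin includes blue-team data" rule and the "iterate until no new chains" rule come from the workflow itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical target record; `origin` is a subset of {"blue", "red"}."""
    name: str
    origin: frozenset


def needs_blue_context(target):
    # Step 4 rule: pass blue-team context iff the target's origin
    # includes blue-team data.
    return "blue" in target.origin


def run_audit_loop(initial_targets, investigate, find_chains, max_rounds=10):
    """Repeat steps 4-5 until chain analysis yields no unseen targets.

    `investigate(target, with_blue_context)` stands in for a focused
    red-teamer; `find_chains(findings)` stands in for chain analysis and
    returns candidate follow-up targets. `max_rounds` is an assumed guard
    that the workflow text does not specify.
    """
    findings = []
    seen = set()
    queue = list(initial_targets)
    rounds = 0
    while queue and rounds < max_rounds:
        rounds += 1
        for target in queue:
            seen.add(target.name)
            findings.append(investigate(target, needs_blue_context(target)))
        # Step 5: keep only vectors that have not been investigated yet.
        new_targets = {}
        for candidate in find_chains(findings):
            if candidate.name not in seen:
                new_targets.setdefault(candidate.name, candidate)
        queue = list(new_targets.values())
    return findings, rounds
```

Tracking investigated targets by name is what lets the loop terminate: a chain that points back at an already-investigated vector does not re-enter the queue.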
Workflow Details
1. Determine Scope
Default: Production code only. The following are excluded by default:
- Test code (test files, test fixtures, test helpers)
- Dev-only dependencies and tooling (build tools, linters, bundler configs)
- Generated code, vendored code
Inform the user of these exclusions when presenting the scope. If the user wants to include any of them, respect that.
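The default exclusions above could be approximated with a simple path filter. This is only a sketch: the directory names and file suffixes below are assumptions chosen for illustration, not conventions this skill defines, and dev-only tooling such as bundler configs would need project-specific rules that are not shown. `in_scope` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

# Assumed conventions for test, vendored, and generated code; adjust per project.
DEFAULT_EXCLUDED_DIRS = frozenset(
    {"test", "tests", "__tests__", "fixtures", "vendor", "node_modules", "generated"}
)
DEFAULT_EXCLUDED_SUFFIXES = ("_test.py", "_test.go", ".test.ts", ".spec.ts")


def in_scope(path,
             excluded_dirs=DEFAULT_EXCLUDED_DIRS,
             excluded_suffixes=DEFAULT_EXCLUDED_SUFFIXES):
    """Return True if `path` counts as production code under the given exclusions."""
    p = PurePosixPath(path)
    # Exclude anything that lives under an excluded directory.
    if any(part in excluded_dirs for part in p.parts[:-1]):
        return False
    # Exclude files whose names follow an excluded naming pattern.
    return not p.name.endswith(tuple(excluded_suffixes))
```

If the user opts back in to an excluded category, the caller can pass a reduced `excluded_dirs` or `excluded_suffixes` rather than changing the defaults.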
If the user specifies a scope: Respect it (directory, files, module, feature area). Pass the scope to all spawned agents.
Ask the user:
- "What is the scope of the audit?" (entire codebase, specific module, specific feature)
- "Is there anything you're particularly concerned about?" (auth, file handling, a recent change, etc.)
- "Are there any areas I should skip beyond the defaults?" (additional exclusions)
User concerns inform the prioritization of vectors in later steps, but the blue-teamer and lead red-teamer still perform full analysis: user intuition supplements systematic analysis and does not replace it.
2. Independent First Pass — Parallel and Isolated
Spawn the blue-teamer and lead red-teamer in parallel. Neither agent sees the other's output during this phase. This is the load-bearing discipline of the skill: independent reconnaissance prevents the blue team's defensive map from anchoring the red team's attack planning, and surfaces the territory each side alone would miss.
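One way to picture the isolation requirement is that both agents start concurrently and each brief is built from the scope and user concerns alone, so neither brief can carry the other team's output. The sketch below assumes a hypothetical async `spawn_agent(agent_name, brief)` callable supplied by the orchestrator; the brief's field names are likewise illustrative.

```python
import asyncio


async def run_first_pass(scope, concerns, spawn_agent):
    """Run the blue-teamer and lead red-teamer in parallel, in isolation.

    `spawn_agent` is a hypothetical async callable that starts an agent
    and returns its report.
    """
    # Each brief contains only the scope and user concerns.
    brief = {"scope": scope, "user_concerns": concerns}
    blue_report, red_report = await asyncio.gather(
        spawn_agent("sec-blue-teamer", dict(brief)),  # separate copy per agent
        spawn_agent("sec-red-teamer", dict(brief)),
    )
    # The reports meet for the first time here, in the orchestrator.
    return blue_report, red_report
```

Because the reports are only combined after both agents return, reconnaissance synthesis is the first point at which blue-team data can influence red-team work.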
2a. Blue-Teamer — Defense Evaluation
Spawn a sec-blue-teamer agent for full defense evaluation:
You are the blue-teamer for a white-box security audit. You are running in
parallel with the lead red-teamer; you will not see their output during
this phase. Perform your evaluation from defenders' first principles.
Scope: [entire codebase | user-specified scope]
User concerns: [any areas of concern mentioned by user, or "none specified"]
Perform your full methodology:
1. Inventory security controls — map every defense that exists (auth, authz,
input validation, CSRF, headers, rate limiting, crypto, secrets, logging)
2. Evaluate each control — correctness, consistency, failure mode
3. Identify missing controls — what should exist but doesn't, given the
application type?
4. Assess defense-in-depth — where does security rely on a single control?
5. Review configuration — are security features properly configured?
6. Dependency hygiene — run available tooling, check for CVEs and supply chain
concerns
7. Secrets and credentials — check for secrets in the wrong places
Pay special attention to CONSISTENCY. Gaps where a control exists but isn't
applied universally are the highest-value defensive findings.
Output your full report in your standard format.
2b. Lead Red-Teamer — Independent Reconnaissance
Spawn a sec-red-teamer agent in broad recon mode. Do not pass any blue-team output. The lead red-teamer in this phase performs reconnaissance with no defensive briefing — pure attacker perspective on the codebase.
You are the lead red-teamer for a white-box security audit. You are running
in parallel with the blue-teamer; you will not see their output during this
phase. Perform reconnaissance from attackers' first principles — fresh eyes
on the codebase, no defensive briefing.
Scope: [entire codebase | user-specified scope]
User concerns: [any areas of concern mentioned by user, or "none specified"]
Perform phases 1–3 of your methodology:
1. Reconnaissance — map the full attack surface (every entry point, what it
accepts, who can reach it).
2. Data flow tracing — for each entry point, trace input to its final
destination.
3. Trust boundary mapping — identify where trust transitions occur, including
implicit / unguarded ones.
Do NOT perform deep exploitation yet. Your job is to survey the landscape and
produce a prioritized target list.
Output a structured report:
## ATTACK SURFACE
[Entry points discovered, ranked by exposure]
## TRUST BOUNDARIES
[Trust boundaries identified, noting implicit/unguarded ones]
## TARGET LIST
For each promising attack vector, provide:
- Target: [entry point or code path]
- Files: [specific files and line ranges to focus on]
- Hypothesis: [what you think might be exploitable and why]
- Context: [relevant framework protections, validation observed, transformations]
- Priority: [CRITICAL / HIGH / MEDIUM / LOW]
- Investigation approach: [what the focused red-teamer should try]
Rank targets by a combination of exposure (how easy to reach) and potential
impact (how bad if exploited). Include up to 25 targets — but this is a
MAXIMUM, not a quota. Report only targets that genuinely warrant
investigation. A short list is fine. An empty list means the codebase is
w