Deep Brainstorming

Research-hardened brainstorming that catches biases agents naturally introduce. Collapses what would otherwise be 3-4 sessions of iterative discovery into one disciplined session.

REQUIRED: Run a standard brainstorming pass first (e.g., superpowers:brainstorming or equivalent). This skill augments brainstorming output — it does not replace it.

When to Use

digraph when {
  "Starting brainstorming?" [shape=diamond];
  "Multiple valid tech choices?" [shape=diamond];
  "Wrong choice costly to reverse?" [shape=diamond];
  "Use regular brainstorming" [shape=box];
  "Use deep brainstorming" [shape=box, style=bold];

  "Starting brainstorming?" -> "Multiple valid tech choices?";
  "Multiple valid tech choices?" -> "Use regular brainstorming" [label="no"];
  "Multiple valid tech choices?" -> "Wrong choice costly to reverse?";
  "Wrong choice costly to reverse?" -> "Use regular brainstorming" [label="no — easy to swap"];
  "Wrong choice costly to reverse?" -> "Use deep brainstorming" [label="yes"];
}

Intensity Selector

Pick intensity based on stakes and time budget. Default to Standard unless stakes clearly call for Quick or Thorough.

Level	Research Rounds	Bias Audit	Review Phases	When
Quick	1 round, 2-3 agents	Spot-check top recommendation	Verify-fix only	Low-stakes, easily reversible
Standard	2 rounds, 3-4 agents each	Full 9-type audit after each round	Verify-fix + adversarial	Default for most architecture decisions
Thorough	3-4 rounds, 4-6 agents each	Full audit + retroactive sweep	All 3 phases + independent re-verification	High-stakes, expensive to reverse, novel domain

Process Overview

Phase	What	Why
1. Sanitize vision	Strip tool/vendor names from requirements	Prevents anchoring bias in research
2. Research rounds	Parallel agents with clean prompts, progressive debiasing	Catches marketing/popularity bias
3. Bias audit	Check each round's output against 9 bias types	Agents over-represent popular tools
4. Independent verify	Check key claims via web/docs yourself	Catches hallucinated benchmarks
5. Claim provenance	Record source, quote, verification method for every cited number	Prevents stat drift across sessions
6. Assemble spec	Progressive file capture, one section at a time	Survives compaction
7. Three-phase review	Verify-fix, adversarial, security/ops	Each type catches different classes of bugs
8. Quality gate	"Is this objectively best?" challenge on final spec	Catches premature satisfaction

Phase 1: Sanitize the Vision

Extract the client brief into a clean requirements document. This is the anchor for all research.

Strip: All tool names, frameworks, architecture patterns, vendor references. Keep: What it does, who uses it, hard constraints (language, platform, compliance). Save as: <project-docs-dir>/<project>-vision.md

Scoping discipline: Only include requirements the client actually stated. If a requirement wasn't in the brief, it doesn't belong in the vision. Agents inject phantom requirements from training data (compliance frameworks, accessibility standards, monitoring stacks). Challenge every requirement: "Did the client ask for this, or did an agent add it?"

Non-Requirements section: Explicitly list what the client did NOT ask for in the vision doc. This is the primary defense against phantom requirements — research agents that recommend solutions to Non-Requirements get flagged immediately.

Phase 2: Research with Clean Prompts

Prompt rules:

Zero vendor/tool names — only functional requirements + vision doc
"What's the best way to achieve Y?" not "Is X good for Y?"
Include: "Verify non-library claims via web search. Do not rely on training data."
Attach the sanitized vision document to every prompt

Agent discipline (mandatory for ALL researcher agents):

Synthesize first: Every researcher prompt MUST include: "SYNTHESIZE FIRST — write your findings report before doing more searches. You can always search more after writing what you know." Without this, agents spend 80%+ of their token budget on searches and never produce output. (Evidence: 60% researcher agent failure rate observed from token exhaustion on web-search-heavy tasks.)
Max 5 questions per agent. Split larger research across parallel agents. One agent with 10 questions will exhaust tokens on the first 5 and never synthesize. Each agent gets 3-5 focused questions.
Structured doc tools for library docs. Prefer structured documentation sources (e.g., Context7 MCP, official API docs) over generic web search for library API verification. Structured docs return focused content at lower token cost than web search chains.
Relaunch failed agents. When a researcher fails to synthesize (returns partial output like "Let me search for more..."), relaunch with a shorter prompt. Doing the research inline burns 5-10x more main context.

Round structure:

Split research by domain (e.g., backend, frontend, data pipeline, infrastructure, NLP)
One agent per domain per round, all running in parallel
Round 1: broad discovery — "What are the top approaches for [functional need]?"
Round 2: verification with fresh prompts that do NOT reference Round 1 findings. Ask the same functional questions differently. Compare convergence.
Round 3+ (Thorough only): targeted investigation of divergences between rounds

Consensus guard: If all agents in a round converge on the same tool, that's a red flag — not validation. Force dissent: "What's the strongest alternative to [consensus pick] and under what conditions would it win?"

Prompt review cycle: Before launching each round, present prompts to the user for review and revision. Agents internalize biased framing invisibly — the user catches phrasing that anchors research. Specific things to check: did any functional description smuggle in a tool name? Did "agentic orchestration" imply a specific architecture? Soften loaded terms to neutral descriptions.

Full reset option (Thorough): For the final research round, discard all prior findings: "All previous research counts as zero. Start from the requirements only." This prevents anchoring to earlier rounds' conclusions and surfaces genuinely different approaches.

Convergence table: After each round, produce a cross-round comparison table showing every decision layer with what each round recommended. Items that converge across rounds get high confidence. Items that diverge get "benchmark to decide" status. This table is the primary decision tool — not any single round's output.

For detailed prompt templates, see references/research-prompts.md.

Phase 3: Bias Audit

Check after EVERY research round. Audit the full output, not just the recommendations.

Bias	Signal	Detection
Marketing	Most-blogged tool recommended as best	Check: is the recommendation backed by benchmarks or by blog post count?
Popularity	Most-downloaded/starred treated as winner	Check: do downloads measure quality or awareness?
License	Defaulting to OSS or to commercial without comparison	Check: was the full landscape evaluated, or just one license type?
Information landscape	Domain keywords trigger unrelated associations	Check: is the recommendation actually relevant to THIS use case?
Training data	Stale versions, deprecated tools, renamed SDKs	Check: verify version numbers and project status against official sources
Vendor benchmark	Performance numbers sourced from the tool's own maker	Check: find independent verification or note as unverified
Hallucinated evidence	Specific benchmark numbers or repo URLs that don't exist

deep-brainstorming

How to add

Drop this on your repo README

Related skills

learn-codebase

remove-deadcode

apify-competitor-intelligence

ad-creative

Get new Marketing skills every Monday