3-Round Brainstorm: Claude x Gemini (v2.1 — Two-Layer Architecture)
Setup (first run only)
Before first brainstorm, verify these prerequisites:
- Find gemini.py: bundled at scripts/gemini.py relative to this SKILL.md. Locate it with:
    find . ~/.claude/skills -name gemini.py -path '*/gemini/*' 2>/dev/null | head -1
  Store the result in the GEMINI variable for the session.
- Check GOOGLE_API_KEY: it must be set. Check with:
    echo $GOOGLE_API_KEY
  If it is empty, look for a .env file and source it, or tell the user to get a key at https://aistudio.google.com
- Check the google-genai package: run
    python3 -c "from google import genai; print('OK')"
  If it is missing, run:
    pip install google-genai
If any check fails, tell the user what's missing and how to fix it. Do NOT proceed without all three.
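The three checks above could be automated roughly as follows. This is a minimal sketch, not part of the bundled skill: the names find_gemini_script, has_google_genai, and missing_prerequisites are hypothetical, and the default search roots simply mirror the find command above.

```python
import importlib.util
from pathlib import Path


def find_gemini_script(roots=(".", "~/.claude/skills")):
    """Return the first gemini.py located below a directory named 'gemini', else None."""
    for root in roots:
        base = Path(root).expanduser()
        if not base.is_dir():
            continue
        for candidate in sorted(base.rglob("gemini.py")):
            # Approximates the -path '*/gemini/*' filter of the find command.
            if "gemini" in candidate.parent.parts:
                return candidate
    return None


def has_google_genai():
    """Return True if 'from google import genai' would find the package."""
    try:
        return importlib.util.find_spec("google.genai") is not None
    except ModuleNotFoundError:
        # Raised when the parent 'google' package itself is absent.
        return False


def missing_prerequisites(script, env, genai_available):
    """Return one human-readable message per failed check (empty list means all good)."""
    problems = []
    if script is None:
        problems.append("gemini.py not found: expected at scripts/gemini.py next to SKILL.md")
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set: get a key at https://aistudio.google.com")
    if not genai_available:
        problems.append("google-genai is missing: run pip install google-genai")
    return problems
```

A caller would pass find_gemini_script(), os.environ, and has_google_genai() to missing_prerequisites and stop if the returned list is non-empty, reporting each message to the user.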
Two-layer model architecture: Flash researches, Pro reasons. Claude orchestrates both.
               Claude (orchestrator)
            |                         |
  Phase 0.5 |                         | R1 / R2 / R3
  Phase 3.5 |                         |
            v                         v
  Flash-Lite --grounded      Pro (no grounding)
    (research layer)          (reasoning layer)
            |                         ^
            +----Verified Context-----+
Flash-Lite (gemini-3.1-flash-lite-preview, $0.25/$1.50 per 1M in/out) searches the web and produces verified facts.
Pro (gemini-3.1-pro-preview, $2/$12 per 1M in/out) reasons on those facts without wasting tokens on search.
Claude runs its own WebSearch in parallel during Phase 0.5 and Phase 3.5.
Version History
- v1: 3 rounds, no web access. Result: 2/6 decisions invalidated by stale knowledge (A488).
- v2: Added --grounded to all Pro calls. Problem: Pro wastes thinking tokens on search processing.
- v2.1: Two-layer split. Flash does research, Pro focuses on reasoning. Faster, cheaper, better quality.
Why two layers beat one
In v2, Pro with --grounded does both searching AND thinking. But search processing is low-value work — it doesn't need deep reasoning to check "what version is Next.js?" Pro's thinking tokens are expensive and should be spent on challenging ideas, finding blind spots, and synthesizing arguments — not on parsing search results.
Splitting the layers means:
- Flash-Lite (grounded): $0.25/$1.50 per 1M in/out, 3-5 sec per query, returns verified facts
- Pro (ungrounded): all thinking tokens go to critical analysis, gets pre-verified context
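To make the price gap concrete, here is a small worked calculation using the per-1M-token prices listed in this document. The token counts are hypothetical, and the sketch ignores any separate grounding fees or thinking-token accounting, so treat it as an illustration of the arithmetic only.

```python
# Prices in USD per 1M tokens (input, output), as listed in this document.
PRICES = {
    "gemini-3.1-flash-lite-preview": (0.25, 1.50),
    "gemini-3.1-pro-preview": (2.00, 12.00),
}


def call_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one call at the listed per-1M-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical research call: 20,000 tokens in, 2,000 tokens of facts out.
flash_cost = call_cost("gemini-3.1-flash-lite-preview", 20_000, 2_000)
pro_cost = call_cost("gemini-3.1-pro-preview", 20_000, 2_000)
print(f"Flash-Lite: ${flash_cost:.3f}  Pro: ${pro_cost:.3f}  ratio: {pro_cost / flash_cost:.1f}x")
# prints: Flash-Lite: $0.008  Pro: $0.064  ratio: 8.0x
```

Because the listed Pro prices are 8x the Flash-Lite prices for both input and output, the same ratio holds for any token mix under these assumptions.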
Process
Phase 0: Gather Context
Before writing Round 1, check if the user provided enough context. A good R1 prompt needs:
- What exists (product, tech, assets)
- Hard constraints (time, team size, budget, regulatory)
- The goal (what decision needs to be made)
If any of these are missing, ask the user before starting. A brainstorm with vague input produces vague output.
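The Phase 0 checklist can be expressed as a tiny completeness check. This is only a sketch: the BrainstormContext class and its field names are hypothetical and exist purely to illustrate the three required inputs.

```python
from dataclasses import dataclass


@dataclass
class BrainstormContext:
    """The three inputs a good R1 prompt needs (field names are illustrative)."""
    what_exists: str = ""
    hard_constraints: str = ""
    goal: str = ""

    def missing(self):
        """Return the checklist items the user still has to supply."""
        labels = {
            "what_exists": "What exists (product, tech, assets)",
            "hard_constraints": "Hard constraints (time, team size, budget, regulatory)",
            "goal": "The goal (what decision needs to be made)",
        }
        return [label for field, label in labels.items() if not getattr(self, field).strip()]
```

If missing() returns anything, those items become the questions to ask the user before Round 1 starts.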
Phase 0.5: GROUNDING (dual research — Claude + Flash)
This phase is mandatory. Extract all technology names, frameworks, APIs, and services from the context.
Two parallel research tracks:
- Claude runs 5-7 WebSearch queries (versions, pricing, compatibility, licenses)
- Flash (grounded) gets a batch research prompt covering the same technologies
Run both in parallel. Merge results into a "Verified Context" block.
Flash research prompt (saved to /tmp/brainstorm-${AGENT}-ground.txt):
Research the following technologies and provide CURRENT (2026) facts for each.
For each one, include: current version, pricing/free tier, license, known issues.
Use Google Search to verify — do NOT rely on training data.
Technologies to research:
1. [Technology A]
2. [Technology B]
3. [Technology C]
...
Output format per technology:
- Name: current version, release date
- Pricing: free tier details, paid plans
- License: open-source? which license?
- Compatibility: works with [other stack items]?
- Red flags: any known issues, deprecations, EOL dates
Flash call:
python3 "$GEMINI" ask @/tmp/brainstorm-${AGENT}-ground.txt \
  -m gemini-3.1-flash-lite-preview --grounded --save /tmp/brainstorm-${AGENT}-ground-response.md
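The prompt file and the Flash command can also be assembled programmatically. The sketch below is an assumption about how that might look: build_research_prompt and build_flash_command are hypothetical helpers, the prompt text follows the template above, and the command only reuses the flags shown in the Flash call (ask, -m, --grounded, --save).

```python
PROMPT_HEADER = (
    "Research the following technologies and provide CURRENT (2026) facts for each.\n"
    "For each one, include: current version, pricing/free tier, license, known issues.\n"
    "Use Google Search to verify; do NOT rely on training data.\n"
    "Technologies to research:\n"
)

PROMPT_FOOTER = (
    "Output format per technology:\n"
    "- Name: current version, release date\n"
    "- Pricing: free tier details, paid plans\n"
    "- License: open-source? which license?\n"
    "- Compatibility: works with [other stack items]?\n"
    "- Red flags: any known issues, deprecations, EOL dates\n"
)


def build_research_prompt(technologies):
    """Fill the numbered technology list into the grounding prompt."""
    numbered = "".join(f"{i}. {name}\n" for i, name in enumerate(technologies, start=1))
    return PROMPT_HEADER + numbered + PROMPT_FOOTER


def build_flash_command(gemini_path, agent):
    """Return the argv list for the grounded Flash-Lite call (does not run it)."""
    prompt_file = f"/tmp/brainstorm-{agent}-ground.txt"
    response_file = f"/tmp/brainstorm-{agent}-ground-response.md"
    return [
        "python3", str(gemini_path), "ask", f"@{prompt_file}",
        "-m", "gemini-3.1-flash-lite-preview", "--grounded",
        "--save", response_file,
    ]
```

In use, the prompt string would be written to the prompt file first, and the argv list handed to subprocess.run(cmd, check=True). Passing a list rather than a shell string avoids quoting problems if the gemini.py path contains spaces.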
Merge: Claude reads Flash's response alongside its own WebSearch results, resolves any conflicts, and produces the final Verified Context block:
## Verified Context (dual-checked: Claude WebSearch + Flash Google Search)
- Next.js: v16.1.6 (Oct 2025 release, 16.2 canary). Source: nextjs.org, confirmed by Flash
- Clerk: Free 10K MAU. Pro $25/mo. @clerk/nextjs@6.36.7. Source: clerk.com
- Drizzle: v0.45.x, zero-codegen, Neon HTTP driver. Source: drizzle.team
- CONFLICT: [if Claude and Flash disagree, note both and flag for R1]
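The merge rule can be sketched as a comparison of two fact tables. This assumes each track's findings have already been reduced to one short, comparable fact string per technology, which is a simplification. The merge_findings helper and the Tool A/B/C values are hypothetical placeholders.

```python
def merge_findings(claude, flash):
    """Merge two {technology: fact} dicts into Verified Context lines, flagging disagreements."""
    lines = []
    for tech in sorted(set(claude) | set(flash)):
        ours, theirs = claude.get(tech), flash.get(tech)
        if ours and theirs and ours.strip().lower() != theirs.strip().lower():
            # Keep both claims so Round 1 can see the disagreement.
            lines.append(f"- CONFLICT: {tech}: Claude says {ours!r}, Flash says {theirs!r} (flag for R1)")
        elif ours and theirs:
            lines.append(f"- {tech}: {ours}. Confirmed by Flash")
        else:
            source = "Claude WebSearch only" if ours else "Flash only"
            lines.append(f"- {tech}: {ours or theirs}. Source: {source}")
    return lines


# Placeholder data, not real research results.
claude_facts = {"Tool A": "v1.0", "Tool B": "v2.0"}
flash_facts = {"Tool A": "v1.0", "Tool B": "v2.1", "Tool C": "v3.0"}
merged = merge_findings(claude_facts, flash_facts)
```

Facts found by only one track are kept but labeled with their single source, so the reader of the Verified Context block can tell dual-checked entries from single-checked ones.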
Phase 1: DIVERGE (Round 1)
Claude writes a prompt with:
- Verified Context from Phase 0.5 (at the top)
- Full context (what exists, constraints, goals)
- Initial honest assessment (strengths, weaknesses)
- Instructions for Gemini to challenge, not agree
Pro is called WITHOUT --grounded — it receives pre-verified facts and focuses entirely on critical thinking.
R1 prompt template:
You are acting as an adversarial brainstorming partner. I am the lead developer framing the problem.
This is Round 1 of a 3-round brainstorm.
Your job is NOT to be agreeable -- challenge, flip assumptions, add angles I haven't considered.
IMPORTANT: The "Verified Context" section below has been web-checked moments ago.
Treat these facts as ground truth. Focus your energy on STRATEGY and ARCHITECTURE,
not on verifying versions or prices — that's already done.
If you want to recommend a technology NOT in the verified list, flag it clearly
so we can verify it before Round 2.
## Verified Context (web-checked)
[Include Phase 0.5 merged findings]
## Context
[Describe the idea, what you have, constraints, goals]
## My Initial Assessment
[Your honest evaluation: strengths, weaknesses, open questions]
## What I want from you (Round 1)
1. Challenge my framing -- what am I NOT seeing?
2. Propose 3-5 alternative angles/applications
3. Kill at least 1 idea I proposed (with reasoning)
4. Add 1 wildcard idea I haven't considered
5. If you recommend a NEW technology not in Verified Context, flag it for verification
After R1: If Gemini recommended new technologies not in Verified Context, run a quick Flash grounded check on those before proceeding to R2. Tell the user what new angles Gemini generated.
Mid-round verification (only if R1 introduced new technologies):
python3 "$GEMINI" ask \
  "Verify these technologies: [new tech from R1]. Current version, pricing, license, compatibility with [our stack]." \
  -m gemini-3.1-flash-lite-preview --grounded --save /tmp/brainstorm-${AGENT}-r1-verify.md
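Deciding whether mid-round verification is needed comes down to a set difference between what R1 mentioned and what Verified Context already covers. The sketch below assumes Claude has already extracted technology names from the R1 response. Both helper names are hypothetical, and the prompt wording follows the command above.

```python
def unverified_technologies(mentioned, verified):
    """Return technologies named in R1 that are absent from Verified Context (case-insensitive)."""
    known = {name.strip().lower() for name in verified}
    seen = set()
    result = []
    for name in mentioned:
        key = name.strip().lower()
        if key not in known and key not in seen:
            seen.add(key)  # skip repeats of the same new technology
            result.append(name.strip())
    return result


def build_verify_prompt(new_tech, stack):
    """Fill the mid-round verification prompt with the new technologies and the current stack."""
    return (
        f"Verify these technologies: {', '.join(new_tech)}. "
        f"Current version, pricing, license, compatibility with {', '.join(stack)}."
    )
```

If unverified_technologies returns an empty list, the mid-round Flash call can be skipped and the process moves straight to Round 2.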
Phase 2: DEEPEN (Round 2)
Claude reads the R1 response, then writes an R2 prompt that:
- Summarizes Gemini's ideas (including any mid-round verification results)
- Kills ideas with concrete arguments (synthesize, do NOT copy-paste raw R1 output)
- Adds real-world constraints (time, solo dev, regulatory, etc.)
Pro is called WITHOUT --grounded.
R2 prompt template:
This is Round 2 of our brainstorm. Here's what you said in Round 1, followed by my critical evaluation.
Your job now: STRESS-TEST my pushback, pick the 2 strongest surviving threads, and kill everything else.
All technology claims have been web-verified. Focus on architecture and strategy.
## Your Round 1 Key Ideas (summary)
[Summarize Gemini's R1 ideas — synthesize, don't copy-paste]
## My Critical Evaluation
[For each idea: KILL or KEEP with concrete reasoning]
## Constraints you must respect
[Time, resources, regulatory, skill gaps]
## What I want from you (Round 2)
1. Where am I wrong in my kills? You get ONE sentence per killed idea to defend it.
2. Pick the 2 strongest surviving threads (from ideas I kept).
3. For each survivor: concrete 2-week plan.
4. Propose 1 new idea that combines surviving thread