Paper Illustration: Multi-Stage Claude-Supervised Figure Generation
Generate publication-quality illustrations using a multi-stage workflow with Claude as the STRICT supervisor/reviewer.
Core Design Philosophy
┌──────────────────────────────────────────────────────────────────────────┐
│ MULTI-STAGE ITERATIVE WORKFLOW │
├──────────────────────────────────────────────────────────────────────────┤
│ │
│ User Request │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Claude │ ◄─── Step 1: Parse request, create initial prompt │
│ │ (Planner) │ │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Gemini │ ◄─── Step 2: Optimize layout description │
│ │ (gemini-3-pro)│ - Refine component positioning │
│ │ Layout │ - Optimize spacing and grouping │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Gemini │ ◄─── Step 3: CVPR/NeurIPS style verification │
│ │ (gemini-3-pro)│ - Check color palette compliance │
│ │ Style │ - Verify arrow and font standards │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Paperbanana │ ◄─── Step 4: Render final image │
│ │ (gemini-3- │ - High-quality image generation │
│ │ pro-image) │ - Internal codename: Nano Banana Pro │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Claude │ ◄─── Step 5: STRICT visual review + SCORE (1-10) │
│ │ (Reviewer) │ - Verify EVERY arrow direction │
│ │ STRICT! │ - Verify EVERY block content │
│ └──────┬──────┘ - Verify aesthetics & visual appeal │
│ │ │
│ ▼ │
│ Score ≥ 9? ──YES──► Accept & Output │
│ │ │
│ NO │
│ │ │
│ ▼ │
│ Generate SPECIFIC improvement feedback ──► Loop back to Step 2 │
│ │
└──────────────────────────────────────────────────────────────────────────┘
Constants
- IMAGE_MODEL = gemini-3-pro-image-preview — Paperbanana (Nano Banana Pro) for image rendering
- REASONING_MODEL = gemini-3-pro-preview — Gemini for layout optimization and style checking
- MAX_ITERATIONS = 5 — Maximum refinement rounds
- TARGET_SCORE = 9 — Minimum acceptable score (1-10) — RAISED FOR QUALITY
- OUTPUT_DIR = figures/ai_generated/ — Output directory
- API_KEY_ENV = GEMINI_API_KEY — Environment variable
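The workflow and constants above can be read as a bounded refinement loop. The following is a minimal sketch under stated assumptions, not the skill's actual implementation: the five stage callables (plan, optimize_layout, check_style, render, review) are hypothetical stand-ins for the Claude and Gemini calls, and both the carry-forward of the styled description and the "return the best attempt when the budget runs out" behavior are assumptions the source does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

MAX_ITERATIONS = 5  # maximum refinement rounds (from the constants list)
TARGET_SCORE = 9    # minimum acceptable reviewer score (from the constants list)


@dataclass
class Attempt:
    iteration: int
    image_path: str
    score: int
    feedback: str


def refine_figure(
    request: str,
    plan: Callable[[str], str],                    # Step 1 (hypothetical stub)
    optimize_layout: Callable[[str, str], str],    # Step 2 (hypothetical stub)
    check_style: Callable[[str], str],             # Step 3 (hypothetical stub)
    render: Callable[[str, int], str],             # Step 4 (hypothetical stub)
    review: Callable[[str, str], Tuple[int, str]], # Step 5 (hypothetical stub)
) -> Tuple[Optional[Attempt], bool]:
    """Run Steps 1-5 and return (attempt, accepted)."""
    prompt = plan(request)
    feedback = ""  # no reviewer feedback exists before the first render
    best: Optional[Attempt] = None
    for i in range(1, MAX_ITERATIONS + 1):
        layout = optimize_layout(prompt, feedback)
        styled = check_style(layout)
        image_path = render(styled, i)
        score, feedback = review(image_path, request)
        attempt = Attempt(i, image_path, score, feedback)
        if best is None or score > best.score:
            best = attempt
        if score >= TARGET_SCORE:
            return attempt, True
        # Assumption: the next round refines the latest styled description.
        prompt = styled
    # Assumption: when the budget is exhausted, surface the best attempt.
    return best, False
```

Passing the stages in as callables keeps the loop testable without network access; a real run would bind them to the planner, the two Gemini passes, the image model, and the reviewer.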
Optional: Style reference (— style-ref: <source>, opt-in)
Lets the user steer structural figure conventions (caption length, panel-count distribution, figure-to-table ratio in the parent paper) toward a reference paper. Default OFF — when the user does not pass — style-ref, do nothing differently from before.
Only when — style-ref: <source> appears in $ARGUMENTS, run the helper FIRST, before generating prompts:
# Resolve $STYLE_HELPER via the canonical strict-safe chain (see
# shared-references/integration-contract.md §2). Policy A — gate:
# unresolved helper means --style-ref cannot be satisfied, so abort.
cd "$(git rev-parse --show-toplevel 2>/dev/null || pwd)" || exit 1
if [ -z "${ARIS_REPO:-}" ] && [ -f .aris/installed-skills.txt ]; then
    ARIS_REPO=$(awk -F'\t' '$1=="repo_root"{print $2; exit}' .aris/installed-skills.txt 2>/dev/null) || true
fi
STYLE_HELPER=".aris/tools/extract_paper_style.py"
[ -f "$STYLE_HELPER" ] || STYLE_HELPER="tools/extract_paper_style.py"
[ -f "$STYLE_HELPER" ] || { [ -n "${ARIS_REPO:-}" ] && STYLE_HELPER="$ARIS_REPO/tools/extract_paper_style.py"; }
[ -f "$STYLE_HELPER" ] || {
    echo "ERROR: extract_paper_style.py not resolved at .aris/tools/, tools/, or \$ARIS_REPO/tools/." >&2
    echo "       Fix: rerun bash tools/install_aris.sh, export ARIS_REPO, or copy the helper to tools/." >&2
    echo "       --style-ref cannot be satisfied; aborting." >&2
    exit 1
}
STYLE_STATUS=0
CACHE=$(python3 "$STYLE_HELPER" --source "<source>") || STYLE_STATUS=$?
case "$STYLE_STATUS" in
    0) ;;  # use $CACHE/style_profile.md as structural guidance
    2) echo "warning: style-ref skipped (missing optional dep)" >&2 ;;
    3) echo "error: --style-ref source failed; aborting illustration" >&2 ; exit 1 ;;
    *) echo "error: helper failed unexpectedly; aborting illustration" >&2 ; exit 1 ;;
esac
Sources accepted: local TeX dir / file, local PDF, arXiv id, http(s) URL. Overleaf URLs/IDs are rejected — clone via /overleaf-sync setup <id> first and pass the local clone path.
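To make the accepted and rejected source kinds concrete, here is an illustrative pre-check. It is a sketch only: classify_source, its return labels, and both regular expressions are hypothetical, and the real rules live in tools/extract_paper_style.py. In particular, treating a bare 24-character hex string as an Overleaf project ID, and treating any existing non-PDF file as a TeX file, are assumptions.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# Assumed patterns; the real helper may accept or reject differently.
ARXIV_ID = re.compile(r"(?:\d{4}\.\d{4,5}|[a-z\-]+(?:\.[A-Z]{2})?/\d{7})(?:v\d+)?")
OVERLEAF_ID = re.compile(r"[0-9a-f]{24}")
OVERLEAF_HINT = (
    "Overleaf sources are rejected; clone the project first "
    "and pass the local clone path"
)


def classify_source(source: str) -> str:
    """Return 'url', 'tex_dir', 'pdf', 'tex_file', or 'arxiv'; raise ValueError otherwise."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host == "overleaf.com" or host.endswith(".overleaf.com"):
            raise ValueError(OVERLEAF_HINT)
        return "url"
    path = Path(source)
    if path.is_dir():
        return "tex_dir"
    if path.is_file():
        # Assumption: any existing non-PDF file is treated as TeX.
        return "pdf" if path.suffix.lower() == ".pdf" else "tex_file"
    if OVERLEAF_ID.fullmatch(source):
        raise ValueError(OVERLEAF_HINT)
    if ARXIV_ID.fullmatch(source):
        return "arxiv"
    raise ValueError(f"unrecognized style-ref source: {source!r}")
```

Local paths are checked before the ID patterns so that an existing file or directory is never mistaken for an arXiv or Overleaf identifier.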
Strict rules (full contract in tools/extract_paper_style.py docstring):
- Use style_profile.md to align caption length and figure density with the reference paper. The CVPR/ICLR/NeurIPS visual standards in this document still take precedence: --style-ref only refines length-and-density tendencies, never image content.
- Never copy figure content, color palettes, or specific design elements from anything reachable through the cache. The visual design comes from the user's prompt, not the reference.
- Never pass — style-ref (or the cache contents) to the Claude vision-checker / Gemini reasoning-checker sub-agents when they score the generated image; the image must be judged on its own merits.
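One way to honor the last rule is to split the style reference out of the arguments once, and build the reviewer input only from what remains. The sketch below assumes the option is written as "— style-ref: <source>" or "--style-ref: <source>" with a source that contains no spaces. ParsedArgs, parse_arguments, and reviewer_payload are hypothetical names, not part of the skill.

```python
import re
from typing import Dict, NamedTuple, Optional

# Assumed option syntax: an em dash or a double hyphen, then "style-ref: <source>".
STYLE_REF = re.compile(r"(?:—|--)\s*style-ref:\s*(\S+)")


class ParsedArgs(NamedTuple):
    prompt: str
    style_ref: Optional[str]


def parse_arguments(arguments: str) -> ParsedArgs:
    """Separate the user's figure request from an optional style reference."""
    match = STYLE_REF.search(arguments)
    if match is None:
        return ParsedArgs(" ".join(arguments.split()), None)
    remainder = arguments[: match.start()] + arguments[match.end():]
    return ParsedArgs(" ".join(remainder.split()), match.group(1))


def reviewer_payload(parsed: ParsedArgs, image_path: str) -> Dict[str, str]:
    """Build reviewer input; the style reference is deliberately left out."""
    return {"image_path": image_path, "request": parsed.prompt}
```

Because reviewer_payload never reads parsed.style_ref, the scoring sub-agents see only the image and the user's own request.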
CVPR/ICLR/NeurIPS Top-Tier Conference Style Guide
What "CVPR Style" Actually Means:
Visual Standards
- Clean white background — No decorative patterns or gradients (unless subtle)
- Sans-serif fonts — Arial, Helvetica, or Computer Modern Sans; minimum 14pt
- Subtle color palette — Not rainbow colors; use 3-5 coordinated colors
- Print-friendly — Must be readable in grayscale (many reviewers print papers)
- Professional borders — Thin (2-3px), solid colors, not flashy
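The print-friendly rule can be spot-checked before rendering by converting each palette color to an approximate gray level and flagging pairs that would look alike on a grayscale printout. This is a rough sketch, not part of the skill: it uses the Rec. 601 luma weights, and the min_gap threshold of 25 gray levels is an arbitrary assumption to tune by eye.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def gray_level(hex_color: str) -> float:
    """Approximate gray level (0-255) of '#RRGGBB' using Rec. 601 luma weights."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.299 * r + 0.587 * g + 0.114 * b


def grayscale_clashes(
    palette: Dict[str, str], min_gap: float = 25.0
) -> List[Tuple[str, str]]:
    """Return pairs of palette names whose gray levels differ by less than min_gap."""
    levels = {name: gray_level(color) for name, color in palette.items()}
    return [
        (a, b)
        for a, b in combinations(levels, 2)
        if abs(levels[a] - levels[b]) < min_gap
    ]
```

For example, pure red (#FF0000) and a dark gray (#4C4C4C) are easy to tell apart in color but land on nearly the same gray level, so the pair would be flagged.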
Layout Standards
- Horizontal flow — Left-to-right is the standard for pipelines
- Clear grouping — Use subtle background boxes to group related modules
- Consistent sizing — Similar components should have similar