Orchestration Log: When this skill is activated, append a log entry to outputs/orchestration_log.md:

### Skill Activation: Figure Engine
**Timestamp:** [current date/time]
**Actor:** AI Agent (figure-engine)
**Input:** [brief description of the figure request]
**Output:** [brief description of what was produced — e.g., "Generated methodology diagram (fig03_methodology.png), 3 iterations"]

Figure Engine

Core Principle

Academic papers need professional figures. This skill eliminates manual design work by using PaperBanana to generate publication-quality diagrams and plots from text descriptions or data files. Claude should produce ACTUAL FIGURES, not describe what to draw.

Based on: Zhu, D., Meng, R., Song, Y., Wei, X., Li, S., Pfister, T., & Yoon, J. (2026). PaperBanana: Automating Academic Illustration for AI Scientists. arXiv:2601.23265. The pipeline uses a 5-agent, 2-phase architecture: Retriever → Planner → Stylist (Phase 1: planning), then Visualizer ↔ Critic iterative refinement (Phase 2: generation) with VLM-as-Judge evaluation. Official research repo: dwzhu-pku/PaperBanana.

Prerequisites

PaperBanana must be installed: pip install paperbanana[mcp,google]

A GOOGLE_API_KEY must be available via one of:

Environment variable GOOGLE_API_KEY
.env file in project root
~/.paperbanana.env

Get a free key at https://aistudio.google.com/apikey

Method Priority

Priority Order:

PRIMARY — Direct Python API (via Bash → python3) — ALWAYS use this
FALLBACK — matplotlib/seaborn — If PaperBanana is not installed at all

Note: The PaperBanana MCP server is NOT used. The MCP stdio transport is unreliable (timeouts, hangs, silent failures). Always use the direct Python API.

Method 1: Direct Python API (PRIMARY — Always Use This)

The plugin ships a helper script scripts/paperbanana_direct.py that calls the PaperBanana Python API directly via asyncio.run(), completely bypassing the MCP stdio transport. It outputs JSON to stdout.

Locating the Script

The script is at scripts/paperbanana_direct.py inside the plugin directory. To find it reliably across any installation:

PB_SCRIPT="$(find ~/.claude/plugins -name paperbanana_direct.py -path '*/open-academic-paper-machine/*' 2>/dev/null | head -1)"

Generate Diagram

For short source contexts (< 1000 chars), pass inline:

PB_SCRIPT="$(find ~/.claude/plugins -name paperbanana_direct.py -path '*/open-academic-paper-machine/*' 2>/dev/null | head -1)" && \
python3 "$PB_SCRIPT" diagram \
  --source-context "The research follows a three-stage SLR methodology..." \
  --caption "Figure 1: Systematic Literature Review Process" \
  --output-dir figures/ \
  --filename "fig_method_slr_process.png" \
  --iterations 3

For long source contexts, write to a temp file first to avoid shell escaping issues:

# Step 1: Write source context to temp file
cat > /tmp/pb_source_context.txt <<'CTXEOF'
[FULL METHODOLOGY TEXT / FRAMEWORK DESCRIPTION HERE — can be multiple paragraphs,
include all relevant details about components, relationships, and visual structure]
CTXEOF

# Step 2: Generate the figure
PB_SCRIPT="$(find ~/.claude/plugins -name paperbanana_direct.py -path '*/open-academic-paper-machine/*' 2>/dev/null | head -1)" && \
python3 "$PB_SCRIPT" diagram \
  --source-context "$(cat /tmp/pb_source_context.txt)" \
  --caption "Figure N: Descriptive Caption" \
  --output-dir figures/ \
  --filename "fig_section_description.png" \
  --iterations 3

Generate Plot

PB_SCRIPT="$(find ~/.claude/plugins -name paperbanana_direct.py -path '*/open-academic-paper-machine/*' 2>/dev/null | head -1)" && \
python3 "$PB_SCRIPT" plot \
  --data '{"categories": ["2020","2021","2022","2023","2024"], "values": [12,25,48,89,156]}' \
  --caption "Bar chart showing exponential growth in AI adoption across financial services" \
  --output-dir figures/ \
  --filename "fig_results_adoption_growth.png" \
  --iterations 3

Evaluate Diagram

PB_SCRIPT="$(find ~/.claude/plugins -name paperbanana_direct.py -path '*/open-academic-paper-machine/*' 2>/dev/null | head -1)" && \
python3 "$PB_SCRIPT" evaluate \
  --generated figures/fig_generated.png \
  --reference figures/fig_reference.png \
  --context "Original methodology text" \
  --caption "Figure caption"

Reading the Output

The script prints JSON to stdout:

Success: {"status":"ok","image_path":"figures/fig_name.png","iterations":3,"metadata":{...}}
Error: {"status":"error","message":"..."}

On success, show the figure to the user using the Read tool on the PNG path.

Timeout

PaperBanana generation takes 30-180 seconds (3 refinement iterations). Set a generous Bash timeout of 300 seconds (5 minutes) when calling the script.

Method 2: Python matplotlib/seaborn (FALLBACK)

If PaperBanana is not installed at all, generate figures with Python directly:

import matplotlib.pyplot as plt
import matplotlib
matplotlib.use('Agg')
import seaborn as sns

plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams.update({
    'font.family': 'serif',
    'font.size': 11,
    'axes.titlesize': 13,
    'axes.labelsize': 12,
    'figure.figsize': (10, 6),
    'figure.dpi': 300,
    'savefig.dpi': 300,
    'savefig.bbox_inches': 'tight',
})

For methodology diagrams without PaperBanana, use networkx or graphviz:

import networkx as nx
# Build a directed graph and render with matplotlib

When the User Says "Make Me a Figure" — What to Do

Determine figure type: diagram (methodology, framework, process) or plot (bar, line, scatter, etc.)
Gather input: text description for diagrams, data file for plots
Generate using Direct Python API (Method 1) — ALWAYS use this
If Method 1 fails: fall back to matplotlib/seaborn (Method 2)
Save to figures/ directory in the working folder
Provide LaTeX include snippet ready for copy-paste
Show the generated figure to the user using Read tool on the PNG

Diagram Prompting Guide

Good source context for PaperBanana includes:

What the diagram shows: "This figure illustrates the three-phase research design"
Key components: "Phase 1: Literature Search, Phase 2: Screening, Phase 3: Analysis"
Relationships: "Phase 1 feeds into Phase 2, which filters down to Phase 3"
Style hints: "PRISMA-style flow diagram" or "layered architecture diagram"
Layout direction: "left-to-right flow" or "top-to-bottom hierarchy"

The communicative_intent (caption) should describe WHAT the figure communicates, not just what it contains. Good: "Three-pillar model showing how education, process redesign, and encoded judgment sequentially build sovereign AI mastery." Bad: "Figure 1."

Academic Figure Standards

General Rules

Resolution: Minimum 300 DPI for print
Width: Match journal column width (single column ~8.5cm, double column ~17.5cm)
Font: Serif fonts (Times, Computer Modern) matching paper body
Colors: Use colorblind-friendly palettes (e.g., Okabe-Ito, viridis)
Labels: All axes labeled with units, legend if multiple series
Caption: Descriptive, can stand alone without reading the text

LaTeX Integration

Always provide the complete figure environment:

\begin{figure}[htbp]
\centering
\includegraphics[width=\linewidth]{figures/FILENAME.png}
\caption{DESCRIPTIVE CAPTION}
\label{fig:SHORT_LABEL}
\end{figure}

For side-by-side figures:

\begin{figure}[htbp]
\centering
\begin{minipage}{0.48\textwidth}
  \centering
  \includegraphics[width=\linewidth]{figures/LEFT.png}
  \caption{Left caption}
  \label{fig:left}
\end{minipage}
\hfill
\begin{minipage}{0.48\textwidth}
  \centering
  \includegraphics[width=\linewidth]{figures/RIGHT.pn

figure-engine

How to add

Drop this on your repo README

Related skills

dev-browser

agent-browser

understand-chat

understand-dashboard

Get new Pesquisa e Web skills every Monday