Academic Plotting for ML Papers
Generate publication-quality figures for ML/AI conference papers. Two distinct workflows:
- Diagram figures (architecture, system design, workflows, pipelines) — AI image generation via Gemini
- Data figures (line charts, bar charts, scatter plots, heatmaps, ablations) — matplotlib/seaborn
When to Use Which Workflow
| Figure Type | Tool | Why |
|---|---|---|
| Architecture / system diagram | Gemini (Workflow 1) | Complex spatial layouts with boxes, arrows, labels |
| Workflow / pipeline / lifecycle | Gemini (Workflow 1) | Multi-step processes with connections |
| Bar chart, line plot, scatter | matplotlib (Workflow 2) | Precise numerical data, reproducible |
| Heatmap, confusion matrix | matplotlib/seaborn (Workflow 2) | Structured grid data |
| Ablation table as chart | matplotlib (Workflow 2) | Grouped bars or line comparisons |
| Pie / donut chart | matplotlib (Workflow 2) | Proportional data (use sparingly in ML papers) |
| Training curves | matplotlib (Workflow 2) | Loss/accuracy over steps/epochs |
Rule of thumb: If the figure has numerical axes, use matplotlib. If the figure has boxes and arrows, use Gemini.
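The rule of thumb can be sketched as a small keyword lookup. The keyword tuples and the function name `choose_workflow` are hypothetical, distilled from the table above rather than taken from an existing tool.

```python
# Hypothetical keyword lists distilled from the table above; extend as needed.
CHART_KEYWORDS = ("bar chart", "line plot", "scatter", "heatmap",
                  "confusion matrix", "ablation", "pie", "donut",
                  "training curve")
DIAGRAM_KEYWORDS = ("architecture", "system diagram", "workflow",
                    "pipeline", "lifecycle")


def choose_workflow(figure_type: str) -> str:
    """Return "matplotlib" for data figures and "gemini" for diagrams."""
    kind = figure_type.lower()
    # Data figures are checked first: numerical axes always go to matplotlib.
    if any(keyword in kind for keyword in CHART_KEYWORDS):
        return "matplotlib"
    if any(keyword in kind for keyword in DIAGRAM_KEYWORDS):
        return "gemini"
    raise ValueError(f"Unrecognized figure type: {figure_type!r}")


print(choose_workflow("Training curves"))                # matplotlib
print(choose_workflow("Architecture / system diagram"))  # gemini
```

Substring matching is deliberately crude here. An unrecognized figure type raises an error instead of silently picking a workflow.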
Step 0: Context Analysis & Extraction
The user will typically provide one of these inputs — not a ready-made specification:
| Input Type | Example | What to Extract |
|---|---|---|
| Full paper / section draft | "Here's our method section..." | System components, their relationships, data flow |
| Description paragraph | "Our system has three layers that..." | Key entities, hierarchy, connections |
| Raw results / data table | "MMLU: 85.2, HumanEval: 72.1..." | Metrics, methods, comparison structure |
| CSV / JSON data | Experiment log files | Variables, trends, grouping dimensions |
| Vague request | "Make a figure for the overview" | Read surrounding paper context to infer content |
Extraction Workflow
For diagrams (research context → architecture figure):
- Read the provided context: paper section, abstract, or description paragraph
- Identify visual entities: what are the main components, modules, or stages?
  - Look for: nouns that represent system parts, named modules, layers, stages
  - Count them: if >8 top-level entities, consider grouping into sections
- Identify relationships: how do components connect?
  - Look for: verbs describing data flow ("sends to", "queries", "feeds into")
  - Classify: data flow (solid arrow), control flow (gray arrow), error path (dashed red arrow)
- Determine layout pattern:
  - Sequential pipeline → left-to-right flow
  - Layered architecture → horizontal bands stacked vertically
  - Hub-and-spoke → central node with radiating connections
  - Hierarchical → top-down tree
- Assign colors: one accent color per logical group/layer
- Write every label exactly: extract exact terminology from the paper text
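The output of these extraction steps can be captured in a small structure before any prompt is written. The sketch below is one possible shape, not a prescribed format: `DiagramSpec`, `problems`, and `describe` are hypothetical names, and the three-component example with a `sequential` layout is an illustrative assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Arrow conventions and layout patterns copied from the checklist above.
ARROW_STYLES = {
    "data": "solid arrow",
    "control": "gray arrow",
    "error": "dashed red arrow",
}
LAYOUTS = {
    "sequential": "left-to-right flow",
    "layered": "horizontal bands stacked vertically",
    "hub-and-spoke": "central node with radiating connections",
    "hierarchical": "top-down tree",
}
MAX_TOP_LEVEL_ENTITIES = 8


@dataclass
class DiagramSpec:
    """Hypothetical container for what the extraction steps produce."""
    entities: List[str]
    edges: List[Tuple[str, str, str]]  # (source, target, kind)
    layout: str

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the spec is usable."""
        issues = []
        if len(self.entities) > MAX_TOP_LEVEL_ENTITIES:
            issues.append("more than 8 top-level entities: consider grouping")
        if self.layout not in LAYOUTS:
            issues.append(f"unknown layout: {self.layout}")
        for source, target, kind in self.edges:
            for name in (source, target):
                if name not in self.entities:
                    issues.append(f"edge references unknown entity: {name}")
            if kind not in ARROW_STYLES:
                issues.append(f"unknown edge kind: {kind}")
        return issues

    def describe(self) -> str:
        """Render the spec as plain text that can be pasted into a prompt."""
        lines = [f"Layout: {LAYOUTS[self.layout]}",
                 "Components: " + ", ".join(self.entities)]
        for source, target, kind in self.edges:
            lines.append(f"{source} -> {target} ({ARROW_STYLES[kind]})")
        return "\n".join(lines)


spec = DiagramSpec(
    entities=["Planner", "Executor", "Verifier"],
    edges=[("Planner", "Executor", "data"),
           ("Executor", "Verifier", "data"),
           ("Verifier", "Planner", "error")],
    layout="sequential",
)
print(spec.problems())  # []
print(spec.describe())
```

Validating the spec before generation catches misspelled component names early, which supports the "write every label exactly" step.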
For data charts (results → figure):
- Read the provided data: table, paragraph with numbers, CSV, or JSON
- Identify dimensions:
  - What is being compared? (methods, models, configurations) → categorical axis
  - What is the metric? (accuracy, loss, latency, F1) → value axis
  - Is there a time/step dimension? → line plot
  - Are there multiple metrics? → multi-panel or grouped bars
- Choose the chart type automatically, taking the first rule that matches:
  - Has a step/time axis → line plot
  - Comparing N methods on M benchmarks → grouped bar chart
  - Single ranking → horizontal bar (leaderboard)
  - Correlation between two continuous variables → scatter plot
  - Square matrix of values → heatmap
  - Proportional breakdown → stacked bar (avoid pie charts)
- Determine figure sizing: single column vs. full width, based on data density
- Highlight "our method": identify which entry is the paper's contribution and give it a distinct color
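The chart-type priority can be written as a chain of checks evaluated in order. This is a minimal sketch: the function name `choose_chart_type` and its keyword arguments are hypothetical, and it assumes that a "single ranking" means several methods compared on one metric.

```python
def choose_chart_type(*, has_step_axis: bool = False, n_methods: int = 1,
                      n_benchmarks: int = 1, two_continuous: bool = False,
                      square_matrix: bool = False,
                      proportional: bool = False) -> str:
    """Apply the chart-type priority list; the first matching rule wins."""
    if has_step_axis:
        return "line plot"
    if n_methods > 1 and n_benchmarks > 1:
        return "grouped bar chart"
    if n_methods > 1:
        # Assumption: several methods on a single metric form a ranking.
        return "horizontal bar"
    if two_continuous:
        return "scatter plot"
    if square_matrix:
        return "heatmap"
    if proportional:
        return "stacked bar"
    raise ValueError("No rule matched; inspect the data manually.")


print(choose_chart_type(n_methods=3, n_benchmarks=2))  # grouped bar chart
print(choose_chart_type(has_step_axis=True, n_methods=3))  # line plot
```

Because the checks run in order, a dataset with a step axis becomes a line plot even when it also compares several methods.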
Auto-Detection Examples
Context → Diagram: "Our system has a Planner, Executor, and Verifier. Planner sends plans to Executor, Executor returns results to Verifier, Verifier feeds back to Planner on failure." → 3 entities, cycle layout, dashed feedback arrow → Workflow 1 (Gemini)
Data → Chart: "GPT-4: MMLU 86.4, HumanEval 67.0. Ours: 88.1, 71.2. Llama-3: 79.3, 62.1." → 3 methods × 2 benchmarks → Workflow 2 (grouped bar), highlight "Ours" in coral
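For the data example above, a grouped bar chart might look like the following sketch. The numbers and the coral highlight come from the example; the figure size, the gray shades for baselines, the "Score" axis label, and the function name `plot_grouped_bars` are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

benchmarks = ["MMLU", "HumanEval"]
results = {
    "GPT-4": [86.4, 67.0],
    "Llama-3": [79.3, 62.1],
    "Ours": [88.1, 71.2],
}


def plot_grouped_bars(results, benchmarks, highlight="Ours"):
    """Draw one bar group per benchmark and one bar per method."""
    x = np.arange(len(benchmarks))
    width = 0.8 / len(results)              # groups fill 80% of each slot
    baseline_grays = ["#9CA3AF", "#D1D5DB"]  # assumed neutral baseline colors
    fig, ax = plt.subplots(figsize=(3.4, 2.4))  # assumed single-column size
    gray_index = 0
    for i, (method, scores) in enumerate(results.items()):
        if method == highlight:
            color = "coral"
        else:
            color = baseline_grays[gray_index % len(baseline_grays)]
            gray_index += 1
        # Center the group of bars on each benchmark's tick position.
        offset = (i - (len(results) - 1) / 2) * width
        ax.bar(x + offset, scores, width, label=method, color=color)
    ax.set_xticks(x)
    ax.set_xticklabels(benchmarks)
    ax.set_ylabel("Score")
    ax.legend(frameon=False)
    fig.tight_layout()
    return fig, ax


fig, ax = plot_grouped_bars(results, benchmarks)
```

Calling `fig.savefig("results.pdf", bbox_inches="tight")` afterwards writes a vector PDF. The file name is a placeholder.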
Workflow 1: Architecture & System Diagrams (AI Image Generation)
Use Gemini 3 Pro Image Preview to generate diagrams. Choose a visual style first — this is the single biggest factor in whether the figure looks professional or generic.
Visual Styles
Pick one style per paper (all figures should be consistent):
Style A: "Sketch / Simple Line Drawing" (Hand-Drawn)
Warm, approachable, memorable. Ideal for overview figures and system introductions. Looks like a whiteboard sketch refined by a designer.
VISUAL STYLE — HAND-DRAWN SKETCH:
- Slightly irregular, hand-drawn line quality — lines wobble gently, not perfectly straight
- Rounded, soft shapes with visible pen strokes (like drawn with a thick felt-tip marker)
- Warm off-white background (#FAFAF7), NOT pure white
- Fill colors are soft watercolor-like washes: muted blue (#D6E4F0), soft peach (#F5DEB3),
light sage (#D4E6D4), pale lavender (#E6DFF0)
- Borders are dark charcoal (#2C2C2C) with 2-3px line weight, slightly uneven
- Arrows are hand-drawn with slight curves, ending in simple open arrowheads (not filled triangles)
- Text uses a rounded sans-serif font (similar in feel to Comic Neue or Architects Daughter)
- Small doodle-style icons inside boxes: a tiny gear ⚙ for processing, a lightbulb 💡 for ideas,
a magnifying glass 🔍 for search — rendered as simple line drawings, NOT emoji
- Overall feel: a carefully drawn whiteboard diagram, clean but with personality
- NO clip art, NO stock icons, NO photorealistic elements
Style B: "Modern Minimal" (Clean & Bold)
Confident, authoritative. Best for method figures where precision matters.
VISUAL STYLE — MODERN MINIMAL:
- Ultra-clean geometric shapes with crisp edges
- Bold color blocks as backgrounds for sections — NOT just accent bars, but full section fills
using desaturated tones: slate blue (#E8EDF2), warm sand (#F5F0E8), cool mint (#E8F2EE)
- Component boxes have ROUNDED CORNERS (12px radius), NO visible border — they float on
the section background using subtle shadow (1px, 4px blur, rgba(0,0,0,0.06))
- ONE accent color per section used sparingly on key elements: Deep blue (#2563EB),
Emerald (#059669), Amber (#D97706), Rose (#E11D48)
- Arrows are thin (1.5px), dark gray (#6B7280), with small filled circle at source
and clean arrowhead at target — NOT thick colored arrows
- Typography: Inter or system sans-serif, title 600 weight, body 400 weight
- Labels INSIDE boxes, not beside them
- Generous whitespace — at least 24px between elements
- NO decorative elements, NO icons — let the structure speak
Style C: "Illustrated Technical" (Icon-Rich)
Engaging, explanatory. Good for tutorial-style papers and figures that need to be self-explanatory.
VISUAL STYLE — ILLUSTRATED TECHNICAL:
- Each major component has a small MEANINGFUL ICON drawn in a consistent line-art style
(single color, 2px stroke, ~24x24px): brain icon for reasoning, database cylinder for storage,
arrow-loop for iteration, network nodes for communication
- Components sit inside soft rounded rectangles with a LEFT COLOR STRIP (4px wide)
- Background is pure white, but each logical group has a very faint colored region behind it
(#F8FAFC for blue group, #FFF8F0 for orange group)
- Connections use CURVED bezier paths (not straight lines), colored by SOURCE component
- Key data flows are THICKER (3px) than secondary flows (1px, dashed)
- Small annotation badges on arrows: "×N" for repeated operations, "optional" in italics
- Title labels are ABOVE each section in small caps, letter-spaced
- Overall: l