PaperBanana - Academic Illustration Generator

Multi-agent pipeline (Retriever → Planner → Stylist → Visualizer → Critic) for publication-quality academic diagrams, statistical plots, and presentation slides.

API key: Set provider keys in the PaperBanana project's .env file. Timeout: 300000 ms (5 min) for all generation commands.

Commands

All commands run from project root: cd <paperbanana_dir> && python -m paperbanana.cli <cmd>
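The invocation pattern above (project root as working directory, 5-minute timeout) can be wrapped in a small helper. This is a minimal sketch, not part of PaperBanana: the helper names `build_command` and `run_paperbanana` are hypothetical, and it assumes `python` on the PATH is the interpreter that has paperbanana installed.

```python
import subprocess
from pathlib import Path

# 300000 ms from the timeout note above, expressed in seconds for subprocess.
TIMEOUT_SECONDS = 300


def build_command(subcommand, *args):
    """Return the argv list for one CLI call (hypothetical helper)."""
    return ["python", "-m", "paperbanana.cli", subcommand, *args]


def run_paperbanana(project_dir, subcommand, *args):
    """Run the CLI with the project root as cwd, mirroring `cd <paperbanana_dir> && ...`."""
    return subprocess.run(
        build_command(subcommand, *args),
        cwd=Path(project_dir),
        timeout=TIMEOUT_SECONDS,  # raises subprocess.TimeoutExpired after 5 min
        check=True,               # raises CalledProcessError on a non-zero exit
    )
```

Passing the arguments as a list avoids shell quoting problems with captions that contain spaces or quotes.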

Command Selection Decision Tree

Route user requests to the right subcommand before looking up parameters:

User intent	Signal words	Subcommand
Methodology, architecture, or flow diagram from text or PDF	"method figure", "架构图" (architecture diagram), "流程图" (flowchart), "methodology", "pipeline diagram", "论文配图" (paper figure)	`generate`
Statistical plot from data file	"plot", "curve", "bar chart", "scatter", "heatmap", has CSV/JSON	`plot`
Single presentation slide	"slide", "一张幻灯片" (one slide), "封面图" (cover image), single prompt file	`slide`
Batch slide generation	"all slides", "批量生成" (batch generation), "N 张幻灯片" (N slides), `prompts/` directory	`slide-batch`
Compare generated vs human reference	"evaluate", "对比" (compare), "与参考图对比" (compare with the reference figure)	`evaluate`
Manage reference dataset	"download dataset", "清缓存" (clear cache)	`data`
First-time provider config	"setup", "配置 API key" (configure API key)	`setup`

Ambiguous input: If the user provides only a description with no subcommand signal, default to `generate` (see Argument Parsing table for details).
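The routing table and the ambiguous-input rule can be read as a first-match keyword lookup. The sketch below is illustrative only: the function name `route`, the match order, and plain substring matching are assumptions, not how PaperBanana itself routes requests. The signal words are taken from the table.

```python
import re

# Signal words from the routing table, checked in order; first match wins.
# slide-batch is listed before slide because "all slides" contains "slide".
ROUTES = [
    ("slide-batch", ["all slides", "批量生成", "prompts/"]),
    ("slide", ["slide", "一张幻灯片", "封面图"]),
    ("evaluate", ["evaluate", "对比"]),
    ("data", ["download dataset", "清缓存"]),
    ("setup", ["setup", "配置 api key"]),
    ("plot", ["plot", "curve", "bar chart", "scatter", "heatmap", ".csv", ".json"]),
]

# The "N 张幻灯片" signal, where N is a digit count.
BATCH_COUNT = re.compile(r"\d+\s*张幻灯片")


def route(request: str) -> str:
    """Pick a subcommand for a free-text request (hypothetical helper)."""
    text = request.lower()
    if BATCH_COUNT.search(text):
        return "slide-batch"
    for subcommand, signals in ROUTES:
        if any(signal in text for signal in signals):
            return subcommand
    # No signal word at all: fall back to generate, per the ambiguous-input rule.
    return "generate"
```

Substring matching is deliberately crude. A request such as "figure of our experimental setup" would hit the `setup` signal, so a real router would likely need context beyond keywords.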

Out-of-scope: Pure code generation (e.g., a matplotlib/seaborn script) is not PaperBanana's job; route those requests to the matplotlib / scientific-visualization skills. PaperBanana is for AI-driven image generation with critique loops.

Note (upstream sync pending): The upstream paperbanana CLI also adds subcommands (plot-batch #123, sweep #118) that are not yet reflected in this table. See the llmsresearch/paperbanana CHANGELOG for the authoritative CLI surface.

`generate` — Methodology Diagrams

python -m paperbanana.cli generate --input '<file>' --caption '<caption>' --optimize --verbose

When the user provides inline text instead of a file, write the text to a temp file and pass that path as --input.
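The inline-text rule can be sketched as follows. The helper name `generate_args` and the `.txt` suffix are assumptions for illustration; the flags are the ones shown in the command above.

```python
import tempfile


def generate_args(source, caption, is_inline_text=False):
    """Build the argument list for `generate` (hypothetical helper).

    If `source` is inline text rather than a path, it is first written to a
    temp file, and that file's path is used as --input.
    """
    if is_inline_text:
        # delete=False keeps the file on disk so the CLI can read it later.
        with tempfile.NamedTemporaryFile(
            "w", suffix=".txt", delete=False, encoding="utf-8"
        ) as handle:
            handle.write(source)
            source = handle.name
    return ["generate", "--input", source, "--caption", caption,
            "--optimize", "--verbose"]
```

The temp file is not cleaned up here; the caller can remove it once the run finishes.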

Parameter	Default	Description
`--input` / `-i`	—	Path to methodology text file or PDF (`.pdf` requires `pip install 'paperbanana'`)
`--caption` / `-c`	—	Figure caption / communicative intent
`--output` / `-o`	auto	Output image path
`--vlm-provider`	`gemini`	VLM provider: `gemini`, `anthropic`, `openai`, `bedrock`, `openrouter`, `ollama`, `claude_code`, `litellm`
`--vlm-model`	auto	VLM model name
`--image-provider`	auto	Image gen provider: `google_imagen`, `openai`, `bedrock`, `openrouter`
`--image-model`	auto	Image gen model name
`--iterations` / `-n`	`3`	Max critic rounds
`--auto`	off	Loop until critic is satisfied (safety cap via `--max-iterations`)
`--max-iterations`	`30`	Safety cap for `--auto` mode
`--optimize`	off	Preprocess inputs (parallel enrichment + caption sharpening)
`--continue`	off	Continue from the latest run
`--continue-run`	—	Continue from a specific run ID
`--feedback`	—	User feedback for the critic when continuing a run
`--aspect-ratio` / `-ar`	auto	Target aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `9:16`, `16:9`, `21:9`
`--format` / `-f`	`png`	Output format: `png`, `jpeg`, `webp`
`--dry-run`	off	Validate inputs without making API calls
`--exemplar-retrieval`	off	Enable external exemplar retrieval before planning
`--seed`	—	Random seed for reproducible generation
`--verbose` / `-v`	off	Show detailed agent progress and timing
`--auto-download-data`	off	Auto-download expanded reference set (~257MB) on first run
`--venue`	—	Academic venue style: `neurips`, `icml`, `acl`, `ieee`, `custom`
`--pages`	—	Page range for PDF input (e.g., `3-5`)
`--config`	—	Path to config YAML file
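Several parameters in the table accept only a fixed set of values. A pre-flight check like the one below can catch a typo before any API call is made. This is a sketch under assumptions: the helper `invalid_options` is hypothetical, and the value sets are copied from the table, so they may lag behind the upstream CLI.

```python
# Allowed values as listed in the parameter table above.
ALLOWED = {
    "--vlm-provider": {"gemini", "anthropic", "openai", "bedrock",
                       "openrouter", "ollama", "claude_code", "litellm"},
    "--image-provider": {"google_imagen", "openai", "bedrock", "openrouter"},
    "--aspect-ratio": {"1:1", "2:3", "3:2", "3:4", "4:3", "9:16", "16:9", "21:9"},
    "--format": {"png", "jpeg", "webp"},
    "--venue": {"neurips", "icml", "acl", "ieee", "custom"},
}


def invalid_options(options):
    """Return 'flag=value' strings for values not listed in the table."""
    return [
        f"{flag}={value}"
        for flag, value in options.items()
        if flag in ALLOWED and value not in ALLOWED[flag]
    ]
```

For a check that reflects the installed version, the table's own `--dry-run` flag validates inputs without making API calls.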

Venue styles: --venue neurips applies NeurIPS-specific methodology and plot style guides from data/guidelines/. Each venue has distinct color palettes, layout conventions, and typography expectations.

PDF input: --input paper.pdf --pages 3-5 extracts text from the specified pages as source context.

Exemplar advanced flags: --exemplar-retrieval enables retrieval; see generate --help for additional config flags (--exemplar-endpoint, --exemplar-mode, --exemplar-top-k, --exemplar-timeout, --exemplar-retries).

`plot` — Statistical Plots

python -m paperbanana.cli plot --data '<data.csv>' --intent '<intent>' --optimize --verbose

Parameter	Default	Description
`--data` / `-d`	—	Path to data file (CSV or JSON) [required]
`--intent`	—	Communicative intent for the plot [required]
`--output` / `-o`	auto	Output image path
`--vlm-provider`	`gemini`	VLM provider
`--iterations` / `-n`	`3`	Refinement iterations
`--format` / `-f`	`png`	Output format
`--aspect-ratio` / `-ar`	auto	Target aspect ratio
`--optimize`	off	Enrich context and sharpen caption
`--auto`	off	Loop until critic satisfied
`--verbose` / `-v`	off	Detailed progress
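Since `plot` requires both a CSV or JSON data file and an intent, the arguments can be checked before the call. A minimal sketch, assuming the file type is identified by its extension (the table only says "CSV or JSON"); `plot_args` is a hypothetical helper.

```python
from pathlib import Path


def plot_args(data_path, intent):
    """Build the argument list for `plot`, rejecting unsupported data files."""
    suffix = Path(data_path).suffix.lower()
    if suffix not in {".csv", ".json"}:
        raise ValueError(f"expected a .csv or .json data file, got {data_path!r}")
    if not intent.strip():
        raise ValueError("--intent is required")
    return ["plot", "--data", str(data_path), "--intent", intent]
```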

`slide` — Presentation Slides

python -m paperbanana.cli slide --input '<prompt.md>' --resolution 4k

Parameter	Default	Description
`--input` / `-i`	—	Path to slide prompt markdown file [required]
`--caption` / `-c`	auto	Slide intent description
`--output` / `-o`	auto	Output image path
`--image-model`	auto	Image gen model
`--vlm-model`	auto	VLM model name
`--iterations` / `-n`	`3`	Max critic rounds
`--style` / `-s`	—	Style preset name (see table below)
`--list-styles`	off	List all available style presets and exit
`--resolution` / `-r`	`4k`	Output resolution: `1k`, `2k`, `4k`
`--config`	—	Path to config YAML file

`slide-batch` — Batch Slide Generation

python -m paperbanana.cli slide-batch --prompts-dir '<dir>' --resolution 4k

Parameter	Default	Description
`--prompts-dir`	—	Directory containing slide prompt markdown files [required]
`--output-dir`	auto	Output directory
`--image-model`	auto	Image gen model
`--style` / `-s`	—	Style preset applied to all slides
`--iterations` / `-n`	`3`	Max critic rounds per slide
`--resolution` / `-r`	`4k`	Output resolution
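Before a batch run it can help to confirm that the prompts directory actually contains prompt files. The sketch below assumes that prompt files use the `.md` extension and that `slide-batch` accepts the same resolution values listed for `slide`; the helper `slide_batch_args` is hypothetical.

```python
from pathlib import Path

# Resolution values listed for `slide`; assumed to apply to `slide-batch` too.
RESOLUTIONS = {"1k", "2k", "4k"}


def slide_batch_args(prompts_dir, resolution="4k", style=None):
    """Return (args, prompt_files) for a `slide-batch` call."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    prompt_files = sorted(Path(prompts_dir).glob("*.md"))
    if not prompt_files:
        raise FileNotFoundError(f"no .md prompt files found in {prompts_dir}")
    args = ["slide-batch", "--prompts-dir", str(prompts_dir),
            "--resolution", resolution]
    if style:
        args += ["--style", style]
    return args, prompt_files
```

The returned file list is only for reporting how many slides to expect; the CLI itself reads the directory.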

Style Presets (23 available)

Use --style <name> with slide or slide-batch. Use --list-styles to see all.

Style	Source	Best For
`blueprint`	baoyu	Architecture, system design, technical
`chalkboard`	baoyu	Classroom, teaching, education
`corporate`	baoyu	Business, investor, quarterly reports
`minimal`	baoyu	Executive briefings, clean/simple
`sketch-notes`	baoyu	Tutorials, guides, beginner content
`watercolor`	baoyu	Lifestyle, wellness, artistic
`dark-atmospheric`	baoyu	Entertainment, gaming, cinematic
`notion`	baoyu	SaaS, product, dashboards
`bold-editorial`	baoyu	Product launches, keynotes, marketing
`editorial-infographic`	baoyu	Science communication, explainers
`fantasy-animation`	baoyu	Storytelling, magical, children
`intuition-machine`	baoyu	Academic research, bilingual
`pixel-art`	baoyu	Gaming, retro, developer culture
`scientific`	baoyu	Biology, chemistry, medical
`vector-illustration`	baoyu	Creative, children, flat design
`vintage`	baoyu	Historical, heritage, expedition
`tech-keynote`	elite-ppt	Apple/Tesla premium minimalism
`creative-bold`	elite-ppt	Google/Airbnb energetic innovation
`financial-
