image-gen
Image generation and editing via OpenRouter. Five models, three scripts, style presets, one JSON contract.
Scripts: ./scripts/generate.py, ./scripts/edit.py, ./scripts/review.py
Presets: ./presets/*.json
Output dir: ./data/
Setup
export OPENROUTER_API_KEY_IMAGES='your-api-key-here'
- Claude Code: copy this skill folder into
.claude/skills/image-gen/ - Codex CLI: append this SKILL.md content to your project's root
AGENTS.md
For the full installation walkthrough (prerequisites, API keys, verification, troubleshooting), see references/installation-guide.md.
Credential management
Three tiers for managing the OPENROUTER_API_KEY_IMAGES environment variable:
- Vault skill (recommended): If you have a vault or secret-management skill, store the key there and export it before running scripts. Example:
export OPENROUTER_API_KEY_IMAGES=$(vault get OPENROUTER_API_KEY_IMAGES) - Custom secret manager: Use your team's preferred secret manager (1Password CLI, AWS Secrets Manager, etc.)
- Plain export:
export OPENROUTER_API_KEY_IMAGES='your-api-key-here'in your shell profile
Optional keys for additional features:
OPENAI_API_KEY-- for mask-based inpainting viaedit.py --mode openaiANTHROPIC_API_KEY-- for auto-review viareview.py --auto
Model Selection
What do you need?
|
+-- Fast + cheap + good enough?
| --> nanobanana (~$0.0004/image)
|
+-- High quality, no text?
| --> flux.2-pro (best visual quality)
|
+-- Text in the image?
| --> gpt-5-image (best text rendering)
|
+-- Image editing?
| +-- Describe changes in words --> gpt-5-image or nanobanana-pro
| +-- Paint mask area to change --> edit.py --mode openai
|
+-- Budget generation at scale?
| --> flux.2-klein (fastest, cheapest Flux)
|
+-- Quality + editing + reasoning?
--> nanobanana-pro (best balance)
| Alias | Type | Cost | Best For |
|---|---|---|---|
flux.2-pro | Image-only | ~$0.03/MP | Default high-quality generation |
flux.2-klein | Image-only | ~$0.014/MP | Fast, budget generation |
gpt-5-image | Text+Image | ~$0.04/image | Text rendering, complex edits |
nanobanana-pro | Text+Image | ~$0.012/image | Balanced quality + editing |
nanobanana | Text+Image | ~$0.0004/image | Lowest-cost generation |
Full comparison: ./references/model-card.md
Quick Reference
# Generate with default model (flux.2-pro)
python ./scripts/generate.py \
--prompt "A red fox in snow" \
--output-dir ./data/
# Generate with style preset
python ./scripts/generate.py \
--prompt "A scene description for consistent series" \
--preset default \
--output-dir ./data/
# Generate with style reference image
python ./scripts/generate.py \
--prompt "A new scene" \
--style-ref /path/to/golden-image.png \
--output-dir ./data/
# Generate with multiple style refs
python ./scripts/generate.py \
--prompt "Scene desc" \
--style-ref /path/to/ref1.png \
--style-ref /path/to/ref2.png
# Generate with system message (GPT-5 / NanoBanana only)
python ./scripts/generate.py \
--prompt "Scene desc" \
--model gpt-5-image \
--system-prompt "You generate muted watercolor illustrations..."
# Generate with prompt upsampling disabled
python ./scripts/generate.py \
--prompt "Exact scene" \
--model flux.2-pro \
--no-prompt-upsampling
# Generate with options (model, aspect ratio, size)
python ./scripts/generate.py \
--prompt "Tokyo skyline at sunset" \
--model nanobanana-pro \
--aspect-ratio 16:9 \
--size 2K \
--output-dir ./data/
# Generate with text (GPT-5 Image)
python ./scripts/generate.py \
--prompt 'Poster with text "HELLO WORLD" in bold sans-serif typography' \
--model gpt-5-image \
--output-dir ./data/
# Edit image (chat-based)
python ./scripts/edit.py \
--mode openrouter \
--input-image ./data/input.png \
--prompt "Change the background to a sunset beach" \
--model gpt-5-image \
--output-dir ./data/
# Edit image (mask-based)
python ./scripts/edit.py \
--mode openai \
--input-image ./data/input.png \
--mask ./data/mask.png \
--prompt "Replace masked area with a small bonsai tree" \
--openai-size 1024x1024 \
--output-dir ./data/
# Review quality (auto mode)
python ./scripts/review.py \
--image ./data/output.png \
--original-prompt "A red fox in snow" \
--auto
Style Presets
Presets encode visual identity into reusable JSON files. A preset defines palette, composition, rendering style, model defaults, and system messages.
How Presets Work
- Pick a preset based on the project context
generate.py --preset <name>loads the preset JSON- The script applies the preset: enhances the prompt with style data, selects model defaults, injects system messages
- For Flux models: prompt is constructed as JSON (structured prompt, prevents concept bleeding)
- For GPT-5/NanoBanana: style block is prepended as natural language, system message is injected
Available Presets
| Preset | File | Description |
|---|---|---|
default | presets/default.json | No style constraints. Quality-focused defaults. |
Preset Schema
{
"name": "preset-name",
"description": "What this preset is for",
"defaults": {
"model": "flux.2-pro",
"aspect_ratio": "3:2",
"size": "2K"
},
"style": {
"description": "Overall style description",
"color_palette": ["#hex1", "#hex2", "#hex3"],
"mood": "Emotional tone",
"lighting": "Lighting description",
"composition": "Composition rules",
"rendering": "Rendering constraints",
"camera": {"angle": "...", "framing": "..."},
"anti_patterns": ["thing to avoid", "another thing"],
"reference_images": ["/absolute/path/to/golden.png"]
},
"system_message": "System prompt for GPT-5/NanoBanana models"
}
Priority Order (CLI > Preset > Hardcoded)
--modelflag overridespreset.defaults.model--aspect-ratioflag overridespreset.defaults.aspect_ratio--sizeflag overridespreset.defaults.size--system-promptflag overridespreset.system_message- If no preset and no flag: hardcoded defaults (flux.2-pro, 1:1, 2K)
Creating a New Preset
- Copy
presets/default.jsonas a template - Set
nameanddescription - Fill
defaultswith preferred model, aspect ratio, size - Fill
stylewith palette (HEX values), composition rules, rendering constraints - Write
system_messagefor GPT-5/NanoBanana (ignored by Flux) - Optionally add
reference_imagespaths for visual anchoring - Test:
python generate.py --prompt "test scene" --preset your-preset
Style References
Use --style-ref to pass reference images for visual anchoring. The script prepends a style transfer instruction automatically.
# Single reference (anchor to a "golden" image)
python ./scripts/generate.py --prompt "New scene" \
--style-ref /path/to/golden.png
# Multiple references (combine style + content refs, up to 8)
python ./scripts/generate.py --prompt "New scene" \
--style-ref /path/to/style-ref.png \
--style-ref /path/to/character-ref.png
Reference images from presets (style.reference_images) are automatically loaded alongside CLI refs.
Full workflow and per-model consistency techniques: ./references/style-consistency.md
System Messages
Use --system-prompt to set persistent style context for GPT-5 Image and NanoBanana models. System messages are injected as the system role, keeping the user prompt focused on scene content only.
python ./scripts/generate.py \
--prompt "A quiet village at dawn" \
--model gpt-5-image \
--system-prompt "You generate muted watercolor illustrations with earth-tone palettes..."
System messages can also be set in presets via the system_message field. CLI --system-prompt overrides preset system m