image-gen

Image generation and editing via OpenRouter. Five models, three scripts, style presets, one JSON contract.

Scripts: ./scripts/generate.py, ./scripts/edit.py, ./scripts/review.py Presets: ./presets/*.json Output dir: ./data/

Setup

export OPENROUTER_API_KEY_IMAGES='your-api-key-here'

Claude Code: copy this skill folder into .claude/skills/image-gen/
Codex CLI: append this SKILL.md content to your project's root AGENTS.md

For the full installation walkthrough (prerequisites, API keys, verification, troubleshooting), see references/installation-guide.md.

Credential management

Three tiers for managing the OPENROUTER_API_KEY_IMAGES environment variable:

Vault skill (recommended): If you have a vault or secret-management skill, store the key there and export it before running scripts. Example: export OPENROUTER_API_KEY_IMAGES=$(vault get OPENROUTER_API_KEY_IMAGES)
Custom secret manager: Use your team's preferred secret manager (1Password CLI, AWS Secrets Manager, etc.)
Plain export: export OPENROUTER_API_KEY_IMAGES='your-api-key-here' in your shell profile

Optional keys for additional features:

OPENAI_API_KEY -- for mask-based inpainting via edit.py --mode openai
ANTHROPIC_API_KEY -- for auto-review via review.py --auto

Model Selection

What do you need?
  |
  +-- Fast + cheap + good enough?
  |     --> nanobanana (~$0.0004/image)
  |
  +-- High quality, no text?
  |     --> flux.2-pro (best visual quality)
  |
  +-- Text in the image?
  |     --> gpt-5-image (best text rendering)
  |
  +-- Image editing?
  |     +-- Describe changes in words --> gpt-5-image or nanobanana-pro
  |     +-- Paint mask area to change --> edit.py --mode openai
  |
  +-- Budget generation at scale?
  |     --> flux.2-klein (fastest, cheapest Flux)
  |
  +-- Quality + editing + reasoning?
        --> nanobanana-pro (best balance)

Alias	Type	Cost	Best For
`flux.2-pro`	Image-only	~$0.03/MP	Default high-quality generation
`flux.2-klein`	Image-only	~$0.014/MP	Fast, budget generation
`gpt-5-image`	Text+Image	~$0.04/image	Text rendering, complex edits
`nanobanana-pro`	Text+Image	~$0.012/image	Balanced quality + editing
`nanobanana`	Text+Image	~$0.0004/image	Lowest-cost generation

Full comparison: ./references/model-card.md

Quick Reference

# Generate with default model (flux.2-pro)
python ./scripts/generate.py \
  --prompt "A red fox in snow" \
  --output-dir ./data/

# Generate with style preset
python ./scripts/generate.py \
  --prompt "A scene description for consistent series" \
  --preset default \
  --output-dir ./data/

# Generate with style reference image
python ./scripts/generate.py \
  --prompt "A new scene" \
  --style-ref /path/to/golden-image.png \
  --output-dir ./data/

# Generate with multiple style refs
python ./scripts/generate.py \
  --prompt "Scene desc" \
  --style-ref /path/to/ref1.png \
  --style-ref /path/to/ref2.png

# Generate with system message (GPT-5 / NanoBanana only)
python ./scripts/generate.py \
  --prompt "Scene desc" \
  --model gpt-5-image \
  --system-prompt "You generate muted watercolor illustrations..."

# Generate with prompt upsampling disabled
python ./scripts/generate.py \
  --prompt "Exact scene" \
  --model flux.2-pro \
  --no-prompt-upsampling

# Generate with options (model, aspect ratio, size)
python ./scripts/generate.py \
  --prompt "Tokyo skyline at sunset" \
  --model nanobanana-pro \
  --aspect-ratio 16:9 \
  --size 2K \
  --output-dir ./data/

# Generate with text (GPT-5 Image)
python ./scripts/generate.py \
  --prompt 'Poster with text "HELLO WORLD" in bold sans-serif typography' \
  --model gpt-5-image \
  --output-dir ./data/

# Edit image (chat-based)
python ./scripts/edit.py \
  --mode openrouter \
  --input-image ./data/input.png \
  --prompt "Change the background to a sunset beach" \
  --model gpt-5-image \
  --output-dir ./data/

# Edit image (mask-based)
python ./scripts/edit.py \
  --mode openai \
  --input-image ./data/input.png \
  --mask ./data/mask.png \
  --prompt "Replace masked area with a small bonsai tree" \
  --openai-size 1024x1024 \
  --output-dir ./data/

# Review quality (auto mode)
python ./scripts/review.py \
  --image ./data/output.png \
  --original-prompt "A red fox in snow" \
  --auto

Style Presets

Presets encode visual identity into reusable JSON files. A preset defines palette, composition, rendering style, model defaults, and system messages.

How Presets Work

Pick a preset based on the project context
generate.py --preset <name> loads the preset JSON
The script applies the preset: enhances the prompt with style data, selects model defaults, injects system messages
For Flux models: prompt is constructed as JSON (structured prompt, prevents concept bleeding)
For GPT-5/NanoBanana: style block is prepended as natural language, system message is injected

Available Presets

Preset	File	Description
`default`	`presets/default.json`	No style constraints. Quality-focused defaults.

Preset Schema

{
  "name": "preset-name",
  "description": "What this preset is for",
  "defaults": {
    "model": "flux.2-pro",
    "aspect_ratio": "3:2",
    "size": "2K"
  },
  "style": {
    "description": "Overall style description",
    "color_palette": ["#hex1", "#hex2", "#hex3"],
    "mood": "Emotional tone",
    "lighting": "Lighting description",
    "composition": "Composition rules",
    "rendering": "Rendering constraints",
    "camera": {"angle": "...", "framing": "..."},
    "anti_patterns": ["thing to avoid", "another thing"],
    "reference_images": ["/absolute/path/to/golden.png"]
  },
  "system_message": "System prompt for GPT-5/NanoBanana models"
}

Priority Order (CLI > Preset > Hardcoded)

--model flag overrides preset.defaults.model
--aspect-ratio flag overrides preset.defaults.aspect_ratio
--size flag overrides preset.defaults.size
--system-prompt flag overrides preset.system_message
If no preset and no flag: hardcoded defaults (flux.2-pro, 1:1, 2K)

Creating a New Preset

Copy presets/default.json as a template
Set name and description
Fill defaults with preferred model, aspect ratio, size
Fill style with palette (HEX values), composition rules, rendering constraints
Write system_message for GPT-5/NanoBanana (ignored by Flux)
Optionally add reference_images paths for visual anchoring
Test: python generate.py --prompt "test scene" --preset your-preset

Style References

Use --style-ref to pass reference images for visual anchoring. The script prepends a style transfer instruction automatically.

# Single reference (anchor to a "golden" image)
python ./scripts/generate.py --prompt "New scene" \
  --style-ref /path/to/golden.png

# Multiple references (combine style + content refs, up to 8)
python ./scripts/generate.py --prompt "New scene" \
  --style-ref /path/to/style-ref.png \
  --style-ref /path/to/character-ref.png

Reference images from presets (style.reference_images) are automatically loaded alongside CLI refs.

Full workflow and per-model consistency techniques: ./references/style-consistency.md

System Messages

Use --system-prompt to set persistent style context for GPT-5 Image and NanoBanana models. System messages are injected as the system role, keeping the user prompt focused on scene content only.

python ./scripts/generate.py \
  --prompt "A quiet village at dawn" \
  --model gpt-5-image \
  --system-prompt "You generate muted watercolor illustrations with earth-tone palettes..."

System messages can also be set in presets via the system_message field. CLI --system-prompt overrides preset system m

image-gen

Cómo agregar

Pega en el README de tu repo

Skills relacionadas

webapp-testing

brand-guidelines

frontend-design

web-artifacts-builder

Recibe nuevas skills de Design e Frontend todos los lunes

image-gen

Setup

Credential management

Model Selection

Quick Reference

Style Presets

How Presets Work

Available Presets

Preset Schema

Priority Order (CLI > Preset > Hardcoded)

Creating a New Preset

Style References

System Messages

Comentarios · Sin comentarios