Image Generation
Guide for selecting the right image generation tool through the Hyper MCP based on what the user is trying to make.
Requirements
This skill assumes the Hyper MCP is connected to your agent so the tools below are available. For brand-consistent ad creative work, Firecrawl must also be configured under your Hyper integrations.
Tool surface
| Group | Tools |
|---|---|
OpenAI (gpt-image-2) | openai_image_generation, openai_image_edit |
| Nano Banana (Gemini) | nano_banana_image_generation, nano_banana_image_edit, nano_banana_multi_turn |
| Seedream (ByteDance via fal.ai) | seedream_image_generation |
| Brand extraction (companion) | firecrawl_extract_branding |
Out of scope
- Full ad creative workflows (brand extraction → copy → image) — use
ad-creative-generation. - Video generation, transcription, captions, voiceover — use
video-generation.
Tool Selection Framework
Rule 0: Website / brand ad creative requests → extract branding first (Firecrawl)
If the user provides a website URL and asks for ad creatives, brand-consistent visuals, or marketing assets, always extract branding before generating images.
- Call
firecrawl_extract_brandingwith the URL. The result returns brand colors, fonts, personality / tone, and saved image files (logo, favicon, og_image). - Optionally call
firecrawl_scrape_urlwithformats=["screenshot"]to capture the landing page for visual context. - Compose your image generation prompt using the extracted colors (hex codes), font names, and personality tone. Pass the logo
file_idfrom the result as areference_image_file_id.
Critical: The result includes a file field — this is a JSON data file, NOT an image. Never pass it to image gen tools. Only logo.file_id and images.*.file_id are actual images you can use as references.
Rule 1: Explicit text-heavy assets → Nano Banana Pro (Gemini)
If the primary requirement is readable embedded text inside the image itself, such as headlines, labels, UI text, or poster copy:
nano_banana_image_generation(
model="pro",
requests=[{"id": "ad1", "prompt": "Create an ad with readable text '50% OFF'"}],
aspect_ratio="16:9",
image_size="2K"
)
Rule 2: Reference images or brand assets → OpenAI Edit by default
If the user provides images, branding assets, screenshots, or wants an initial branded composition, prefer openai_image_edit.
openai_image_edit(
requests=[{"prompt": "Compose into gift basket", "reference_images": ["file1", "file2"]}]
)
Use Nano Banana only when the user explicitly wants Nano Banana or the image is text-heavy enough that text rendering is the main concern.
Rule 3: Initial ad creative or first-pass concepts → OpenAI
For initial creative exploration, first-pass ad concepts, or early branded directions, prefer OpenAI tools:
openai_image_generation(
requests=[{"prompt": "A polished SaaS ad concept with clean composition"}],
size="1024x1024"
)
For reference-based branded ads, prefer:
openai_image_edit(
requests=[{"prompt": "Create a polished branded ad concept", "reference_images": ["file1", "file2"]}]
)
Rule 4: Nano Banana Pro for production text rendering
Use nano_banana_image_generation(model="pro") when the user explicitly wants Nano Banana or when the output needs strong text rendering inside the image. Do not default to Nano Banana for the first creative pass when OpenAI tools can handle the concept.
Available Tools
OpenAI Image Generation
Best for: Quick ideation, first-pass concepts, general creative requests without reference images.
- Model:
gpt-image-2 - Standard sizes: 1024x1024, 1024x1536, 1536x1024
- Quality:
low,medium,high - Fast results for general creative requests
OpenAI Image Edit
Best for: Initial branded compositions, website-based ad creatives, image-to-image generation with OpenAI quality.
- Editing or composing images using references
- Multi-image composition
Nano Banana Image Generation (Gemini)
Best for: Text-heavy production assets, explicit Nano Banana requests, and high-resolution refinements.
Models:
model="pro"(default): Gemini 3 Pro — best quality, supports Google Search groundingmodel="flash": Gemini 2.5 Flash — faster and cheaper
Key Capabilities:
- Text in images (headlines, labels, UI elements)
- High-resolution output (1K, 2K, 4K)
- Various aspect ratios (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)
- Image-to-image with
reference_images - Real-world grounding via
use_google_search=True
Seedream 4.5 Image Generation (ByteDance via fal.ai)
Best for: Product photography, marketing assets, text rendering in images, material / fabric preservation, scenes with accurate spatial depth.
Key Capabilities:
- Fast generation (~2-3 seconds)
- Resolutions up to 4K (
auto_2K,auto_4K, or preset sizes) - Photorealistic quality with strong material / texture detail
- Size presets:
square_hd,square,portrait_4_3,portrait_16_9,landscape_4_3,landscape_16_9,auto_2K,auto_4K
seedream_image_generation(
requests=[{"prompt": "Product shot of leather handbag on marble surface"}],
image_size="auto_2K"
)
Nano Banana Image Edit
Best for: Modifying single existing images.
- Targeted changes to uploaded images
- Same quality options as generation
Nano Banana Multi-Turn
ONLY use when the user explicitly says "use multi turn image editing".
mode="sync": Chain mode — output from turn N becomes input for turn N+1mode="parallel": Independent mode — all turns processed independently
Request Format
All tools use the requests parameter:
requests=[
{"id": "optional_id", "prompt": "Description", "reference_images": ["optional"]}
]
Common Patterns
| Pattern | Tool | Notes |
|---|---|---|
| Website-based ad creative | firecrawl_extract_branding → openai_image_edit | Extract branding first, then create the first branded concept |
| Ad with text | nano_banana_image_generation(model="pro") | Use when readable text inside the image is a core requirement |
| Quick concept | openai_image_generation | Simple, no constraints |
| Style transfer | Nano Banana + references | Image-to-image |
| Print-quality poster | Nano Banana + 4K | High resolution |
| Product photography | seedream_image_generation | Photorealistic, material detail |
| Edit existing image | nano_banana_image_edit | Single image modification |
| Batch generation | Multiple requests | Parallel generation |
Pattern H: Brand-Consistent Ad Creative
- Call
firecrawl_extract_brandingwith the website URL - Read the returned colors, fonts, personality, and logo
file_id - Write a prompt that includes the actual hex colors, font names, and tone from the result
- Pass the logo
file_idas a reference image - Generate the first pass with
openai_image_editwhen using brand references, oropenai_image_generationfor loose ideation - Use
nano_banana_image_generation(model="pro")only if the image itself needs strong readable text rendering
Important Reminders
- DO NOT display image URLs to the user — they are automatically shown in chat
- Refine vague prompts — unless the user explicitly wants verbatim generation
- Match aspect ratio to intent — social media, print, web, etc.
- Use appropriate resolution — 4K for production, 1K / 2K for iteration
- Remember
file_ids — generated images can be reused in subsequent operations - For website-based brand work — call
firecrawl_extract_brandingbefore generating ad creatives - For initial ad creatives — prefer OpenAI tools first; reserve Nano Banana Pro for explicit text-heavy assets or explicit user preference