/image-gen — On-Brand Image Generation

Describe what you need. Get an image that looks like your brand made it.

This skill reads your visual brand identity from brand/creative-kit.md, crafts a narrative prompt that bakes in your style constraints, and generates the image via the Gemini API. No brand style defined yet? It still works, just at a lower enhancement level. Run /visual-style first for the best results.

On Activation

1. Check the GEMINI_API_KEY environment variable.
   - If missing: "Image generation requires a Gemini API key. Set GEMINI_API_KEY in your environment. Get one at ai.google.dev."
   - Do not proceed without it.
2. Read brand files in priority order:
   - brand/creative-kit.md: look for the ## Visual Brand Style section
   - brand/voice-profile.md: personality informs image tone
   - brand/positioning.md: angles inform visual metaphors
   - brand/landscape.md: Claims Blacklist (don't generate imagery that visually implies blacklisted claims)
3. Determine enhancement level:

| Level | Context available | Image quality |
|---|---|---|
| L0 | No brand files | Good generic images: uses discovery questions + prompt craft |
| L1 | voice-profile.md | Personality-aligned: a playful brand gets warm, bright images |
| L2 | + creative-kit.md colors/typography | Color-constrained: palette woven into prompts |
| L3 | + Visual Brand Style section | Fully on-brand: style anchors, lighting, mood, composition all applied |
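
The activation checks can be sketched in a few lines of Python. This is a minimal sketch, not the skill's actual implementation: the function names are hypothetical, the style section is detected by its literal heading, and a creative-kit.md found without a voice profile is treated as L2, which is an assumption the table does not spell out.

```python
import os
from pathlib import Path

MISSING_KEY_MESSAGE = (
    "Image generation requires a Gemini API key. "
    "Set GEMINI_API_KEY in your environment. Get one at ai.google.dev."
)


def check_api_key(env=None):
    """Return None if the key is set, else the message to show the user."""
    env = os.environ if env is None else env
    return None if env.get("GEMINI_API_KEY") else MISSING_KEY_MESSAGE


def enhancement_level(brand_dir):
    """Map the brand files found under brand_dir to a level from 0 to 3."""
    brand = Path(brand_dir)
    kit = brand / "creative-kit.md"
    has_kit = kit.is_file()
    kit_text = kit.read_text(encoding="utf-8") if has_kit else ""
    # Assumption: the section is identified by its literal heading text.
    if "## Visual Brand Style" in kit_text:
        return 3
    # Assumption: a kit without the style section counts as L2.
    if has_kit:
        return 2
    if (brand / "voice-profile.md").is_file():
        return 1
    return 0
```

Passing a plain dictionary as `env` keeps the key check testable without touching the real environment.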

Phase 1: Discovery

Use AskUserQuestion. Ask one at a time. Skip questions the user already answered in their request.

Question 1: Purpose "What's this image for?"

- Blog header / article illustration
- Social media post (which platform?)
- Product shot / marketing asset
- Hero image / landing page
- Presentation / slide deck
- Something else (describe it)

This determines aspect ratio:

| Use case | Default ratio | Resolution |
|---|---|---|
| Blog header | 16:9 | 2K |
| Social square | 1:1 | 2K |
| Social story | 9:16 | 2K |
| Hero / banner | 21:9 | 2K |
| Product shot | 4:3 | 4K |
| Thumbnail | 16:9 | 1K |
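
The table above works as a simple lookup. The sketch below copies its values into a dictionary. The key names and the `format_for` helper are hypothetical, and letting an explicit user choice win over the default is an assumption.

```python
# Defaults copied from the table above; the key names are hypothetical.
FORMAT_DEFAULTS = {
    "blog_header": ("16:9", "2K"),
    "social_square": ("1:1", "2K"),
    "social_story": ("9:16", "2K"),
    "hero_banner": ("21:9", "2K"),
    "product_shot": ("4:3", "4K"),
    "thumbnail": ("16:9", "1K"),
}


def format_for(use_case, ratio=None, resolution=None):
    """Return (aspect_ratio, resolution); explicit choices win over defaults."""
    # An unknown use case raises KeyError rather than guessing a format.
    default_ratio, default_resolution = FORMAT_DEFAULTS[use_case]
    return ratio or default_ratio, resolution or default_resolution
```

For example, `format_for("product_shot")` yields the 4:3 ratio at 4K, while `format_for("blog_header", ratio="1:1")` keeps the 2K default but honors the requested ratio.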

Question 2: Feeling "What should someone feel when they see this?" (Free text — this drives the prompt's emotional anchor)

Question 3: Style override (only if L3 brand style exists) "Use your on-brand style, or something different?"

- On-brand (default): applies Visual Brand Style from creative-kit.md
- Different: describe the style you want instead

If user picks "different," their description overrides the brand style for this image only.

Skip discovery when the user's request is already specific enough. Given "Generate a 16:9 blog header showing a glowing terminal on a dark background, warm rim lighting," don't ask what they want; they just told you.

Phase 2: Prompt Crafting (Nano Banana Prompt Engineering)

You are an expert Nano Banana prompt engineer. Your job is to turn the user's brief into a single, high-quality prompt for Nano Banana 2 (or Pro), a "thinking" image model used for professional asset production.

Core principle: brief a senior art director, don't list keywords. Write natural language in full sentences. Never use "tag soup" like "dog, park, 4k, realistic."

The 10 Rules

1. General style. Be specific and descriptive about subject, setting, composition, camera/viewpoint, lighting, mood, materials, and textures. Full sentences, not comma lists.

2. Context and purpose. Always encode the purpose and audience (YouTube thumbnail, app icon, hero banner, tweet graphic, 4K wallpaper). Let purpose guide style, polish level, and framing.

3. Text and infographics. If text must appear, put it clearly in quotes in the prompt. Ask for legible, clean typography and specify style (bold sans-serif, monospace, handwritten). For data, ask the model to compress into infographics, diagrams, or whiteboards.

4. Character and brand consistency. When reference images exist, explicitly refer to them: "Keep the person's facial features exactly the same as Image 1." Allow changes in pose, expression, angle while preserving identity.

5. Grounding and realism. For real data, locations, or products, tell the model to rely on up-to-date factual knowledge. Encourage coherent details consistent with physics.

6. Editing and restoration. For edits to existing images, give semantic instructions: "remove," "replace," "add," "restore," "change the season." Maintain original structure, only change what's intended.

7. Dimensional and structural control. For floor plans, schematics, wireframes, grids, tell the model to follow that layout closely. For 2D↔3D, describe how the new representation should look while preserving key relationships.

8. Resolution, detail, and format. Specify resolution ("high detail suitable for 4K wallpaper," "clean 16:9 thumbnail"). Call out micro details and textures when needed (brushed steel, cracked paint, mossy stone).

9. Narrative and sequences. For multiple images, describe the story arc, emotional beats, what stays consistent across images. Specify count, format, and identity/style consistency.

10. Output rules. Do not ask follow-up questions about the prompt. Resolve small ambiguities with sensible professional defaults. Output a single flowing narrative prompt.

Prompt Structure

Build the narrative in this order, woven into flowing prose:

1. Purpose and format: what this is for, aspect ratio, resolution
2. Scene: what's happening, where, environment
3. Subject: detailed description with textures, materials, poses
4. Composition: framing, focal point, depth of field, negative space
5. Lighting: source, quality, color temperature, interaction with materials
6. Mood: emotional tone, atmosphere
7. Text: any text in quotes with typography specification
8. Technical: camera/lens for photorealistic, style reference for illustrated
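
The ordering above can be enforced mechanically, even though the wording of each element remains a writing task. The sketch below only fixes the order and skips empty elements. The `PromptBrief` class and `build_prompt` function are hypothetical names, and treating Text and Technical as optional is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptBrief:
    """One or more sentences per element, declared in the order listed above."""
    purpose: str
    scene: str
    subject: str
    composition: str
    lighting: str
    mood: str
    text: str = ""       # assumed optional: only when text appears in the image
    technical: str = ""  # assumed optional: camera/lens or style reference


def build_prompt(brief):
    """Join the non-empty elements, in declaration order, into one paragraph."""
    sentences = []
    for field in fields(brief):
        value = getattr(brief, field.name).strip()
        if not value:
            continue
        # Close the element with a period unless it already ends a sentence.
        if not value.endswith((".", "!", "?", '"')):
            value += "."
        sentences.append(value)
    return " ".join(sentences)
```

Simple concatenation like this produces a correctly ordered draft. Smoothing the transitions so it reads as flowing prose is still left to the prompt writer.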

Brand Constraints (L3)

When Visual Brand Style exists, weave constraints INTO the narrative — don't add as a separate block:

- Primary Aesthetic → sets overall style direction
- Lighting → overrides generic lighting with brand-specific lighting
- Backgrounds → constrains background treatment
- Composition → constrains layout/framing
- Mood → anchors emotional tone
- Avoid → explicit exclusions baked into prompt
- Reference Prompts → use as structural templates, adapting subject matter

See references/prompt-patterns.md for proven patterns. See references/visual-metaphors.md for concept-to-metaphor mapping.

Phase 3: Generate

Gemini API Call

import os
from google import genai
from google.genai import types

# Read the key checked during activation; raises KeyError if it is unset.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

response = client.models.generate_content(
    model="gemini-3.1-flash-image-preview",
    contents=["<narrative prompt>"],  # the single prompt from Phase 2
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="<ratio>",  # e.g. "16:9", from the Phase 1 table
            image_size="<resolution>",  # e.g. "2K"
        ),
    ),
)

# The response can mix text and image parts: print the text, save the image.
for part in response.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = part.as_image()
        image.save("output.png")

Default to gemini-3.1-flash-image-preview (Nano Banana 2), launched Feb 26, 2026. It offers Pro quality at Flash speed and pricing, supports 4K output, and accepts up to 14 reference images for style consistency. Use gemini-3-pro-image-preview (Nano Banana Pro) only when text-heavy infographics or premium quality justify the higher cost.
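
The model choice reduces to a two-way switch. In the sketch below, the model identifiers come from the paragraph above. The `choose_model` helper and its two boolean flags are hypothetical, and how "text-heavy" or "premium" gets decided is left to the caller.

```python
DEFAULT_MODEL = "gemini-3.1-flash-image-preview"  # Nano Banana 2
PRO_MODEL = "gemini-3-pro-image-preview"          # Nano Banana Pro


def choose_model(text_heavy=False, premium=False):
    """Pick the Pro model only when one of the two stated reasons applies."""
    return PRO_MODEL if (text_heavy or premium) else DEFAULT_MODEL
```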

Save and Log

Save image to project directory (e.g., images/, assets/, or wherever the project keeps visuals)

Append to brand/assets.md:

| <date> | image | <file-path> | image-gen | <1-line description of what was generated> |
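
Appending the log row might look like the sketch below. The `log_asset` helper is hypothetical, and two details are assumptions the row template does not specify: the ISO date format and escaping pipe characters in the description.

```python
from datetime import date
from pathlib import Path


def log_asset(assets_file, image_path, description, today=None):
    """Append one table row for a generated image and return that row."""
    # Assumption: dates are written in ISO format (YYYY-MM-DD).
    day = (today or date.today()).isoformat()
    # Escape pipes so the description cannot break the table row.
    safe = description.strip().replace("|", "\\|")
    row = f"| {day} | image | {image_path} | image-gen | {safe} |\n"
    path = Path(assets_file)
    # Create the brand/ directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(row)
    return row
```

Opening the file in append mode leaves existing rows untouched, so the log grows by one line per generated image.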

Phase 4: Iterate

After generatin
