Thumbnail Creator Skill (via Gemini)
Generates article and newsletter thumbnail candidates by acting as an image-generation agent inside Claude Code. Instead of switching between tools and prompting Gemini's web UI one image at a time, this skill makes Claude do the full loop: read the copy, propose compositions, write tailored prompts, call the Gemini API, evaluate the outputs, and return ranked results with brief rationale.
The output is a set of production-ready thumbnail candidates you can drop directly into your CMS, newsletter tool, or social scheduler.
Prerequisites
Both of these must be in place before the skill can generate images:
1. Gemini API Key
Get a free key from Google AI Studio.
Set it as an environment variable:
export GEMINI_API_KEY="your-key-here"
To persist it across sessions, add it to your shell profile (~/.zshrc or ~/.bashrc; the example below uses ~/.zshrc):
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.zshrc
source ~/.zshrc
Verify it is set:
echo $GEMINI_API_KEY
2. generate_image.py Script
This script must exist at ./generate_image.py in the project root. The full template is provided in the Script Template section below. Claude will check for it and offer to create it if missing.
Python dependencies:
pip install google-generativeai Pillow requests
Or with uv:
uv pip install google-generativeai Pillow requests
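The prerequisite checks above can be sketched as a single preflight function. This is a minimal sketch, not part of the skill itself: the names `preflight`, `module_available`, and `REQUIRED_MODULES` are hypothetical, and the import names assume the usual modules installed by the three packages listed above.

```python
import importlib.util
import os
from pathlib import Path

# Assumed mapping from pip package name to importable module name.
REQUIRED_MODULES = {
    "google-generativeai": "google.generativeai",
    "Pillow": "PIL",
    "requests": "requests",
}


def module_available(module_name: str) -> bool:
    """Return True if the module can be located without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `google`) is itself missing.
        return False


def preflight(env=None, script_path="./generate_image.py"):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    if not Path(script_path).is_file():
        problems.append(f"{script_path} not found")
    for pip_name, module_name in REQUIRED_MODULES.items():
        if not module_available(module_name):
            problems.append(f"missing Python package: {pip_name}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("-", problem)
```

Returning a list instead of raising on the first failure lets all missing prerequisites be reported in one pass.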
Required Inputs
Claude will ask for these if not provided:
| Input | Required | Notes |
|---|---|---|
| Article copy or URL | Yes | Paste the full article text, or provide a URL to fetch. Used to extract themes, hooks, and key claims for composition. |
| Brand colours | Recommended | Hex codes or descriptive names. E.g. #1A1A2E (navy), #E94560 (coral). If not provided, Claude uses clean neutral defaults. |
| Fonts / type style | Recommended | E.g. "bold sans-serif", "editorial serif", "Neue Haas Grotesk". Used in prompt to guide text treatment. |
| Style reference description | Recommended | E.g. "flat illustration, minimal, like Stripe's marketing site" or "photorealistic, dark background, high contrast". A style image URL can also be provided. |
| Output dimensions | No | Defaults to 1792x1024 (landscape, standard article thumbnail). Options: 1024x1024 (square), 1024x1792 (portrait/mobile). |
| Number of candidates | No | Defaults to 4. Min 1, max 8 (API limits and cost). |
| Article title (if different from H1) | No | Used as the primary text element in image prompts. |
| Candidate selection | No | After proposing compositions, Claude asks which to generate. User can say "all" or pick by number. |
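The defaults and limits in the table can be expressed as a small validated record. This is only an illustrative sketch: `ThumbnailRequest` and `ALLOWED_DIMENSIONS` are hypothetical names, and only the values stated in the table (default dimensions, the three size options, 4 candidates by default, 1 to 8 allowed) come from this document.

```python
from dataclasses import dataclass, field

# The three output sizes listed in the table.
ALLOWED_DIMENSIONS = {"1792x1024", "1024x1024", "1024x1792"}


@dataclass
class ThumbnailRequest:
    article: str                      # pasted copy or a URL (required)
    dimensions: str = "1792x1024"     # landscape default
    candidates: int = 4               # default number of candidates
    brand_colours: list = field(default_factory=list)
    font_style: str = ""
    style_reference: str = ""
    title: str = ""

    def __post_init__(self):
        if not self.article.strip():
            raise ValueError("article copy or URL is required")
        if self.dimensions not in ALLOWED_DIMENSIONS:
            raise ValueError(f"unsupported dimensions: {self.dimensions}")
        if not 1 <= self.candidates <= 8:
            raise ValueError("candidates must be between 1 and 8")
```

For example, `ThumbnailRequest(article="https://example.com/post")` picks up the landscape default and four candidates, while `candidates=9` is rejected.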
Output Structure
Phase 1 — Composition Proposals (text, before any API calls)
Claude presents 3-4 composition concepts for user approval. Format:
Composition Concepts for: "[Article Title]"
1. BOLD CLAIM
Layout: Full-bleed dark background, large white headline centred,
single accent data point (e.g. "3x faster") in brand colour below
Mood: High authority, newsletter-style
Best for: LinkedIn, Substack headers
Rationale: The article's central claim ("X outperforms Y by 3x") is specific
enough to anchor the visual — readers stop on data.
2. CONCEPTUAL OBJECT
Layout: Central object illustration (e.g. a broken clock for a time-waste article),
title in upper third, minimal texture background
Mood: Editorial, Medium-style
Best for: Blog header, Medium cover, email preheader
Rationale: Gives art directors visual metaphor flexibility; works across sizes.
3. CONTRAST SPLIT
Layout: Left half brand colour, right half white or image,
title on colour side, supporting subtext on white side
Mood: Clean, professional, startup-brand feel
Best for: Newsletter, LinkedIn carousel first slide
Rationale: Split layout performs consistently in newsletter A/B tests;
text is readable at small sizes.
4. TYPOGRAPHIC ONLY
Layout: No illustration, oversized title treatment,
author name in small caps at bottom, thin rule separator
Mood: Premium, confident, editorial
Best for: Substack, Ghost, high-density email lists
Rationale: Works when the brand has strong type identity. Fastest to produce.
Which compositions do you want generated? (Reply with numbers, e.g. "1, 3" or "all")
Phase 2 — Generated Image Files
After generation, Claude saves files to ./thumbnails/[article-slug]/:
thumbnails/
└── article-slug-from-title/
    ├── candidate_01_bold_claim.png
    ├── candidate_02_conceptual_object.png
    ├── candidate_03_contrast_split.png
    ├── candidate_04_typographic.png
    └── evaluation_report.md
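One way to derive these paths from an article title is sketched below. The helper names `slugify` and `candidate_path` are hypothetical, and the exact slug rules (lowercase, runs of non-alphanumerics collapsed to a single separator) are an assumption; the document only shows the resulting layout.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def candidate_path(title: str, index: int, composition: str,
                   root: str = "thumbnails") -> Path:
    """Build thumbnails/<slug>/candidate_NN_<composition>.png."""
    label = re.sub(r"[^a-z0-9]+", "_", composition.lower()).strip("_")
    return Path(root) / slugify(title) / f"candidate_{index:02d}_{label}.png"


if __name__ == "__main__":
    print(candidate_path("Why X Beats Y", 1, "Bold Claim").as_posix())
```

Note that the label is taken verbatim from the composition name passed in, so a shortened label such as `typographic` would need to be supplied by the caller.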
Phase 3 — Evaluation Summary Table
Claude evaluates each returned image via computer vision and produces:
Thumbnail Evaluation — "[Article Title]"
Generated: 2026-05-27 | Model: Gemini Imagen | Dimensions: 1792x1024
| # | Candidate | Composition | Brand Fit /10 | Text Legibility /10 | Recommendation |
|---|---|---|---|---|---|
| 1 | candidate_01_bold_claim.png | Bold Claim | 9 | 8 | ★ Top pick — strong data anchor, brand colours correct, title readable at 200px width |
| 2 | candidate_02_conceptual_object.png | Conceptual Object | 7 | 9 | Good fallback — legible, clean, but illustration style drifted slightly from brand |
| 3 | candidate_03_contrast_split.png | Contrast Split | 8 | 7 | Works well at full size; test at thumbnail size before publishing — right side text tightens |
| 4 | candidate_04_typographic.png | Typographic | 9 | 10 | Strongest for email — zero brand drift risk, completely text-based |
Recommended for web: candidate_01_bold_claim.png
Recommended for email/mobile: candidate_04_typographic.png
Recommended for social: candidate_03_contrast_split.png
Files saved to: ./thumbnails/article-slug-from-title/
How Claude Should Execute This Skill
Step 1 — Ingest and analyse the article
Accept article copy as pasted text or a URL.
If a URL is provided, fetch the page and extract:
- The H1 title
- The first 3-5 paragraphs (the hook, central claim, and key points)
- Any notable statistics or named frameworks mentioned
- The author name (for typographic compositions)
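The title and opening-paragraph extraction can be sketched with the standard library's `html.parser`. This is a simplified illustration under stated assumptions: it expects the page HTML to have been fetched already, it assumes well-formed `<h1>` and `<p>` tags with closing tags, and it does not attempt the statistics or author-name extraction. `ArticleExtractor` and `extract_article` are hypothetical names.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect the first <h1> and the first few non-empty <p> elements."""

    def __init__(self, max_paragraphs: int = 5):
        super().__init__()
        self.max_paragraphs = max_paragraphs
        self.title = None
        self.paragraphs = []
        self._tag = None      # the h1/p element currently being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p") and self._tag is None:
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        # Join the text fragments and normalise whitespace.
        text = " ".join("".join(self._buffer).split())
        if tag == "h1" and self.title is None:
            self.title = text
        elif tag == "p" and text and len(self.paragraphs) < self.max_paragraphs:
            self.paragraphs.append(text)
        self._tag = None


def extract_article(html: str, max_paragraphs: int = 5) -> dict:
    parser = ArticleExtractor(max_paragraphs)
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "paragraphs": parser.paragraphs}
```

Real pages often wrap navigation and footers in `<p>` tags too, so a production version would likely need to scope the search to the main article element.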
If text is pasted, read it directly. Focus on:
- The hook: What is the opening claim or tension?
- The central thesis: What is the one thing the article argues or teaches?
- Key specifics: Any numbers, named frameworks, or concrete examples that could anchor a visual
- Tone: Is this formal/authoritative, conversational/accessible, provocative/challenge-based?
Summarise these findings internally before proposing compositions — the proposals should feel tailored to this specific article, not generic.
Step 2 — Collect brand specs
Ask the user for brand specs if not provided:
To generate on-brand thumbnails, I need a few details:
1. Brand colours (hex codes or descriptions) — e.g. #1A1A2E, #E94560
2. Font style preference — e.g. "bold sans-serif", "editorial serif", "geometric"
3. Visual style — e.g. "flat minimal", "photorealistic", "illustrated", "typographic only"
4. Any style references — describe a brand or publication whose aesthetic you want to match,
or share an image URL
If you don't have brand specs yet, say "use clean defaults" and I'll use a professional
dark-on-white editorial style.
If the user says "use clean defaults", apply:
- Background: #FFFFFF or #0F0F0F (dark mode default)
- Accent: #2563EB (blue)
- Font style: bold geometric sans-serif
- Style: minimal flat, no textures, high contrast
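The fallback behaviour can be sketched as a defaults table merged with whatever the user supplied. `CLEAN_DEFAULTS` and `resolve_brand_specs` are hypothetical names, and the `dark_mode` flag is an assumption about how the two background values are chosen; only the values themselves come from the list above.

```python
CLEAN_DEFAULTS = {
    "background": "#FFFFFF",
    "accent": "#2563EB",
    "font_style": "bold geometric sans-serif",
    "style": "minimal flat, no textures, high contrast",
}

# Assumed: the second background value applies when a dark variant is wanted.
DARK_BACKGROUND = "#0F0F0F"


def resolve_brand_specs(user_specs=None, dark_mode=False):
    """Start from the clean defaults, then overlay any non-empty user values."""
    specs = dict(CLEAN_DEFAULTS)
    if dark_mode:
        specs["background"] = DARK_BACKGROUND
    for key, value in (user_specs or {}).items():
        if value:  # ignore empty strings and None so defaults survive
            specs[key] = value
    return specs
```

For instance, `resolve_brand_specs({"accent": "#E94560"})` keeps the default background and font style but swaps in the user's accent colour.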
Step 3 — Propose composition concepts
Write 3-4 composition concepts tailored to the article's tone and content. Each concept must:
- Have a name (short, memorable label)
- Describe the layout precisely (where title goes, what visual element anch