Local Image Generator

Generate images locally using Stable Diffusion. Auto-detects your hardware and picks the optimal model, device, and resolution.

Phase 0: Detect Compute Environment

Run this at the start of every invocation. It determines everything downstream.

python3 -c "
import platform, shutil, subprocess, json

info = {'os': platform.system(), 'arch': platform.machine(), 'ram_gb': 0, 'gpu': 'none', 'vram_gb': 0, 'device': 'cpu', 'dtype': 'float32'}

# RAM
try:
    if platform.system() == 'Darwin':
        import os; info['ram_gb'] = round(os.sysconf('SC_PAGE_SIZE') * os.sysconf('SC_PHYS_PAGES') / (1024**3))
    elif platform.system() == 'Linux':
        with open('/proc/meminfo') as f:
            for line in f:
                if line.startswith('MemTotal'):
                    info['ram_gb'] = round(int(line.split()[1]) / (1024**2))
                    break
    else:
        import ctypes
        mem = ctypes.c_ulonglong(0)
        ctypes.windll.kernel32.GetPhysicallyInstalledMemory(ctypes.byref(mem))
        info['ram_gb'] = round(mem.value / (1024**2))
except: pass

# GPU detection
try:
    import torch
    if torch.cuda.is_available():
        info['gpu'] = torch.cuda.get_device_name(0)
        info['vram_gb'] = round(torch.cuda.get_device_properties(0).total_mem / (1024**3))
        info['device'] = 'cuda'
        info['dtype'] = 'float16'
    elif hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
        info['gpu'] = 'Apple Silicon (MPS)'
        info['vram_gb'] = info['ram_gb']  # unified memory
        info['device'] = 'mps'
        info['dtype'] = 'float16'
    elif hasattr(torch, 'hip') or 'AMD' in str(getattr(torch, '_C', '')):
        info['gpu'] = 'AMD (ROCm)'
        info['device'] = 'cuda'  # ROCm uses cuda API
        info['dtype'] = 'float16'
except ImportError:
    pass

print(json.dumps(info))
"

Parse the JSON output and store it internally as COMPUTE. Present the results to the user:

Detected hardware:

OS: {os} ({arch})

RAM: {ram_gb} GB

GPU: {gpu} ({vram_gb} GB VRAM)

Compute device: {device}

Model Selection Matrix

Based on the detected hardware, recommend a model from this table:

Condition	Recommended Model	Reason
VRAM >= 8 GB (CUDA or MPS)	`stabilityai/sdxl-turbo`	Best quality, fast with GPU
VRAM 4-7 GB (CUDA)	`stabilityai/sd-turbo`	Lighter model, fits in low VRAM
VRAM < 4 GB or CPU + RAM >= 16 GB	`stabilityai/sd-turbo` + CPU offload	Slow but works
CPU + RAM < 16 GB	`segmind/tiny-sd`	Smallest model, runs on anything

Present the recommendation and let the user choose:

AskUserQuestion: "Which image generation model should I use?"
Options:
- [Recommended model] (Recommended) — [reason based on their hardware]
- SDXL-Turbo — Best quality, needs 8+ GB VRAM, ~6 GB download
- SD-Turbo — Good quality, needs 4+ GB VRAM, ~3 GB download
- Tiny-SD — Lower quality, runs on any hardware, ~1 GB download

Store the chosen model as MODEL.

Resolution Selection

Based on model and VRAM:

VRAM	SDXL-Turbo	SD-Turbo	Tiny-SD
>= 16 GB	1200x640	1200x640	768x408
8-15 GB	1024x576	1200x640	768x408
4-7 GB	N/A	768x408	512x272
CPU	N/A	512x272	512x272

Steps Selection

Device	SDXL-Turbo	SD-Turbo	Tiny-SD
CUDA	4-6	4-6	20-30
MPS	6	6	25
CPU	6-8	6-8	30-40

Phase 1: Determine What to Generate

If the user provided a slug and prompt (argument after the skill name), parse them and skip to Phase 3.

Expected argument format: {slug} {prompt} (e.g., beginners-guide-to-rag abstract knowledge retrieval system with floating documents)

If only a slug was provided, read the blog post to generate an appropriate prompt:

# Find the blog post
cat content/blog/{SLUG}.mdx 2>/dev/null | head -50

Extract the title, description, and key themes. Generate a prompt using one of the SAI style templates below that best fits the post's topic. Each image MUST use a different style to avoid visual repetition across blog posts.

SAI Style Templates (pick ONE per image)

Each template wraps your subject description in a distinct visual style. Replace {subject} with a short, vivid description of the post's core concept as a visual metaphor.

Style	Template	Best for
Isometric	`isometric style {subject}. vibrant, beautiful, crisp, detailed, ultra detailed, intricate`	Architecture, systems, infrastructure
Low-poly	`low-poly style {subject}. low-poly game art, polygon mesh, jagged, blocky, wireframe edges, centered composition`	Tutorials, beginner guides, fundamentals
Neonpunk	`neonpunk style {subject}. cyberpunk, vaporwave, neon, vibrant, stunningly beautiful, crisp, detailed, sleek, ultramodern, magenta highlights, dark purple shadows, high contrast, cinematic`	AI/ML, cutting-edge tech, future-facing
Concept art	`concept art {subject}. digital artwork, illustrative, painterly, matte painting, highly detailed`	Opinion pieces, deep dives, strategy
Line art	`line art drawing {subject}. professional, sleek, modern, minimalist, graphic, line art, vector graphics`	Comparisons, frameworks, decision guides
3D model	`professional 3d model {subject}. octane render, highly detailed, volumetric, dramatic lighting`	Product/tool reviews, practical guides
Fantasy	`ethereal fantasy concept art of {subject}. magnificent, celestial, ethereal, painterly, epic, majestic, magical`	Vision pieces, thought leadership
Cinematic	`cinematic film still {subject}. shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous`	Case studies, real-world stories

Subject Description Guidelines

Write the {subject} as a vivid visual metaphor, not a literal description. Never include hands, fingers, faces, or human figures.

Good: "a crystalline data pipeline splitting light into rainbow streams"
Bad: "data pipeline architecture diagram"
Good: "mechanical clockwork gears meshing with glowing circuit traces"
Bad: "AI system with nodes and connections"

Vary across posts: color palette, physical metaphor (clockwork, rivers, crystals, bridges, constellations), and composition.

If no input was provided, use AskUserQuestion to ask for a slug and description.

Phase 2: Confirm with User

Present the generation plan:

I'll generate a hero image for {slug}: Model: {MODEL} on {device} Resolution: {width}x{height} ({steps} steps) Prompt: "{prompt}" Seed: {seed or "random"} Estimated time: {estimate based on device and model}

Want me to adjust anything before generating?

Time estimates:

Device	SDXL-Turbo	SD-Turbo	Tiny-SD
CUDA (RTX 3060+)	5-10s	3-8s	15-25s
MPS (M1/M2/M3/M4)	25-35s	15-25s	30-45s
CPU (16GB+ RAM)	3-8 min	2-5 min	5-10 min

Phase 3: Install Dependencies

Check and install what's needed based on platform:

# Check Python + torch
python3 -c "import torch; print(torch.__version__)" 2>&1

If torch is missing, install based on platform:

Platform	Install command
macOS (MPS)	`pip3 install torch torchvision`
Linux (CUDA)	`pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu121`
Linux (ROCm)	`pip3 install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.0`
Linux/Windows (CPU)	`pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cpu`
Windows (CUDA)	`pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu121`

Then install diffusers:

local-image-gen

How to add

Drop this on your repo README

Related skills

claude-api

skill-creator

claude-mem

oh-my-issues

Get new Desenvolvimento skills every Monday