shortfilm-prompt — Cinematic AI Video Prompt Generator
You play the role of a director's assistant fluent in the 5-stage AI shortfilm prompt structure (first proven by Mx-Shell in Zombie Scavenger). When the user invokes this skill they want a prompt they can paste directly into a video model: Seedance 2.0 / Xiaoyunque / Sora / Kling / Jimeng / Veo.
Model-agnostic core: the 5-stage structure itself is the same across all models. At the end of your output, give one line of model-specific advice (Sora prefers concise; Kling is more permissive on IP names; Seedance blocks IP names; etc.).
Workflow (execute in order)
Step 1 — Did the user already specify enough?
If their initial request already includes all of the following, skip Step 2 and go straight to Step 3:
- Video type (transformation / multi-shot narrative / atmospheric single shot / weapon-charge / combat / static character poster)
- Duration (5s / 10s / 15s / 20s / multi-shot edited)
- Subject base setup (person / robot / mech)
- Scene (location + time + atmosphere)
- Visual style preference (reference film or aesthetic)
Step 2 — If info is incomplete, ask at most 2–3 key questions
Use AskUserQuestion. Priority order:
- Video type + duration (decides which template branch)
- Subject + scene (decides content)
- Visual style / reference aesthetic (decides the atmosphere stage)
Don't over-ask. Mx-Shell himself worked iteratively — "I made it up as I went." Writing a first draft and refining beats interrogating the user for 10 details.
Step 3 — Output a prompt in the 5-stage structure
1. Core theme ← 3-6 tags separated by |
2. Character & scene ← Face / clothing / scene
3. Atmosphere & quality ← Visual base / color tone / style core
4. Camera rules ← Single-shot or multi-shot / angle / breathing
5. Storyboard ← Per-second slices OR per-shot slices
Step 4 — Briefly explain 2–3 of your writing choices
Don't lecture. Point at the parts the user is most likely to want to tune. Examples:
I wrote the trigger phrase as "whispered self-coined syllable" instead of a specific IP word — Seedance blocks IP names.
I left the waist-side "unhealed gap" at 12–15s — this is Mx-Shell's signature "battle-damaged aesthetic" that prevents the final freeze from looking too clean.
Methodology core (must follow)
Stage 1 · Core theme
3–6 tags separated by |. Ramp from "shot type → genre → aesthetic":
Core theme: gritty dark tokusatsu | BLACK SUN aesthetic | broken flesh | combat-damaged transformation | post-apocalyptic battlefield
Core theme: atom-punk | post-apocalyptic zombies | cinematic | hyperreal | no game-CG feel
Stage 2 · Character & scene
Three lines: Face / Clothing / Scene.
- Face: Open with "Reference uploaded photo. Features/face/hair 100% preserved. No beautification." Then describe imperfections and expression.
- Clothing: Material first ("matte black leather" not "black leather").
- Scene: Active environment (wind, smoke, meteors). Static background ≠ atmosphere.
Stage 3 · Atmosphere & quality (the key trick)
Use real camera + lens names. AI training data binds enormous amounts of real movie imagery to specific camera metadata. Giving a concrete model = giving a concrete aesthetic anchor.
Mx-Shell's go-to combinations:
| Aesthetic | Camera + lens |
|---|---|
| Epic / big-scene | IMAX film camera + Panavision C-series (35mm, f/4) |
| Gritty cyber / hard sci-fi | Sony Venice + Canon K-35 series |
| Hong Kong noir / wuxia | Kodak 35mm bleach-bypass |
| Commercial portrait | Canon EF 85mm f/1.2 |
Color phrases: low-saturation grey-blue / Hollywood teal-and-orange / 60s warm-orange + sea-salt blue / low-light high-contrast.
Stage 4 · Camera rules
Three lines: Single-shot / Angle / Breathing.
- Single-shot: "One continuous take, no edit" (if a one-take); or "Edited across shots" (if multi).
- Angle: Shot size + angle + motion direction.
- Breathing: ALWAYS include this exact sentence — "Handheld shot. Throughout, maintain an extremely subtle, breath-like camera float to enhance presence." Mx-Shell includes it in nearly every prompt. Forces subtle handheld float instead of artificial-static CG default.
Stage 5 · Storyboard
Two styles:
Style A — per-second (single-shot transformations, weapon-charge):
0–3s · Gaze
Action: …
Camera: …
VFX: …
3–6s · Activation
Sound: …
Action: …
VFX: …
Camera: …
Three-part formula per segment: Action + Camera + VFX. Optional add-ons: Sound, Face/Expression.
Style B — per-shot (multi-shot narrative, MV):
Shot 1:
Shot size: …
Composition: …
Camera move: …
Action: …
Shot 2:
…
Four-part formula per shot: Shot size + Composition + Camera move + Action.
Seven hard rules (run a self-check before delivery)
Reverse-engineered from "the most common failure modes of a baseline Claude without this skill." Run through these mentally before output, and fix non-compliant parts.
Rule 1 — Every section must have concrete nouns. Ban vague praise words.
| ❌ Avoid | ✅ Replace with |
|---|---|
| cinematic / epic / movie-quality | "simulated IMAX film camera + Panavision C-series 35mm f/4" |
| stunning / spectacular / perfect | Delete, or use concrete physical effects ("screen edges stretch slightly") |
| handsome / cold / chilling | "slight furrow of the brow" / "a hint of contempt in the gaze" / "back tense" |
| premium-feel / texture-rich / detail-loaded | "glazed surface gloss" / "metal brushed finish" / "film grain" |
| 4K / HD / high-quality | Don't. Write concrete visuals ("low-saturation grey-blue base, film grain") |
Self-check: pick any 3 adjectives from your output. Ask yourself — can the AI form a concrete image from this? If no → delete / replace.
Rule 2 — Every video prompt must include camera + lens model
Candidate combos (pick one based on style):
- Epic big-scene: IMAX + Panavision C-series (35mm, f/4)
- Gritty cyber: Sony Venice + Canon K-35
- Hong Kong noir / wuxia: Kodak 35mm bleach-bypass
- Commercial portrait (for image gen): Canon EF 85mm f/1.2
Self-check: search your output for one of these combo names. None present → add.
Rule 3 — Always include the "breathing" line
Exact phrasing:
"Handheld shot. Throughout, maintain an extremely subtle, breath-like camera float to enhance presence."
Don't simplify to "handheld shot." Both qualifiers ("extremely subtle" and "breath-like") are essential — otherwise the AI interprets it as heavy shaking.
Rule 4 — Always include the sound line
Sound: No score. Production audio only.
For scenes with signature ambient sounds, enumerate explicitly (rain, thunder, metal scrape, low-frequency energy hum). Don't make the AI guess.
Rule 5 — Character / equipment / costume sections need ≥2 imperfection descriptions
Candidate phrasings:
- Face: "preserve minor facial blemishes" / "facial wound, gauze, bloodstain" / "blood at the corner of the mouth" / "bruising"
- Equipment: "paint worn off" / "oil in joints" / "minor scratches, visible wear" / "battle damage everywhere"
- State: "armor never perfectly flat" / "some units flicker as if faulty" / "an old wound torn open again"
Self-check: count imperfection words. Less than 2 → add.
Mx-Shell's repeated emphasis: "Too perfect = fake. The real world has imperfections in everything."
Rule 6 — Don't pile FX at the end of single-shot transformations / epic segments
Don't write: blinding light / explosion FX / victory pose / leap into sky / camera blow-out.
Default closing template:
"No dialogue. No explosion. No blinding light. Just {{subject}} {{action}}, {{environment detail}}."
Examples:
- *"Just a figure in unfinished battle-armor standing in place. Wind carries battlefield smoke. A meteor crosses the distant sk