
ai-shorts-pipeline

Design and Frontend

Daily video production for anyone too busy to record and edit but whose audience still expects to hear from them. Orchestrates Remotion + ElevenLabs + Vertex Veo + Three.js as a single Claude Code skill, with a built-in marketing HQ dashboard.

1 star
View on GitHub ↗ | Author: Khizergenfox | License: MIT

name: Daily AI Shorts
description: A Claude Code skill for anyone whose calendar is full but whose audience still expects to hear from them. Automates the production of daily 60-second vertical videos (voiceover, scene composition, B-roll, optional avatar, captions) so consistency stops being a tax. Ready to post on YouTube Shorts, Instagram Reels, LinkedIn, and X.
type: skill
license: MIT
status: active

Daily AI Shorts

For anyone whose calendar is full but whose audience still expects to hear from them.

Daily content is now a tax on every operator, founder, and builder trying to grow a name. Most days you don't have an hour to record, edit, caption, and post. This skill automates the production so you can keep showing up — even on the days you can't. The point is consistency for the people whose real job isn't creating content but who still need to.

Drop this skill into Claude Code. Give it a topic and a few sources. It hands you back a finished 9:16 video, a voiceover, and per-platform captions. No editor. No camera. Around 5–7 minutes per video on a normal laptop.

Battle-tested on @AIinBusiness — daily videos rendered by this exact pipeline.


What this skill does

When you give Claude a topic and a few links, it:

  1. Drafts a 60-second narration script
  2. Generates voiceover via ElevenLabs
  3. Aligns word-level timestamps against the audio
  4. Composes scenes via Remotion (real screenshots, native UI mockups, Three.js, Veo b-roll)
  5. Renders the final 1080×1920 MP4
  6. Writes platform-specific captions for YouTube, Instagram, LinkedIn, and X with character-limit awareness
  7. Drops everything into a local marketing dashboard for review and upload

You bring the take. The skill does the production.
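
Step 6 mentions character-limit awareness. A minimal sketch of what such a check could look like follows; the limit values and the `fitCaption` helper are assumptions for illustration, not taken from the skill, and each platform's current documentation should be checked before relying on them. X in particular weights some characters and URLs differently, which this sketch ignores.

```javascript
// Assumed per-platform limits (hypothetical table; verify against platform docs).
const CAPTION_LIMITS = {
  youtube: 100,    // Shorts title
  instagram: 2200, // Reels caption
  linkedin: 3000,  // post body
  x: 280,          // standard post
};

// Trim a caption to the platform limit at a word boundary and add an ellipsis.
function fitCaption(platform, text) {
  const limit = CAPTION_LIMITS[platform];
  if (limit === undefined) throw new Error(`Unknown platform: ${platform}`);
  const chars = Array.from(text); // count code points, not UTF-16 units
  if (chars.length <= limit) return text;
  const cut = chars.slice(0, limit - 1).join(''); // leave room for the ellipsis
  const lastSpace = cut.lastIndexOf(' ');
  const head = lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
  return head.trimEnd() + '…';
}

// Usage: one draft, fitted per platform.
const draft = 'Why the newest AI chip matters for your business. '.repeat(10);
const captions = Object.fromEntries(
  Object.keys(CAPTION_LIMITS).map((p) => [p, fitCaption(p, draft)])
);
```

Cutting at a word boundary keeps the truncated caption readable; a real implementation would more likely ask the model to rewrite the caption to fit rather than truncate it.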


The one question that drives every cut

"What is the viewer asking right now, and what visual answers it fastest?"

Every cut answers a question the script just planted in the viewer's head. Decorative cuts — shader backgrounds, generic stock footage, repeated 3D scenes — answer no question. They cost retention. The skill enforces this by mapping each beat in the spec to a specific visual job, and refuses to render a scene that doesn't have one.
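
The refusal described above can be pictured as a small pre-render check. This is a sketch under assumptions: the scene fields `visualJob`, `tech`, and `sceneName` are hypothetical names, not the skill's actual spec format.

```javascript
// Hypothetical scene shape: { id, visualJob, tech, sceneName }.
// Returns a list of problems; an empty list means every cut has a job.
function findDecorativeScenes(scenes) {
  const problems = [];
  scenes.forEach((scene, i) => {
    const job = typeof scene.visualJob === 'string' ? scene.visualJob.trim() : '';
    if (!job) {
      problems.push({ id: scene.id, reason: 'no visual job' });
    }
    const prev = scenes[i - 1];
    const repeats3d =
      prev &&
      scene.tech === 'three' &&
      prev.tech === 'three' &&
      scene.sceneName === prev.sceneName;
    if (repeats3d) {
      problems.push({ id: scene.id, reason: 'same 3D scene as previous beat' });
    }
  });
  return problems;
}

// Usage: refuse to render when anything is flagged.
const exampleScenes = [
  { id: 'hook', visualJob: 'prove the claim is sourced', tech: 'article' },
  { id: 'race-1', visualJob: 'compare X to Y', tech: 'three', sceneName: 'math_race' },
  { id: 'race-2', visualJob: 'compare X to Y', tech: 'three', sceneName: 'math_race' },
  { id: 'filler', visualJob: '', tech: 'shader' },
];
const decorative = findDecorativeScenes(exampleScenes);
if (decorative.length > 0) {
  console.error('Refusing to render:', decorative);
}
```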


The 8 editing rules baked into the pipeline

  1. Hook with a real source. First 1–3 seconds is a real news article screenshot or a dramatic claim. Never a talking head, never a title card.

  2. Literal before metaphorical. A real-object reference first (the actual chip, the actual screenshot, the actual headline). Abstract 3D illustration only after literal context is established. Earn the metaphor.

  3. Slow-fast-slow rhythm. Hook = fast cuts (1.5–2s). Setup = slower (2.5–3s). Reveal = fast (1–2s). Conclusion = slower (2.5–3s). CTA = punchy (1.5s).

  4. Cut on the question, reveal on the answer. Hold half a second of silence between question beat and answer beat. Tension creates emphasis.

  5. Zoom on every reveal. Stat reveals, shock beats, name reveals all get a zoom: { fromScale: 0.94, toScale: 1.04 }. Adds energy without adding cuts.

  6. Three-second max per static frame. No held visual longer than 3s. Either hard-cut to a new visual, or have meaningful internal motion.

  7. Captions are off by default. Pill-style captions compete with the visuals. Enable only when you want them.

  8. Sound is half the video. Synthesized SFX on every cut, varied by beat type. Music bed is optional, ducked under narration.
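
Rules 5 and 6 are the most mechanical of the eight, so they lend themselves to a short sketch. The zoom values come from rule 5; the linear easing, the `hasInternalMotion` flag, and both function names are assumptions. Inside a Remotion component the same mapping would more likely be done with Remotion's own `interpolate` helper.

```javascript
const REVEAL_ZOOM = { fromScale: 0.94, toScale: 1.04 }; // values from rule 5

// Linear zoom across the scene, clamped at both ends (easing is an assumption).
function zoomScaleAt(frame, durationInFrames, zoom = REVEAL_ZOOM) {
  const lastFrame = Math.max(1, durationInFrames - 1);
  const t = Math.min(1, Math.max(0, frame / lastFrame));
  return zoom.fromScale + (zoom.toScale - zoom.fromScale) * t;
}

// Rule 6: flag scenes held longer than 3 s that have no internal motion.
// `hasInternalMotion` is a hypothetical per-scene flag.
function findOverlongHolds(scenes, maxSeconds = 3) {
  return scenes
    .filter((s) => s.durationSeconds > maxSeconds && !s.hasInternalMotion)
    .map((s) => s.id);
}

// Usage: a 2-second reveal at 30 fps, and a quick hold check.
const midScale = zoomScaleAt(30, 60);
const overlong = findOverlongHolds([
  { id: 'stat', durationSeconds: 1.5 },
  { id: 'skyline', durationSeconds: 3.5, hasInternalMotion: true },
  { id: 'title', durationSeconds: 4 },
]);
```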


Pacing curve (applied to every script)

0–3s     Hook            Fast cuts (1.5–2s)         Big claim, real source
3–15s    Setup           Slow cuts (2.5–3s)         Build context, comparisons
15–40s   Reveal/stack    Fast cuts (1–2s)           Evidence stacking, big stats
40–55s   Conclusion      Slow cuts (2–3s)           Emotional weight, action items
55–61s   CTA             Punchy (1–1.5s)            Handle, follow line

A 60-second video at ~2.1s average per cut works out to 27–30 cuts. Fewer than that and the video feels static. More than that and it feels frantic.
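
The arithmetic behind that range can be checked directly from the pacing table. In this sketch the `PACING` array simply transcribes the table; the field and function names are illustrative.

```javascript
// Pacing table transcribed from above: segment bounds and cut lengths in seconds.
const PACING = [
  { name: 'hook',       start: 0,  end: 3,  minCut: 1.5, maxCut: 2 },
  { name: 'setup',      start: 3,  end: 15, minCut: 2.5, maxCut: 3 },
  { name: 'reveal',     start: 15, end: 40, minCut: 1,   maxCut: 2 },
  { name: 'conclusion', start: 40, end: 55, minCut: 2,   maxCut: 3 },
  { name: 'cta',        start: 55, end: 61, minCut: 1,   maxCut: 1.5 },
];

// Total cuts if every segment uses the cut length chosen by `cutLengthOf`.
function cutCount(pacing, cutLengthOf) {
  return pacing.reduce(
    (sum, seg) => sum + (seg.end - seg.start) / cutLengthOf(seg),
    0
  );
}

const fewest = cutCount(PACING, (seg) => seg.maxCut); // 27
const most = cutCount(PACING, (seg) => seg.minCut);   // about 45.3
const atAverage = 60 / 2.1;                            // about 28.6
```

Using the longest cut in every segment gives exactly 27 cuts, and the 2.1 s average gives about 28.6, both inside the 27–30 target. Using the shortest cut everywhere would give roughly 45, which is the frantic end the text warns about.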


Tech decision rubric

When the script plants a question, the skill picks the visual using this map. It is opinionated by design.

Question the line raises | Visual the skill picks
"Is this claim real / sourced?" | News article mockup (TechCrunch / Bloomberg / NY Post styles, with black-block highlights)
"What does this person look like?" | Imagen → Veo (text-described, not photo-fed; non-celebrity only)
"Where is this happening physically?" | Veo text-to-video (cinematic B-roll: fab, server-rack, skyline)
"How big? How much?" | Stat reveal (giant number on black)
"How does X compare to Y?" | Three.js story_3d scene (math_race, cost_8x, etc.)
"What's the brand connection?" | Headline card (logo + Bloomberg lower-third)
"What are the three things to do?" | Bullet list, one item per beat, with a giant numeral
"Pivot / contradiction / quiet emphasis" | Minimal text on pure black (use sparingly, max 3 per video)
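
Read as data, the rubric is a lookup from question type to visual, plus one budget (at most three minimal-text beats). A rough sketch follows; the short keys such as `source` and `pivot` and the `pickVisuals` function are hypothetical labels, not identifiers from the skill.

```javascript
// Hypothetical keys for the question types in the rubric above.
const RUBRIC = new Map([
  ['source', 'news article mockup'],
  ['person', 'Imagen → Veo (text-described)'],
  ['place', 'Veo text-to-video B-roll'],
  ['magnitude', 'stat reveal'],
  ['comparison', 'Three.js story_3d scene'],
  ['brand', 'headline card'],
  ['steps', 'bullet list, one item per beat'],
  ['pivot', 'minimal text on pure black'],
]);
const MAX_MINIMAL_TEXT = 3; // "max 3 per video" from the rubric

// Attach a visual to each beat; fail loudly on unknown types or budget overrun.
function pickVisuals(beats) {
  let minimalTextUsed = 0;
  return beats.map((beat) => {
    if (!RUBRIC.has(beat.question)) {
      throw new Error(`No rubric entry for "${beat.question}" (beat ${beat.id})`);
    }
    if (beat.question === 'pivot') {
      minimalTextUsed += 1;
      if (minimalTextUsed > MAX_MINIMAL_TEXT) {
        throw new Error(`More than ${MAX_MINIMAL_TEXT} minimal-text beats (beat ${beat.id})`);
      }
    }
    return { ...beat, visual: RUBRIC.get(beat.question) };
  });
}

// Usage
const planned = pickVisuals([
  { id: 'b1', question: 'source' },
  { id: 'b2', question: 'magnitude' },
]);
```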

Beat-boarding workflow

For every new video, the skill follows this sequence:

  1. Read the script aloud. Mark sentence breaks. Note the emotional arc.
  2. Identify the 5–6 key claims that need the most visual support.
  3. Sketch the beat-board as a Markdown table: # | Time | Spoken | Visual | Tech | Why this tech. Show the user before rendering. Wait for green light.
  4. Phrase-anchor each beat in build-cuts-spec.mjs. Sequential matching ensures repeated phrases attach to the right occurrence.
  5. Generate fresh assets for the topic (Veo b-rolls, article templates, Three.js variants only if existing ones don't fit).
  6. Render to out/<topic>-vN.mp4, iterate, then move final to out/final/<topic>-final.mp4 when approved.

The beat-board is the contract. It's the artifact you review BEFORE rendering, so you don't waste the 5–7 minutes.
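
Step 4's sequential matching can be sketched as follows. This is not the code in build-cuts-spec.mjs; the word shape `{ word, start }` and the `anchorPhrases` function are assumptions. The key idea is a cursor that only moves forward, so a phrase that appears twice in the script binds to its next unused occurrence.

```javascript
// Lowercase and strip punctuation so "Chip," matches "chip".
const normalize = (w) => w.toLowerCase().replace(/[^\p{L}\p{N}']/gu, '');

// words:   [{ word, start }] in spoken order (hypothetical timestamp shape)
// phrases: anchor phrases in beat order
function anchorPhrases(words, phrases) {
  let cursor = 0; // never moves backwards
  return phrases.map((phrase) => {
    const tokens = phrase.split(/\s+/).map(normalize).filter(Boolean);
    if (tokens.length === 0) throw new Error(`Empty phrase: "${phrase}"`);
    for (let i = cursor; i + tokens.length <= words.length; i++) {
      const hit = tokens.every((tok, k) => normalize(words[i + k].word) === tok);
      if (hit) {
        cursor = i + tokens.length; // later phrases must match after this one
        return { phrase, wordIndex: i, startSeconds: words[i].start };
      }
    }
    throw new Error(`Phrase not found after word ${cursor}: "${phrase}"`);
  });
}

// Usage: "the chip" occurs twice and each beat gets its own occurrence.
const spoken = 'The chip is fast. The chip is cheap.'
  .split(' ')
  .map((word, i) => ({ word, start: i * 0.5 }));
const anchors = anchorPhrases(spoken, ['the chip', 'the chip', 'cheap']);
```

Throwing on a missing phrase, rather than skipping it, surfaces script edits that were not mirrored in the spec before any render time is spent.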


The mandatory self-QA pass

Before reporting any render to the user, the skill extracts one frame per beat (not "every two seconds") and reads every single one against the spec. No skipping. No sampling.

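# Extract one frame at the midpoint of every beat listed in the spec.
mkdir -p out/frames-<topic>-vN
# Print "<scene id> <midpoint in seconds>" for each scene, in order.
node -e "
  const s = require('./public/sessions/<id>/spec-cuts.json');
  let cum = 0;
  s.scenes.forEach(sc => {
    const mid = (cum + sc.durationSeconds / 2).toFixed(2);
    console.log(sc.id, mid);
    cum += sc.durationSeconds;
  });
" | while read -r id mid; do
  # -nostdin keeps ffmpeg from consuming the remaining lines of the loop's input.
  ffmpeg -nostdin -y -ss "$mid" -i out/<topic>-vN.mp4 -frames:v 1 -q:v 3 \
    "out/frames-<topic>-vN/${id}.jpg" 2>/dev/null
done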

For each frame the skill checks:

  • Does the visual answer the spoken word at this timestamp?
  • Is text actually readable? (Three.js cannot render text via primitives — always HTML overlay.)
  • Is branded content recognisable? (Real SVG logo via texture, not abstract primitives.)
  • Is the layout properly framed? (Cards centered, no empty regions, camera anchored.)
  • Are SFX firing on the right beats?

Findings are documented before the user is asked to look. Never "done" without a QA report.


What the skill will not do

These are mistakes that cost retention and trust. The skill guards against them by refusing to render or by flagging them in QA.

  • ❌ Decorative shader backgrounds with no narrative job.
  • ❌ The same 3D scene on two adjacent beats.
  • ❌ Photo-fed Veo for known figures (produces AI-stiff animation; always Imagen seed → Veo image-to-video).
  • ❌ Audio left unmuted on Veo clips (fights the master narration).
  • ❌ Generic atmospheric prompts ("a data center"). Always specify lens, grade, reference, motion.
  • ❌ Captions over the visuals by default.
  • ❌ All three bullet items on one card. Each enumerated item gets its own beat with a giant numeral.
  • ❌ Ken Burns zoom on everything. Reveal cuts only.
  • ❌ The same SFX on every cut.
  • ❌ Three.js boxGeometry to "fake" letters. Boxes do not look like letters. Always use HTML/CSS overlay f

How to add

/plugin marketplace add Khizergenfox/ai-shorts-pipeline

The exact command may vary depending on the repository. Check the README on GitHub.
