SSkilltecabyclaudinhocode
Enviar skill
← Voltar para o catálogo

fb-ad-video-studio

Marketing

Produce high-converting Facebook/Instagram/TikTok ad VIDEOS — motion-graphics spots and talking-head founder ads — by authoring HyperFrames HTML compositions and rendering to MP4. Opinionated layer on top of the HyperFrames framework that bakes in a battle-tested ad structure, the audio pipeline (VO + SFX + silence-cutting), caption sync, and a reverse-template workflow. Use whenever the user asks

4estrelas
Ver no GitHub ↗Autor: ai-agents-for-agencies-coachesLicença: MIT

fb-ad-video-studio

Direct-response video ads built as code. HyperFrames (HeyGen's HTML-video framework) is the engine; this skill is the opinionated FB/IG/TikTok ad recipe layered on top — distilled from 23 production iterations of a real founder ad.

It produces two families:

  1. Motion-graphics spot — kinetic typography + Lottie/CSS motion + VO + SFX. No presenter footage. Fully self-serve, most templatable, fastest.
  2. Talking-head founder ad — presenter video with a motion-graphic overlay rail, the proven speaker PIP arc, synced captions, and an SFX rail. Needs a user-supplied recording.

Prerequisites (do not reimplement the framework)

This skill assumes HyperFrames is available. It depends on, and defers framework mechanics to, these skills/tools — invoke them, don't duplicate them:

  • npx hyperframes CLI (Node ≥ 22, FFmpeg, Docker for render on Pop!_OS/Linux)
  • hyperframes skill — composition authoring, data-* semantics, GSAP timeline registration
  • hyperframes-cli skill — init / lint / inspect / preview / render
  • hyperframes-media skill — TTS / transcribe / background removal

If those skills aren't installed: npx skills add heygen-com/hyperframes.

Brand fidelity (DESIGN.md) — check FIRST

Before any client ad, run the pre-flight check. If a DESIGN.md exists (composition project root, or ~/claude_work/brand-kits/<slug>/), it is the source of truth. HyperFrames reads DESIGN.md/design.md natively, so syncing the client file into the composition project root makes brand colors/fonts authoritative with zero code change; also map its tokens onto the template :root vars (--accent, --ink, --bg, --mute) so captions, kinetic type, lockup, and PIP borders are on-brand. Unresolved TODO: VERIFY or lint errors → stop and resolve with the user. No DESIGN.md → use the brand-kit skill to extract + verify one. Never guess a client's brand. Keep the proven structure/pacing; only the brand layer comes from DESIGN.md.

When to use / not

Use: any paid social video ad — motion-graphics spot, founder/UGC talking-head ad, kinetic-type promo, Reel/Short, "make a video like this winning ad".

Don't use: static images → image-studio. Long-form/explainer/YouTube content → generic hyperframes. AI-generated talking avatars → HeyGen avatar tools. Diagrams → excalidraw.

The two templates

Copy a folder from templates/ into a fresh project and edit.

TemplateLengthNeedsBest for
motion-graphics-spot/15–30sVO script onlyOffer/feature ads, retargeting, no on-camera talent
talking-head-founder-ad/45–75sPresenter recording + VOFounder story, trust/authority, cold traffic

Both ship as real HyperFrames compositions (index.html + hyperframes.json + meta.json) with generic brand tokens (--accent, --ink, …) and [BRACKET] placeholder copy. They are scaffolds — rework freely.

Production pipeline (the proven order)

Run scripts from scripts/. This order is the one that survived 23 iterations:

  1. Scriptscript.md. One idea, problem-first, reader is the hero. Keep beats ≤ 5s.
  2. VOscripts/tts.py (ElevenLabs Chris, settings baked in). Talking-head: record the presenter instead.
  3. Cut silences (talking-head only) → scripts/cut-silences.py — whisper VAD + ffmpeg, trims to 0.18s gaps. Typical 77s→67s.
  4. Re-whisper the cutscripts/whisper-words.py — word-level timestamps drive every caption, SFX, and visual sync. Timings shift after cutting; always re-whisper.
  5. Re-encode footage (talking-head only) → scripts/reencode-footage.sh — fixes sparse-keyframe seek failures.
  6. Build the composition — edit the template HTML; anchor captions/SFX/reveals to the whisper word starts.
  7. SFXscripts/fetch-sfx.sh — Mixkit, tight-trimmed. VO + SFX beats high-converting ads; music is optional (mix −18 dB if used).
  8. Lint + inspectnpx hyperframes lint && npx hyperframes inspect.
  9. Renderscripts/render.sh (wraps npx hyperframes render --docker).

See references/ for the full reasoning behind every step.

Reverse-template workflow (the unlock)

Same superpower as image-studio, for video. When the user shares a winning video ad ("make one like this", "match this competitor", "build a template from this"):

  1. Get the reference (file/URL). Extract the structure, not the content.
  2. Map the scene arc (hook / problem / agitate / solution / proof / CTA) with timestamps.
  3. Measure pacing — cut cadence, word-reveal duration, beat length, SFX density.
  4. Note the device — talking-head PIP arc? pure kinetic type? screen-recording inset?
  5. Write a parametric HyperFrames composition with variables for the swappable parts (VO, presenter clip, brand tokens, headline beats, proof points, CTA, offer).
  6. Render a build to validate the structure matches.
  7. Save it in templates/ as <vertical>-<hook>.html. The template is the deliverable; future ads reuse it with new VO/footage/brand.

Battle-tested patterns (apply by default)

Full detail in references/patterns.md and references/audio-sources.md. The non-negotiables:

  • Cold-open with the motion graphic alone (speaker hidden 0–3s) — strongest hook.
  • Speaker PIP arc: hidden → PIP → full → PIP → full → hidden (lockup). Rhythmic variety, never "covered".
  • Lottie clip wrappers alive from t=0 (data-start="0" full duration); control visibility with GSAP opacity, not late data-start. Looping seek must be clip-relative.
  • No CSS transform on GSAP-animated elements — GSAP overwrites the whole property. Center with gsap.set(el,{xPercent:-50}).
  • Captions at y≈1180–1400, anchored to actual whisper word starts (not script estimates). Split beats > 5s.
  • Pacing: 0.18s drift cuts (not 0.5s fades), 0.25s word reveals, ElevenLabs <break> ≤ 0.4s.
  • Audio: Mixkit SFX (previews are full clips), trim tight (whoosh 0.6s, punch 0.22s, ding 0.65s). Skip music for direct-response, or −18 dB. ElevenLabs SFX = mushy, avoid.
  • Render: Docker required on Pop!_OS. Fix media_missing_id (silent audio), add hard-kill tl.set(el,{opacity:0}) after fades for seek safety.

Output

A rendered MP4 (1080×1920 vertical default; 1080×1080 and 1080×1350 supported) plus the reusable composition. Deliver renders to the user; save the generalized composition back to templates/ when it's a new reusable structure.

Como adicionar

/plugin marketplace add ai-agents-for-agencies-coaches/fb-ad-video-studio

O comando exato pode variar conforme o repositório. Confira o README no GitHub.

Comentários · Nenhum comentário

Entre para comentar. Entrar

  • Ainda não há comentários. Seja o primeiro.