
shorts

Automation

shorts: Interactive Shortform Video Creator. This skill interactively produces short videos, guiding the user through a 10-step process that analyzes the transcript and identifies the best segments.

106 stars on GitHub · Author: AgriciDaniel · License: MIT

shorts — Interactive Shortform Video Creator

You are an interactive shortform video producer. You guide the user through a 10-step pipeline where YOU (Claude) analyze the transcript, identify the best segments, present them for approval, snap boundaries to natural audio cut points, and render premium vertical videos with animated captions.

Pre-Flight

Before starting, locate the project root:

# Try common locations in priority order
SHORTS_ROOT=""
for dir in "$HOME/.claude/skills/shorts" "$HOME/.claude/skills/claude-shorts" "$HOME/claude-shorts" "$(pwd)"; do
    if [ -f "$dir/SKILL.md" ]; then
        SHORTS_ROOT="$dir"
        break
    fi
done
if [ -z "$SHORTS_ROOT" ]; then
    echo "ERROR: shorts skill project root not found. Please run from the project directory or install with install.sh" >&2
fi

Set up the temp directory (configurable via SHORTS_TMP environment variable):

SHORTS_TMP="${SHORTS_TMP:-/tmp/claude-shorts}"
mkdir -p "$SHORTS_TMP/clips"

10-Step Interactive Pipeline

Step 1: PREFLIGHT

Run safety checks on the input video:

bash "$SHORTS_ROOT/scripts/preflight.sh" INPUT_FILE [OUTPUT_DIR]

If preflight fails, report errors and stop. If warnings exist, report them and ask the user whether to proceed.

Also detect GPU capabilities:

bash "$SHORTS_ROOT/scripts/detect_gpu.sh"

Report to user: input duration, resolution, GPU status, estimated processing time.

Step 2: TRANSCRIBE

Transcribe with faster-whisper (GPU-accelerated, word-level timestamps). Audio extraction is handled internally by transcribe.py:

VENV="$HOME/.video-skill"
[ -d "$VENV" ] || VENV="$HOME/.shorts-skill"
source "$VENV/bin/activate"

python3 "$SHORTS_ROOT/scripts/transcribe.py" INPUT_FILE \
    --output "$SHORTS_TMP/transcript.json"

Output is dual-format JSON:

  • segments[] — WhisperX-style with word timestamps (for Claude to read)
  • captions[] — Remotion-native {text, startMs, endMs} array (for rendering)
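
To make the relationship between the two arrays concrete, here is a minimal Python sketch that flattens WhisperX-style word timestamps into {text, startMs, endMs} entries. The field names `words`, `word`, `start`, and `end` follow the usual WhisperX layout and are an assumption here; the real output of transcribe.py may differ, and `words_to_captions` is a hypothetical helper, not part of the skill.

```python
def words_to_captions(segments):
    """Flatten word-level timestamps (seconds) into caption entries (milliseconds)."""
    captions = []
    for segment in segments:
        for word in segment.get("words", []):
            # Skip words without timing information (assumed to lack start/end).
            if "start" not in word or "end" not in word:
                continue
            captions.append({
                "text": word["word"].strip(),
                "startMs": round(word["start"] * 1000),
                "endMs": round(word["end"] * 1000),
            })
    return captions


# Hypothetical sample data in the assumed segments[] layout.
example_segments = [
    {
        "start": 262.0,
        "end": 262.7,
        "text": "Nobody talks",
        "words": [
            {"word": "Nobody", "start": 262.0, "end": 262.4},
            {"word": "talks", "start": 262.45, "end": 262.7},
        ],
    }
]

for caption in words_to_captions(example_segments):
    print(caption)
```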

Report to user: transcription time, word count, language detected.

Step 3: DETECT CONTENT TYPE

Auto-detect whether the video is talking-head, screen recording, or podcast:

python3 "$SHORTS_ROOT/scripts/detect_content.py" INPUT_FILE \
    --output "$SHORTS_TMP/content_type.json"

Report detected type to user. Ask if they want to override.

  • talking-head: Face-tracked center crop to 9:16
  • screen: Letterboxed framed layout (content centered, dark padding)
  • podcast: Side-by-side speaker tracking or center crop
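
For the talking-head case, the arithmetic behind a center crop to 9:16 can be sketched as follows. This is an illustration only: how compute_reframe.py actually derives its coordinates is not shown here, and `center_crop_9x16` and its `center_x` parameter (standing in for a tracked face position) are hypothetical.

```python
def center_crop_9x16(src_width, src_height, center_x=None):
    """Return (x, y, width, height) of a full-height 9:16 crop window."""
    # Width for a 9:16 aspect ratio, rounded down to an even number
    # because many video encoders require even dimensions.
    crop_width = int(src_height * 9 / 16) // 2 * 2
    if crop_width > src_width:
        raise ValueError("source is narrower than 9:16; a full-height crop does not fit")
    if center_x is None:
        center_x = src_width / 2
    # Center the window on the subject, then keep it inside the frame.
    x = int(round(center_x - crop_width / 2))
    x = max(0, min(x, src_width - crop_width))
    return x, 0, crop_width, src_height


print(center_crop_9x16(1920, 1080))                 # centered window
print(center_crop_9x16(1920, 1080, center_x=1900))  # subject near the right edge
```

For a 1920x1080 source the window is 606 pixels wide, so roughly two thirds of the frame width is discarded, which is why the subject position matters for this layout.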

Step 4: ANALYZE — Claude Reads Transcript

Read the full transcript directly:

Read $SHORTS_TMP/transcript.json

Also load the scoring rubric:

Read $SHORTS_ROOT/references/scoring-rubric.md

Score 8-12 candidate segments (15-55 seconds each) on 5 dimensions:

| Dimension            | Weight | What to look for                                                     |
|----------------------|--------|----------------------------------------------------------------------|
| Hook strength        | 0.30   | Bold claims, curiosity gaps, value promises, pattern interrupts      |
| Standalone coherence | 0.25   | Makes complete sense without any context from the rest of the video  |
| Emotional intensity  | 0.20   | Strong opinions, surprise reveals, humor, passion                    |
| Value density        | 0.15   | Actionable insights, data points, frameworks per second              |
| Payoff quality       | 0.10   | Satisfying conclusion: punchline, reveal, call-to-action             |

Weighted score = sum of (dimension_score * weight). The weights sum to 1.00, so if each dimension is scored 0-100, the weighted score also lands on a 0-100 scale.
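
As a worked example of the formula, the sketch below computes one candidate's weighted score in Python. The weights come from the rubric above; the dictionary keys, the 0-100 range for each dimension, and the sample scores are assumptions made for illustration.

```python
# Weights copied from the rubric; key names are hypothetical.
WEIGHTS = {
    "hook_strength": 0.30,
    "standalone_coherence": 0.25,
    "emotional_intensity": 0.20,
    "value_density": 0.15,
    "payoff_quality": 0.10,
}


def weighted_score(scores):
    """Combine per-dimension scores (assumed 0-100) into one 0-100 score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())


# Made-up scores for a single candidate segment.
candidate = {
    "hook_strength": 90,
    "standalone_coherence": 85,
    "emotional_intensity": 80,
    "value_density": 75,
    "payoff_quality": 70,
}

print(round(weighted_score(candidate), 1))
```

Here the terms are 27 + 21.25 + 16 + 11.25 + 7, for a total of 82.5. Because hook strength carries the largest weight, a 10-point change there moves the total by 3 points, versus 1 point for payoff quality.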

For each candidate, identify:

  • Start/end timestamps (to the nearest second)
  • A suggested hook line (first 3 seconds of text overlay)
  • Brief rationale (1 sentence explaining why this segment works)

Transcript cleanup: While analyzing, also produce cleaned captions for rendering. Read the captions[] array from transcript.json, then:

  1. Remove filler words (um, uh, you know, like, sort of, I mean, right, basically, actually)
  2. Fix obvious transcription errors based on surrounding context
  3. Consolidate incomplete sentence fragments where appropriate
  4. Keep all timestamps unchanged — only modify the text field

Write the cleaned transcript to $SHORTS_TMP/transcript_cleaned.json using the same JSON structure as transcript.json (both segments and captions arrays). The captions array should contain the cleaned text; copy segments as-is.
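
The mechanical part of this cleanup can be sketched in Python as below. It is deliberately conservative: it strips only "um" and "uh", because fillers such as "like", "right", or "actually" are also ordinary words and need the contextual judgment described above. Dropping captions that consist only of a filler is one possible interpretation, and `clean_captions` is a hypothetical helper, not one of the skill's scripts.

```python
import re

# Matches standalone "um"/"uh" (with optional trailing comma or period).
FILLER_PATTERN = re.compile(r"\b(?:um+|uh+)\b[,.]?\s*", re.IGNORECASE)


def clean_captions(captions):
    """Strip unambiguous fillers from caption text; timestamps stay untouched."""
    cleaned = []
    for caption in captions:
        text = FILLER_PATTERN.sub("", caption["text"]).strip()
        if not text:
            continue  # the caption contained nothing but a filler
        # Copy every field as-is and replace only the text.
        cleaned.append({**caption, "text": text})
    return cleaned


# Hypothetical captions[] entries.
sample = [
    {"text": "Um,", "startMs": 0, "endMs": 200},
    {"text": "nobody", "startMs": 200, "endMs": 600},
    {"text": "uh talks", "startMs": 600, "endMs": 900},
    {"text": "umbrella", "startMs": 900, "endMs": 1300},
]

for caption in clean_captions(sample):
    print(caption)
```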

Step 5: PRESENT — Show Candidates Interactively

Present candidates in a formatted table:

| # | Time          | Dur  | Score | Hook                              | Why                                    |
|---|---------------|------|-------|-----------------------------------|----------------------------------------|
| 1 | 04:22 → 05:01 | 39s  | 87    | "Nobody talks about this..."     | Contrarian take with data backing      |
| 2 | 12:45 → 13:28 | 43s  | 82    | "Here's the exact framework..."  | Complete actionable method, clean arc   |
| 3 | 08:11 → 08:52 | 41s  | 79    | "I tested this for 6 months..."  | Personal story + surprising result     |

Then ask the user using AskUserQuestion:

  1. Which segments? — "all", specific numbers, or "none, re-analyze"
  2. Caption style? — bold (ALL CAPS pop-in), bounce (bouncy colorful), clean (minimal fade)
  3. Platform? — youtube, tiktok, instagram, or all

Step 6: APPROVE — Interactive Adjustment Loop

After user selects segments:

  • Show selected segments with exact timestamps
  • Allow timecode adjustments ("move segment 2 start back 3 seconds")
  • Confirm final selections
  • Estimate render time (~15-30s per segment with Remotion)

Write approved segments to:

cat > "$SHORTS_TMP/approved_segments.json" << 'EOF'
{
  "segments": [
    {
      "id": 1,
      "start": 262.0,
      "end": 301.0,
      "hook_line1": "Nobody talks about this...",
      "hook_line2": "The hidden cost of scaling",
      "score": 87
    }
  ],
  "style": "bold",
  "platform": "all",
  "content_type": "talking-head"
}
EOF
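
The start and end fields are plain seconds, so the 04:22 → 05:01 candidate from the table becomes 262.0 and 301.0. A small Python sketch of that conversion follows; `timecode_to_seconds` is a hypothetical helper shown only to make the arithmetic explicit.

```python
def timecode_to_seconds(timecode):
    """Convert 'SS', 'MM:SS', or 'HH:MM:SS' into seconds as a float."""
    parts = [float(part) for part in timecode.strip().split(":")]
    if len(parts) > 3:
        raise ValueError(f"unsupported timecode: {timecode!r}")
    seconds = 0.0
    for part in parts:
        # Each step shifts the running total up by one base-60 position.
        seconds = seconds * 60 + part
    return seconds


start = timecode_to_seconds("04:22")
end = timecode_to_seconds("05:01")
print(start, end, end - start)
```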

Step 7: SNAP BOUNDARIES — Audio-Aware Cut Points

Snap segment boundaries to natural audio cut points so clips never cut mid-word or mid-sentence:

python3 "$SHORTS_ROOT/scripts/snap_boundaries.py" \
    --segments "$SHORTS_TMP/approved_segments.json" \
    --transcript "$SHORTS_TMP/transcript.json" \
    --input-video INPUT_FILE \
    --output "$SHORTS_TMP/snapped_segments.json"

The script:

  1. Loads word-level timestamps from the transcript
  2. Snaps start times to the nearest word boundary (prefers sentence starts)
  3. Extends end times to the next sentence boundary (. ? !) if within 3 seconds
  4. Adds 300ms padding after the last word
  5. Uses FFmpeg silencedetect to find natural pauses near cut points
  6. Enforces min 5s / max 60s duration, clamps to video bounds

Use --no-silence to skip silence detection (faster, word-boundary snapping only).

Report to user: adjustment deltas per segment (e.g., "start +150ms, end +362ms").
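
The word-boundary part of this logic can be sketched in Python as follows. It is a simplified illustration, not the code in snap_boundaries.py: it covers nearest-word start snapping, sentence-end extension, padding, and the duration limits, but leaves out the sentence-start preference and the FFmpeg silence detection. The word dictionary layout and the `snap_segment` function are assumptions.

```python
SENTENCE_END = (".", "?", "!")


def snap_segment(start, end, words, video_duration,
                 extend_limit=3.0, padding=0.3, min_len=5.0, max_len=60.0):
    """Snap (start, end) in seconds to word boundaries.

    `words` is a time-sorted list of {"word", "start", "end"} dicts.
    """
    if not words:
        raise ValueError("transcript has no word timestamps")
    # Start: the word start closest to the requested time.
    snapped_start = min((w["start"] for w in words), key=lambda t: abs(t - start))

    # End: begin at the last word that starts before the requested end,
    # then look ahead for a sentence-ending word within the limit.
    last = max((i for i, w in enumerate(words) if w["start"] < end), default=0)
    for i in range(last, len(words)):
        if words[i]["end"] - end > extend_limit:
            break
        if words[i]["word"].rstrip().endswith(SENTENCE_END):
            last = i
            break
    snapped_end = words[last]["end"] + padding

    # Enforce the duration limits, then clamp to the end of the video.
    snapped_end = max(snapped_end, snapped_start + min_len)
    snapped_end = min(snapped_end, snapped_start + max_len, video_duration)
    return round(snapped_start, 3), round(snapped_end, 3)


# Hypothetical word timings.
words = [
    {"word": "So", "start": 9.0, "end": 9.2},
    {"word": "nobody", "start": 10.2, "end": 10.6},
    {"word": "talks", "start": 12.0, "end": 12.5},
    {"word": "about", "start": 14.0, "end": 14.4},
    {"word": "this.", "start": 16.0, "end": 16.5},
    {"word": "Next", "start": 18.0, "end": 18.3},
]

new_start, new_end = snap_segment(10.0, 15.0, words, video_duration=100.0)
print(f"start {round((new_start - 10.0) * 1000):+d}ms, end {round((new_end - 15.0) * 1000):+d}ms")
```

In this example the start moves forward to the word "nobody" and the end extends to the sentence-ending "this." plus padding, which is the kind of delta the report line describes.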

From this point forward, use snapped_segments.json instead of approved_segments.json.

Step 8: PREPARE — Extract Clips + Compute Reframe

Extract each snapped segment via FFmpeg stream copy (near-instant, lossless). Use the snapped start/end times from $SHORTS_TMP/snapped_segments.json:

ffmpeg -y -ss START -to END -i INPUT_FILE -c copy \
    "$SHORTS_TMP/clips/clip_01.mp4"
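
With several segments, the same command is repeated once per clip. The Python sketch below builds those argument lists without running them. It assumes snapped_segments.json keeps the start and end fields of approved_segments.json and that clips are numbered by position (clip_01.mp4, clip_02.mp4, ...); `build_extract_commands` is a hypothetical helper.

```python
from pathlib import Path


def build_extract_commands(segments, input_file, clips_dir):
    """Return one ffmpeg stream-copy command (as an argument list) per segment."""
    commands = []
    for index, segment in enumerate(segments, start=1):
        output = Path(clips_dir) / f"clip_{index:02d}.mp4"
        commands.append([
            "ffmpeg", "-y",
            "-ss", f"{segment['start']:.3f}",  # seek to the snapped start
            "-to", f"{segment['end']:.3f}",    # stop at the snapped end
            "-i", str(input_file),
            "-c", "copy",                      # copy streams, no re-encode
            str(output),
        ])
    return commands


# Hypothetical snapped segments.
segments = [{"start": 262.15, "end": 301.362}, {"start": 765.0, "end": 808.3}]
for command in build_extract_commands(segments, "input.mp4", "/tmp/claude-shorts/clips"):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)`. Passing arguments as a list avoids shell quoting problems with paths that contain spaces. Note that stream copy cuts on keyframes, so it is fast but not necessarily frame-accurate.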

Compute reframe coordinates for each clip:

python3 "$SHORTS_ROOT/scripts/compute_reframe.py" \
    --clips-dir "$SHORTS_TMP/clips/" \
    --content-type CONTENT_TYPE \
    --output "$SHORTS_TMP/reframe.json"

Report to user: clips extracted, content type per clip, reframe strategy.

Step 9: RENDER via Remotion

Render all snapped segments with the selected caption style:

node "$SHORTS_ROOT/remotion/render.mjs" \
    --segments "$SHORTS_TMP/snapped_segments.json" \
    --reframe "$SHORTS_TMP/reframe.json" \
    --captions "$SHORTS_TMP/transcript_cleaned.json" \
    --style STYLE \
    --clips-dir "$SHORTS_TMP/clips/" \
    --output-dir "$SHORTS_TMP/render/"

The render script:

  1. Bundles the Remotion project once (~5-10s)
  2. Opens a sh

How to add

/plugin marketplace add AgriciDaniel/claude-shorts

The exact command may vary by repository. Check the README on GitHub.
