yt2bb — YouTube to Bilibili Video Repurposing

Overview

Six-step pipeline: download → transcribe → translate → merge → burn subtitles → generate publish info. Produces a video with hardcoded bilingual (EN/ZH) subtitles and a publish_info.md with Bilibili upload metadata.

When to Use

User provides a YouTube URL (single video or playlist) and wants a Bilibili-ready version
User needs bilingual EN-ZH subtitles burned into video
User wants to repurpose English video content for Chinese audience

Quick Reference

Step	Tool	Command	Output
0. Update	`git`	Auto-check for skill updates	—
1. Download	`yt-dlp`	`yt-dlp --cookies-from-browser chrome -f ... -o ...`	`{slug}.mp4`
2. Transcribe	`whisper`*	`srt_utils.py check-whisper` then transcribe	`{slug}_{lang}.srt`
2.5 Validate	`srt_utils.py`	`srt_utils.py validate / fix`	`{slug}_{lang}.srt` (fixed)
3. Translate	AI	SRT-aware batch translation	`{slug}_zh.srt`
4. Merge	`srt_utils.py`	`srt_utils.py merge ...`	`{slug}_bilingual.srt`
4.5 Style	`srt_utils.py`	`srt_utils.py to_ass --preset netflix\|clean\|glow`	`{slug}_bilingual.ass`
5. Burn	`ffmpeg`	`ffmpeg -c:v libx264 -vf ass=...`	`{slug}_bilingual.mp4`
6. Publish	AI	Analyze content, generate metadata	`publish_info.md`

Update check

Throttle to one check per 24 hours per installation; never mutate the skill directory without explicit user consent. SKILL_DIR resolved here is reused by later pipeline steps for script paths.

If <this-skill-dir>/.last_update exists and is less than 24 hours old, skip this step entirely.

Otherwise, fetch the latest tag from upstream:

git -C <this-skill-dir> ls-remote --tags origin 'v*' 2>/dev/null \
  | awk '{print $2}' | sed 's|refs/tags/||' \
  | sort -V | tail -1

Compare with this skill's metadata.version from the frontmatter. If the upstream tag is strictly newer (semver), tell the user one line and ask:

"A newer version of this skill is available: vX.Y.Z → vA.B.C. Want me to git pull?"

If they say yes, run git -C <this-skill-dir> pull --ff-only. Refresh .last_update either way so the prompt doesn't repeat for 24 hours.
If upstream is the same or older, refresh .last_update silently and continue.
On any failure (offline, not a git checkout — e.g. ClawHub-installed copy, read-only path, no permission), swallow the error silently and continue with the user's task. Do not mention the failure.

Resolve SKILL_DIR for use by later pipeline steps:

# Find skill directory (works across Claude Code, OpenClaw, Hermes, Pi)
SKILL_DIR="$(find ~/.claude/skills ~/.openclaw/skills ~/.hermes/skills ~/.pi/agent/skills ~/.agents/skills ~/myagents/myskills -maxdepth 2 -name 'yt2bb' -type d 2>/dev/null | head -1)"

Pipeline Details

Step 1: Download

Single video:

slug="video-name"  # or: slug=$(python3 "$SKILL_DIR/srt_utils.py" slugify "Video Title")
mkdir -p "${slug}"
yt-dlp --cookies-from-browser chrome \
  -f "bv*[ext=mp4]+ba[ext=m4a]/b[ext=mp4]" \
  -o "${slug}/${slug}.mp4" "https://www.youtube.com/watch?v=VIDEO_ID"

Playlist / series:

yt-dlp --cookies-from-browser chrome \
  -f "bv*[ext=mp4]+ba[ext=m4a]/b[ext=mp4]" \
  -o "%(playlist_index)03d-%(title)s/%(playlist_index)03d-%(title)s.mp4" \
  "https://www.youtube.com/playlist?list=PLAYLIST_ID"

After downloading, rename each folder to a clean slug and run Steps 2–6 for each video sequentially.

-f "bv*[ext=mp4]+ba[ext=m4a]/b[ext=mp4]": ensure mp4 output, avoid webm
%(playlist_index)03d: zero-padded index to preserve playlist order
If --cookies-from-browser fails, export cookies first — see Troubleshooting

Step 2: Transcribe

First run the environment check to detect your platform and get a tailored whisper command:

python3 "$SKILL_DIR/srt_utils.py" check-whisper

This auto-detects OS, GPU (CUDA/Metal/CPU), memory, and installed backends, then recommends the best backend + model for your hardware. If memory detection is unavailable, it falls back conservatively instead of assuming a low-memory machine. Use the command it prints.

Manual fallback (openai-whisper, works everywhere):

src_lang="en"      # Change to ja/ko/es/etc. based on source video
whisper_model="medium"  # check-whisper recommends the best model for your hardware
whisper "${slug}/${slug}.mp4" \
  --model "$whisper_model" \
  --language "$src_lang" \
  --word_timestamps True \
  --condition_on_previous_text False \
  --output_format srt \
  --max_line_width 40 --max_line_count 1 \
  --output_dir "${slug}"
mv "${slug}/${slug}.srt" "${slug}/${slug}_${src_lang}.srt"

Supported backends:

Backend	Best for	Install
`mlx-whisper`	macOS Apple Silicon (fastest)	`pip install mlx-whisper`
`whisper-ctranslate2`	Windows/Linux CUDA, or CPU (~4x faster)	`pip install whisper-ctranslate2`
`openai-whisper`	Universal fallback	`pip install openai-whisper`

Model selection (auto-recommended by check-whisper):

tiny — fast draft, low accuracy, CPU-friendly (~1 GB)
medium — default, good balance (~5 GB)
large-v3 — best accuracy, recommended for JA/KO/ZH source (~10 GB)

Notes:

--language: explicitly set to avoid misdetection; supports en, ja, ko, es, etc.
--word_timestamps True: more precise subtitle timing
--condition_on_previous_text False: prevent hallucination loops
If output is garbled or repeated, add anti-hallucination flags — see Troubleshooting

Step 2.5: Validate & Fix (optional)

python3 "$SKILL_DIR/srt_utils.py" validate "${slug}/${slug}_${src_lang}.srt"
# If issues found:
python3 "$SKILL_DIR/srt_utils.py" fix "${slug}/${slug}_${src_lang}.srt" "${slug}/${slug}_${src_lang}.srt"

Step 3: Translate

Read {slug}_{src_lang}.srt and translate to Chinese. Critical rules:

These rules are modeled on the Netflix Simplified Chinese Timed Text Style Guide; follow them to produce broadcast-grade subtitles.

Keep SRT format intact — preserve index numbers, timestamps (--> lines) exactly as-is
1:1 entry mapping — every source entry must produce exactly one translated entry (same count)
Optimize for bottom subtitles — keep each Chinese entry to 1 line whenever possible so the final bilingual subtitle stays compact near the bottom of the frame
Max 16 full-width characters per line (Netflix SC spec). Prefer 12–16; if a cue is very short (< 1 s) compress further so reading speed stays ≤ 9 characters/second
Shorten with judgment, not mechanically — remove filler words, repeated subjects, weak interjections, and redundant politeness before dropping key meaning
Match subtitle duration — the line must feel readable within the time on screen; if the cue is very short, compress more aggressively
No trailing punctuation on Chinese cues — drop ending 。, ！, ？; keep mid-sentence ，, 、, ； only when they add clarity
Use full-width Chinese punctuation inside cues (，。！？、；：); use 「」 for inner quotes, not "" or ''
Half-width digits and Latin — numbers, units, product names, and code identifiers stay half-width (GPT-4, 30fps, 2026); only punctuation is full-width
Line-break discipline — never break after function words (的, 了, 吗, 呢, 吧, 啊); never split an English phrasal unit across a line break; keep modifiers with their heads
Keep terminology consistent — technical terms, names, product names, and recurring phrases should be translated the same way across batches. Maintain an inline glossary if needed
Adapt, don't transliterate — preserve register, tone, and intent over literal word matching; idioms become natural Chinese equivalents
Translate in batches of 10 entries —

yt2bb

How to add

Drop this on your repo README

Related skills

claude-api

skill-creator

claude-mem

oh-my-issues

Get new Desenvolvimento skills every Monday