/watch — Claude watches a video
You don't have a video input; this skill gives you one. A Python script downloads the video, extracts frames as JPEGs, gets a timestamped transcript (native captions first, then Whisper API as fallback), and prints frame paths. You then Read each frame path to see the images and combine them with the transcript to answer the user.
Step 0 — Setup preflight (runs every /watch invocation, silent on success)
Python interpreter: every python3 ... comm
[Description truncada. Veja o README completo no GitHub.]