Animation Review

Review animations and interactions by recording the browser and sending the video to Gemini for structured analysis.

Prerequisites

ffmpeg installed (brew install ffmpeg)
playwright Python package with Chromium (pip install playwright && playwright install chromium)
google-genai Python package (pip install google-genai)
GEMINI_API_KEY environment variable set
For manual recording only: macOS Screen Recording permission granted to terminal app

How Gemini fits in

Gemini is the eyes — it watches the recording and describes what it sees with precision. You are the hands — you translate those observations into code changes with full codebase context.

Treat all Gemini analysis as observational evidence, not authoritative diagnosis. Gemini cannot see the code. When it suggests root causes or implementation fixes, treat these as hypotheses from an external observer who can see the symptoms but not the source. Its frame-level descriptions of what happens visually are reliable. Its theories about why are informed guesses that you should verify against the actual code.

This applies across all modes, but especially to diagnose (where Gemini hypothesizes about bugs) and inspire (where Gemini decomposes effects without knowing your tech stack).

Interpreting timestamps and durations

Gemini can only observe what's in the sampled frames. Its temporal precision depends on the analysis FPS:

Mode	FPS	Frame interval	What this means
check	5	200ms	A "300ms animation" could be anywhere from 200-400ms (1-2 frames). Timestamps are ±200ms. Good enough for "does it happen" but not for timing accuracy.
review	12	83ms	A "300ms animation" is ~3-4 frames. Easing character is visible. Durations are accurate within ~80ms — enough to judge "too fast" vs "too slow".
diagnose/inspire	24	42ms	A "300ms animation" is ~7 frames. Easing curves, stagger offsets, and glitch moments are precisely observable. Durations are accurate within ~40ms.

When Gemini reports a duration like "~400ms with ease-out", consider the mode's precision. At 5fps that could really be 200-600ms. At 24fps it's likely 360-440ms. The system prompt tells Gemini to report in frame counts alongside milliseconds so you can judge precision yourself.

If you're debugging a timing issue and Gemini's temporal precision isn't sufficient, re-run at a higher mode. A check that reveals "something's off with the timing" can be followed up with a diagnose for frame-precise detail. You can also escalate just a specific time range — see "Escalating a region" below.

Workflow

1. Determine source and mode

There are two entry points — figure out which applies:

User provides a video file. The user has a screen recording (.mov, .mp4, .webm) — either of their own UI or a reference they want to recreate. Skip straight to step 3 (Analyze). No recording step needed. User-provided videos are typically 30-60fps, which is more than enough for any analysis mode — Gemini downsamples to the mode's FPS automatically.

Agent captures from the browser. The animation is running in a local dev server and needs to be recorded. Proceed to step 2 (Record), then step 3 (Analyze).

Choose the analysis mode based on what question you're trying to answer:

Mode	FPS	Model	Question	When to use
check	5	Flash	"Does it work?"	First pass — verify the animation fires, completes, and doesn't visually break. Early development, after wiring up a new animation, smoke testing.
review	12	Flash	"How does it feel?"	Design and polish — evaluate easing, timing, choreography, and overall quality. Use when the animation works but needs refinement, or for design review before shipping.
diagnose	24	Pro	"What's going wrong?"	Bug investigation — the animation has a specific visual problem (jump, stutter, misalignment, timing glitch). Need frame-precise evidence to locate the root cause in code.
inspire	24	Pro	"What's happening here?"	Reference analysis — the user has a video showing a desired effect and wants a technical decomposition to guide implementation.

Decision guide for agents:

User says "check if this works" / "does the animation play" / just implemented something new → check
User says "how does this look" / "review the animation" / "is the timing right" / design feedback → review
User says "there's a bug" / "it jumps" / "something's wrong with" / describes a specific visual issue → diagnose
User provides a video of someone else's UI / "I want it to look like this" / "recreate this effect" → inspire
- For inspire: if the user hasn't said which part of the video or which specific effect to focus on, ask them before running the analysis. A full-video decomposition without focus will be vague and unhelpful.

If the mode isn't clear from context, ask the user.

2. Record (skip if user provided a video)

The recording step captures browser interactions as video. Record generously, analyze precisely — Playwright's video always includes the full session (page load through close), so capture more than you need. The recorder always records at 24fps (the Gemini analysis maximum) so that any analysis mode can sample at full resolution, including escalation of a specific time range to diagnose. The recorder logs a timeline showing when each action executed relative to the video start — use these timestamps with --start/--end on analyze.py to focus Gemini on the relevant window.

Automated (preferred) — Claude drives the browser

Use record_browser.py when Claude is performing the interactions. Playwright records the page internally, then transcodes to mp4. No screen recording permissions needed.

# Record a page load animation
python3 ~/.claude/skills/animation-review/scripts/record_browser.py http://localhost:3000

# Click a button, wait for animation, record everything
python3 ~/.claude/skills/animation-review/scripts/record_browser.py http://localhost:3000 \
  -a 'click:.play-btn' -a 'wait:2000'

# Carousel: click through slides
python3 ~/.claude/skills/animation-review/scripts/record_browser.py http://localhost:5173/carousel \
  -a 'wait:1000' -a 'click:.next' -a 'wait:800' -a 'click:.next' -a 'wait:800'

# Scroll-triggered animation
python3 ~/.claude/skills/animation-review/scripts/record_browser.py http://localhost:3000 \
  -a 'scroll:500' -a 'wait:1000' -a 'scroll:500' -a 'wait:1000'

# Custom viewport size
python3 ~/.claude/skills/animation-review/scripts/record_browser.py http://localhost:3000 \
  -W 1920 -H 1080 -a 'click:.hero-cta' -a 'wait:3000'

Action format for -a:

wait:MS — wait N milliseconds
click:SELECTOR — click an element
scroll:PIXELS — scroll down (negative for up)
hover:SELECTOR — hover over an element
press:KEY — press a key (Enter, Tab, ArrowRight, etc.)
type:SELECTOR|TEXT — type into an element (pipe separates selector from text)

Construct actions from what you know about the animation trigger. For page-load animations, no actions are needed — the recording captures from navigation onward.

Use --wait-before to adjust the pause after page load (default 500ms) and --wait-after to adjust the pause after the last action (default 1000ms).

The recorder prints a timeline to stderr showing when each action ran:

  → click:.play-btn (at 1.8s)
  → wait:2000 (at 1.9s)
Timeline: actions 1.8s–3.9s, total 4.9s

These timestamps are measured Python-side (wall clock), not from the video encoder, so they're approximate — always add ~1s buffer on each side when using --start/--end. For the example above, use --start 1s --end 5s.

After recording: notify the user

Once the recording finishes, **immediately move it to `.animatio

animation-review

How to add

Drop this on your repo README

Related skills

webapp-testing

brand-guidelines

frontend-design

mcp-builder

Get new Design e Frontend skills every Monday