Visual Review
You are about to inspect a rendered document the way a careful designer would: looking for things that will embarrass the user if shipped. Your job is to catch what the eye catches, but at machine scale and with consistency a tired human can't match.
What this skill is for
Reviewing the visual output of a document — what it looks like when rendered, not what the source code says. The render is the source of truth. A flawless HTML file that produces a clipped PDF is still a broken document.
Modes
Quick Pass (default): Phases 0, 1, 2, 3, 8, 9.
- What runs: precondition gate + input identification + render + per-page visual pass + report + fix-guidance
- What skips: cross-page consistency, type-specific pass, diff against reference, mechanical fixes
- Time: ~5–10 min for ≤20 pages
- Report sections: HOLD, FIX IF TIME, NOTE, BEYOND VISUAL, SUPPRESSED FINDINGS, MECHANICAL FIXES (skipped)
- Best for: time-constrained review, locked PDFs, first pass on any document, sandboxed environments
Full Review: Phases 0–9 (all phases).
- Requires: editable source, shell access (pdftoppm or Playwright)
- Time: ~30–60 min depending on page count
- Report sections: all sections including CROSS-PAGE DRIFT and TYPE-SPECIFIC CHECKS
- Best for: final pre-delivery QA, print production, decks with known brand drift
Text-only pass (fallback when no rendering available):
- Triggered by: Phase 0 detecting no pdftoppm and no Playwright
- What runs: content issues only — no visual findings
- Every finding labelled:
[TEXT-INFERRED — not visually confirmed] - Not a visual review — label the report accordingly
Select mode in Phase 0. Quick Pass is the default. Full Review requires explicit opt-in.
Scope — start broad, narrow later
Visual review is the primary job. But while you're reading the document carefully, you'll often notice issues adjacent to visual quality: a TOC entry pointing at the wrong page, a spec contradicting itself between sections, a citation that doesn't resolve, a price that disagrees with the price two pages later. Don't suppress those findings. A reviewer who notices a real problem and stays silent because "that's not visual" is doing the user a disservice. Surface them under a separate "Beyond visual" section in the report so the user can see at a glance what's strict-scope and what's bonus.
The only things genuinely out of scope:
- Accessibility tooling (screen-reader behaviour, WCAG contrast measurement) — different concern, different skill.
- Copy editing prose — you can flag a confusing sentence, but don't rewrite it.
If you're unsure whether something is in scope, surface it. The user can ignore noise faster than they can recover from a missed HOLD.
The workflow
You walk through these phases in order. Don't skip phases unless explicitly told the input is unsuitable for that phase (e.g. no source available → skip Phase 7 — mechanical fixes).
0. Precondition gate → 1. Identify input → 2. Render + capture metadata
↓
[Quick Pass: skip to Phase 8 after Phase 3]
↓
3. Per-page visual pass → 4. Cross-page consistency → 5. Type-specific pass
↓
6. Diff against reference → 7. Mechanical fixes (Full Review + editable source only)
↓
8. Action-first report → 9. Fix-guidance export
Phase 0 — Precondition gate (Quick Pass + Full Review)
Run before any work begins. Surface the results to the user, then branch.
# Check rendering environment
which pdftoppm 2>/dev/null && echo "pdftoppm: available" || echo "pdftoppm: NOT available"
npx playwright --version 2>/dev/null && echo "playwright: available" || echo "playwright: NOT available"
Present a summary:
Environment check:
pdftoppm: [available | NOT available]
playwright: [available | NOT available]
Source: [editable | read-only | not provided]
Mode: [to be selected]
Branch: no rendering available
If both pdftoppm and playwright are unavailable, inform the user:
"No rendering environment available. I can run a text-only pass covering content issues (TOC mismatches, cross-reference errors, factual contradictions) but cannot assess visual problems (clipping, overlaps, font fallback, density). Every finding will be labelled
[TEXT-INFERRED — not visually confirmed]. This is not a visual review — proceed? [y/n]"
If yes: proceed but apply the text-only label to all findings. Skip Phase 3–7. Go to Phase 8 with a TEXT-ONLY PASS warning in the report header.
Branch: source not editable
If source is a locked PDF, client-delivered file, or otherwise not writable:
- Disable Phase 7 (mechanical fixes cannot run)
- Phase 9 output will be "fix-guidance only" (no source edits)
- Note in report header:
Source: read-only — mechanical fixes skipped
Mode selection (ask once)
"Quick Pass (Phases 0–3 + 8–9, ~5–10 min for ≤20 pages) or Full Review (all phases, requires editable source, ~30–60 min)? Default: Quick Pass."
Quick Pass skips Phases 4, 5, 6, and 7. Quick Pass report omits the CROSS-PAGE DRIFT and TYPE-SPECIFIC sections.
If page count > 50 and Full Review selected:
"This document has [N] pages. Full Review on 50+ pages generates 35+ findings. The fix loop will not be reliable at this scale — consider Quick Pass or reviewing in 20-page sections."
Commit the mode and render-method to the report header before proceeding.
HTML source overflow pre-flight (mandatory when HTML source is available)
If the input includes an HTML source file with fixed-height pages (any page using height: 210mm, height: 794px, or overflow: hidden in print CSS), run the Playwright overflow audit BEFORE exporting to PDF:
// Serve the HTML via local HTTP server first if file:// is blocked
// python3 -m http.server 7823 --directory <project-root>
await page.goto('http://localhost:7823/<path-to-html>');
const overflowReport = await page.evaluate(() => {
const results = [];
document.querySelectorAll('section.page, .page, [class*="page"]').forEach((page, i) => {
const overflow = page.scrollHeight - page.clientHeight;
const label = page.getAttribute('data-screen-label') || page.id || `Element ${i+1}`;
const isLongForm = page.classList.contains('page--long-form') ||
page.classList.contains('long-form');
if (overflow > 2 && !isLongForm) {
results.push({ index: i+1, label, overflowPx: overflow });
}
});
return results;
});
Every page with overflowPx > 0 is a guaranteed HOLD finding — content will be silently clipped in the PDF. Do not proceed to PDF export without resolving or explicitly noting all overflow findings.
Pages within 10px of the boundary (0 < overflowPx ≤ 10) should also be checked visually in the PDF — Chrome's print rendering can clip content that appears to fit in screen mode due to rounding differences.
Note: page--long-form pages are excluded from this check because they intentionally overflow in screen mode and use height: auto in @media print.
Phase 1 — Identify input
Ask yourself (and the user if unclear):
- What was given? A PDF path? An HTML file? Both (HTML source + rendered PDF)? A URL?
- What kind of document? Slide deck, book/manuscript, multi-page report, brochure, single page. This determines which
references/<type>.mdto load later. - Is there a brand reference? A previous version, template, or brand-guidelines PDF to compare against?
- Is source available? If yes, source-mapping is possible (you can point at the line that produced the glitch, and you can apply mechanical fixes). If no, you can only describe page+region.
If any of this is ambiguous, ask one short clarifying question. Do not guess — guessing on doc type leads to applying the wrong reference checklist.
Phase 2 — Render + capture metadata
For PDFs, use pdftoppm (poppler) —