Website Replication
Required tooling: at least one browser automation MCP (Chrome MCP / Playwright MCP / Claude Preview). Without it, the skill degrades to static HTML fetch and every interaction is forced to
inferred; coverage drops to 0% by definition. See Tooling for the full list and Troubleshooting for the no-browser fallback.Agent harness: this skill is markdown + JS + YAML; any agent harness that can load markdown skills and call file + browser tools can run it. Claude Code and OpenAI Codex are the tested harnesses; other harnesses work by symlinking or copying the directory into their skill location.
中文导读:SKILL.zh.md(人类阅读用镜像;agent 仍加载本英文版)
Audit any reference website — typically a competitor, but the same workflow applies to legacy versions of your own product, partner integrations, or inspiration sources you want to learn behavior from. "Competitor" throughout this document means the site being audited, not necessarily a market rival.
Core Rule
Replicate useful product behavior, not protected expression. Do not copy logos, exact copy, proprietary assets, distinctive page composition, or a page structure so literally that it creates infringement risk. Preserve workflow intent, configuration fields, state handling, and backend capability mapping, while using original branding, copy, imagery, and visual rhythm unless the user explicitly asks for research-only comparison.
Out Of Scope
- Legal / IP / ToS compliance judgement. Flag risk, do not make a legal call. Recommend the user consult counsel for trademark, patent, or ToS questions.
- Performance benchmarking, SEO ranking, or marketing-channel analysis. Not feature parity. Use a dedicated skill.
- Pure visual inspiration ("their UI feels nicer"). This skill assumes you want behavior + structure parity, not just a style guide; reach for a lighter style-extraction tool instead.
- Authenticated or paid state the user cannot legitimately access. Mark blocked. Do not bypass auth, paywalls, rate limits, robots restrictions, or technical protections.
Evidence Safety
-
Redact secrets and personal data before saving or reporting evidence: cookies, authorization headers, session IDs, tokens, account identifiers, customer data, uploaded file contents, message/prompt contents that may contain private data, and one-time URLs.
-
For network traces, record method, route pattern, auth class, redacted payload shape, response shape, status code, and error class. Do not paste raw credential-bearing headers or full private payloads.
-
Respect access boundaries. If a state requires paid or authenticated access that is unavailable, mark it as blocked and infer only from visible evidence or official docs.
-
Audit output lives in the consumer project (wherever the skill is invoked), not in the skill repo. Add the following block to that project's
.gitignoreso the snapshot cache survives across machines while high-risk evidence stays local:# Audit output written by website-replication-skill. # Tracks PNG screenshots and MANIFEST.md (the cross-audit snapshot index); # ignores DOM dumps, network traces, reports, and other high-risk types. # IMPORTANT: review every PNG for visible PII (usernames, emails, customer # content, internal IDs) BEFORE committing. # To lock the directory down entirely, replace this whole block with: audit/ audit/**/*.html audit/**/*.htm audit/**/*.har audit/**/*.har.gz audit/**/*.json audit/**/*.md audit/**/*.txt audit/**/*.log audit/**/*.csv !audit/**/MANIFEST.md audit/**/network/ audit/**/dom/ audit/**/reports/If the consumer project is public, or you cannot guarantee PNG redaction, prefer the conservative
audit/line instead.
Required Inputs
Collect or infer:
- Competitor URL(s) and the target product / page scope.
- Existing codebase path, if rebuilding into a repo.
- Existing API docs, integration docs, schemas, or backend constraints.
- Differentiation preference: "workflow parity with original style", "same features with target design system", or "research only".
- Desired output depth: quick audit, implementation-ready audit, or PRD handoff.
If any input is unavailable, proceed with explicit assumptions and mark unknowns. Do not block unless the task requires authenticated data, paid access, or private target-system details that cannot be simulated safely. Without a target repo, integration docs, or API constraints, produce research and a gap plan only; do not claim the result is implementation-ready.
Tooling
Pick whatever is available; degrade gracefully and re-classify evidence accordingly.
-
Browser automation (click, screenshot, DOM dump, network capture): prefer a browser MCP (e.g. Chrome MCP, Playwright MCP, Claude Preview) or a headless runner. Without one, fall back to static fetch + HTML parse and mark every interaction
inferred. -
Network inspection: DevTools panel via the browser MCP, or
curl -vfor unauthenticated endpoints. -
Mobile: a mobile MCP / device farm for real screenshots; otherwise emulate via DevTools responsive mode and mark device-specific behavior
inferred. -
Static docs / API references:
WebFetchor a web-reader MCP. -
Default evidence directory and cross-audit screenshot reuse: root at
./audit/<site-slug>/(honor a user-provided path if given). Layout:audit/<site-slug>/ ├── MANIFEST.md # central index, one row per unique URL × viewport × auth ├── snapshots/<YYYY-MM-DD>/ # screenshots + DOM dumps captured that day ├── network/<YYYY-MM-DD>/ # network traces (time-sensitive — always fresh, never reused) └── reports/<YYYY-MM-DD>.md # the audit deliverableMANIFEST.mdis the single source of truth for "have I captured this before?". Format:| URL | Viewport | Auth | Last Captured | Snapshot | DOM | Notes | | --- | --- | --- | --- | --- | --- | --- | | https://example.com/dashboard | desktop-1440 | free | 2026-05-15 | snapshots/2026-05-15/dashboard.png | snapshots/2026-05-15/dashboard.html | post-redesign |Reuse rule (time-based, 30-day window): if a manifest entry exists and
Last Capturedis within 30 days, reuse the stored snapshot and DOM, and tag the evidence rowobserved (cached from <date>). Otherwise capture fresh and append / update the manifest. Force-fresh capture when the user explicitly asks or the cached snapshot is visibly stale against the live page. Network traces are never reused — auth class, rate-limit headers, and A/B bucketing are all time-sensitive.
Workflow
- Define scope and evidence
- List every competitor page, route, tab, mode, drawer, modal, and post-submit state in scope.
- Read
audit/<site-slug>/MANIFEST.mdfirst (format: references/manifest-template.md). For each in-scope URL × viewport × auth-state, look up the row. Cache hit (entry exists,Last Capturedwithin the cache window — default 30 days, see Configuration) → reuse the stored snapshot + DOM, tag the evidence rowobserved (cached from <date>), do not re-capture. Cache miss or stale → capture fresh and append / update the manifest row. CreateMANIFEST.mdwith the template header if it does not exist yet. - Split the page into named regions early (shell / sidebar, primary work area, secondary panel, bottom action rail or player, global overlays) and note how each region affects the others. Do this even if step 3 will formalize the model later.
- Capture desktop and mobile screenshots only for URLs that missed the cache. Prefer full-page screenshots plus focused component screenshots.
- Save redacted evidence: screenshots, control inventories (buttons / inputs / links), network calls, console errors, and a structural DOM snapshot. For the DOM snapshot pick one of: (a) the browser-MCP's built-in accessibility-tree tool (e.g.
take_snapshoton chro