Browse: Token-Efficient Web Browsing
Guide AI agents through web browsing tasks using the cheapest tool that gets the job done. Every browsing action has a token cost; this skill minimizes it through progressive disclosure, smart format selection, and backend-aware strategies.
Target versions (May 2026):
- Lightpanda: 0.3.1
- @playwright/mcp: 0.0.75
- agent-browser: 0.27.0
When to use
- Reading a web page, article, or documentation site
- Extracting structured data from a page (prices, tables, metadata)
- Filling forms, clicking buttons, or navigating multi-step flows
- Scraping content from JavaScript-rendered pages (SPAs)
- Browsing behind authentication (login flows)
- Any task where the agent needs to see or interact with a live web page
When NOT to use
- E2E test automation, test writing, or test debugging - use testing
- Building or debugging MCP servers (including browser MCP servers) - use mcp
- Network configuration, DNS, reverse proxies - use networking
- Fetching API endpoints or REST calls - use curl/fetch directly
- Static file downloads - use curl or wget
- Web scraping specifically for RAG pipelines or training data - use ai-ml for the pipeline
AI Self-Check
Before returning any browsing result, verify:
- Used the cheapest tool available for the task - no Playwright when WebFetch would have worked
- Did not dump full HTML into context when markdown or structured data was sufficient
- Waited for dynamic content before extracting from SPAs (networkidle or --wait-selector)
- Stripped boilerplate (nav, ads, footers) before returning content to the user
- Scoped extraction to the relevant section, not the whole page
- Did not hardcode credentials - used env vars, secret manager, or user prompt
- Cleared or isolated cookies/storage between unrelated accounts or tenants
- Used semantic roles before CSS selectors for interaction targets
- Used screenshots only when visual layout or rendered state mattered
- Recorded URL and access date for facts likely to change
- Did not automate destructive account actions unless the user named the exact action and target
- Re-extracted page state after any click or form submission before making decisions
- Escalated to the next tool tier on failure rather than retrying the same tool
- Current source checked: dated versions, CLI flags, API names, and support windows are verified against primary docs before repeating them
- Hidden state identified: local config, credentials, caches, contexts, branches, cluster targets, or previous runs are made explicit before acting
- Verification is real: final checks exercise the actual runtime, parser, service, or integration point instead of only linting prose or happy paths
- Routing overlap checked: overlapping skills, trigger terms, and "When NOT to use" boundaries are checked before returning guidance
- Spec claims verified: claims about tool behavior, output contracts, or repo conventions are checked against current docs, scripts, or skill files
- Robots and terms considered: scraping or automation respects access rules, auth boundaries, and rate limits
- Dynamic content verified: browser-rendered pages are checked with the real tool when static HTML may be incomplete
Tool Selection
Detect what's available and pick the cheapest tool that handles the task.
Detection
- MCP browsing tools: look for goto, navigate, markdown, semantic_tree, browser_navigate, browser_snapshot in the available tool list
- CLI tools: check for lightpanda and agent-browser in PATH
- Built-in fetch: WebFetch tool (Claude Code) or platform equivalent
- Fallback: curl via shell
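The CLI part of the detection list can be sketched as a small shell function. This is a minimal sketch, assuming the binaries are installed under exactly the names `lightpanda`, `agent-browser`, and `curl`; the function name `detect_browse_clis` is hypothetical. MCP tools and WebFetch are not visible from the shell, so they still have to be checked in the agent's tool list.

```shell
# List which of the CLI tools named above are on PATH, one per line.
# Assumption: binaries are named exactly "lightpanda", "agent-browser", "curl".
detect_browse_clis() {
  for tool in lightpanda agent-browser curl; do
    # command -v is the POSIX way to test whether a command resolves
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

detect_browse_clis
```

An empty result means no CLI path is available at all, which is the signal to rely on MCP tools or the built-in fetch instead.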
Decision matrix
| Tier | No JS needed | JS needed, read-only | JS needed, interactive |
|---|---|---|---|
| Best | WebFetch / curl | Lightpanda fetch | Lightpanda MCP tools |
| Good | Lightpanda fetch | MCP markdown tool | agent-browser CLI |
| Fallback | curl | Playwright MCP | Playwright MCP |
If the page works without JavaScript, don't use a browser. If you only need to read content, don't use interactive tools. Escalate only when the cheaper option fails.
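The matrix above can be read as a lookup from two questions (does the page need JavaScript, is the task interactive) to an ordered list of tiers. The following is an illustrative sketch only; the function name `tool_tiers` and its `yes`/`no` arguments are hypothetical, and the tier names are copied verbatim from the table.

```shell
# Print the tiers from the decision matrix, cheapest first.
# Usage: tool_tiers <needs_js: yes|no> <interactive: yes|no>
# The matrix has no "no JS, interactive" column, so needs_js=no always
# maps to the first column regardless of the second argument.
tool_tiers() {
  needs_js=$1
  interactive=$2
  if [ "$needs_js" = "no" ]; then
    printf '%s\n' "WebFetch / curl" "Lightpanda fetch" "curl"
  elif [ "$interactive" = "no" ]; then
    printf '%s\n' "Lightpanda fetch" "MCP markdown tool" "Playwright MCP"
  else
    printf '%s\n' "Lightpanda MCP tools" "agent-browser CLI" "Playwright MCP"
  fi
}

tool_tiers yes no
```

Reading the output top to bottom gives the escalation order for that kind of task.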
Task modes
| Mode | Use when | First tool |
|---|---|---|
| Static fetch | Content is in HTML | WebFetch, curl, or Lightpanda fetch |
| JS read-only | Content requires rendering | Lightpanda or Playwright markdown |
| Interactive | Click, fill, or navigate statefully | MCP browser tools |
| Authenticated | User-approved account context is required | Existing authenticated browser session |
| Screenshot | Visual layout matters | Browser screenshot after DOM snapshot |
| Structured extraction | Tables, JSON-LD, prices, metadata | API, JSON-LD, DOM selectors, then browser |
Tool availability check: before starting, verify what's available. If the best tool for the task isn't present, skip straight to the next tier rather than failing mid-workflow.
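One way to express "skip to the next tier rather than failing" is a loop that tries each candidate command in order and stops at the first one that produces output. This is a sketch under assumptions: `try_tiers` is a hypothetical helper, and it treats empty output as a failure, which may not suit every page.

```shell
# Run each candidate command in order; print the output of the first one
# that exits 0 AND produces non-empty output. Returns 1 if every tier fails.
# A missing binary makes sh exit with 127, so that tier is skipped.
# Each argument is evaluated by sh -c, so pass only trusted command strings.
try_tiers() {
  for cmd in "$@"; do
    if out=$(sh -c "$cmd" 2>/dev/null) && [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
  done
  return 1
}

# Usage (hypothetical URL):
#   try_tiers \
#     'lightpanda fetch --dump markdown --strip-mode full "https://example.com/docs"' \
#     'curl -sL "https://example.com/docs"'
```

Each tier is tried exactly once, so a failing tool is never retried; the loop only moves forward.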
Performance
- Prefer official APIs, sitemaps, or static fetches before launching a browser.
- Extract only required page regions; avoid dumping full DOMs, screenshots, or network logs into context.
- Reuse browser sessions for multi-step flows, but clear cookies/storage between unrelated accounts or tenants.
- Use screenshots only when visual layout or rendered state matters.
Best Practices
- Use stable selectors and semantic roles before brittle CSS paths.
- Record URL and access date for facts likely to change.
- Clear cookies/storage between unrelated accounts or tenants.
- Do not automate destructive account actions unless the user names the exact action and target.
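Recording the URL and access date can be as simple as emitting one provenance line next to the extracted fact. A minimal sketch, assuming a hypothetical helper name `provenance_note` and an ISO date in UTC; the output wording is illustrative, not a required format.

```shell
# Print a one-line provenance note: source URL plus today's date in UTC.
provenance_note() {
  printf 'Source: %s (accessed %s)\n' "$1" "$(date -u +%Y-%m-%d)"
}

# Hypothetical URL for illustration
provenance_note "https://example.com/pricing"
```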
Workflow
Step 1: Assess the task
Before touching any tool, answer these in order - each answer narrows the tool choice:
- Read or interact? Read-only -> skip to Step 2. Interactive -> go to Step 3/4.
- Static or dynamic? View page source or check URL patterns - if the content is in the HTML, it's static. SPA frameworks (React, Vue, Angular) need JS rendering.
- Single page or multi-step? Multi-step flows need session persistence (MCP or serve mode).
- What output format? Markdown for human reading, structured data / JSON-LD for extraction, semantic tree for element discovery, links for crawl planning.
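The static-or-dynamic question above can often be answered cheaply: fetch the raw HTML and check whether a phrase you expect on the page is already there. The sketch below assumes you know such a phrase in advance; `classify_page` is a hypothetical helper, and a match only suggests the page is static, since the phrase could also appear in a script payload.

```shell
# Read raw HTML on stdin. Print "static" if the expected text is already
# present, otherwise "dynamic" (the content is probably rendered by JS).
# -F matches the needle literally; -q suppresses grep's own output.
classify_page() {
  if grep -qF -- "$1"; then
    echo static
  else
    echo dynamic
  fi
}

# Usage (hypothetical URL and phrase):
#   curl -sL "https://example.com/docs" | classify_page "Installation"
```

A "dynamic" result is the cue to use a JS-capable tier in Step 2 rather than a plain fetch.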
Step 2: Try the cheapest path first
Static content (docs, articles, blogs):
```shell
# Option A: built-in fetch (lowest overhead, no setup)
# Use WebFetch tool with the URL directly

# Option B: Lightpanda CLI (better stripping, selector waits)
lightpanda fetch --dump markdown --strip-mode full <url>

# Option C: curl (always available, raw HTML only)
curl -sL <url>
```
JS-rendered content (SPAs, dashboards):
```shell
# Lightpanda CLI with wait
lightpanda fetch --dump markdown --strip-mode full --wait-until networkidle <url>

# With selector wait for specific content
lightpanda fetch --dump markdown --wait-selector ".main-content" <url>
```
Interactive tasks (forms, clicks, multi-step): use MCP tools or agent-browser CLI (Step 3).
Step 3: Navigate and extract
With MCP browsing tools (Lightpanda MCP or Playwright MCP):
Navigate first, then extract using the cheapest format:
| Format | Tool | Tokens (typical) | Use when |
|---|---|---|---|
| Semantic tree | semantic_tree / browser_snapshot | ~200-500 | Finding elements to interact with |
| Markdown | markdown | ~500-2000 | Reading text content |
| Links only | links | ~100-300 | Finding URLs to follow |
| Structured data | structuredData / structured_data | ~100-500 | Getting metadata (OpenGraph, JSON-LD) |
| Interactive elements | interactiveElements / interactive_elements | ~200-400 | Finding clickable/fillable elements |
| Full HTML | page-html resource | ~5000-50000 | Last resort only |
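The token figures in the table are typical ranges, so it can help to estimate the size of an extraction before pulling it into context. The sketch below uses a rough rule of thumb of about 4 characters per token; that ratio is an assumption, not a measurement of any particular tokenizer, and `estimate_tokens` is a hypothetical helper.

```shell
# Rough token estimate for text on stdin.
# Assumption: ~4 characters per token (rule of thumb, varies by tokenizer).
estimate_tokens() {
  chars=$(wc -c)
  echo $(( chars / 4 ))
}

# Usage (hypothetical file names): compare two saved extractions
#   estimate_tokens < page.md
#   estimate_tokens < page.html
```

If the estimate for a format lands far above its typical range in the table, that is a hint to scope the extraction more tightly or pick a cheaper format.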
**Foll