opencli-browser
The first reader of this CLI is an agent, not a human. Every subcommand returns a structured envelope that tells you exactly what matched, how confident the match is, and what to do if it didn't. Lean on those envelopes — do not guess.
This skill is for driving a live browser to accomplish an agent task. If you are building a reusable adapter under ~/.opencli/clis/<site>/ use opencli-adapter-author instead.
Prerequisites
opencli doctor
Until doctor is green, nothing else will work. Typical failures: Chrome not running, extension not installed, debug port blocked by 1Password / other extensions. The doctor output tells you which.
Session lifecycle
opencli browser *commands require a<session>positional immediately afterbrowser. Use the same session name for a multi-step flow; use a different name to isolate parallel browser work.- Use a stable session name for any multi-command or human-paced browser workflow. Example:
opencli browser fb-yaya-warmup open https://example.com, then reuseopencli browser fb-yaya-warmup state,extract,click, etc. - Owned browser sessions keep a tab lease alive between calls. Release it with
opencli browser <session> closeor let the idle timeout expire. opencli browser <session> bindbinds the Chrome tab you already have open to that session. Use this for logged-in pages, SSO flows, or pages you manually positioned before handing control to the agent.--window foreground|background(orOPENCLI_WINDOW=foreground|background) chooses whether OpenCLI creates/focuses a foreground browser window or uses a background browser window for owned sessions.
Bind Tab
opencli browser gmail bind
opencli browser gmail state
opencli browser gmail click "Search"
opencli browser gmail network
opencli browser gmail unbind
Binding never owns the user window and never closes the user tab. It fails closed if the tab is closed or becomes non-debuggable. Re-run opencli browser <session> bind when you switch to a different real tab.
Navigation is allowed on bound sessions because the session now represents explicit agent ownership of that tab. Tab mutation (tab new, tab select, tab close) is still blocked for bound sessions. Use an owned session when you want OpenCLI to manage tab lifecycle.
Bound sessions have no OpenCLI idle-close timer; the binding lasts until unbind, tab close, window close, or daemon restart.
Mental model
- Selector-first target contract. Every interaction command (
click,type,select,get text/value/attributes) takes one<target>, which is either a numeric ref fromstate/findor a CSS selector. Use--nth <n>to disambiguate multiple CSS matches. - Every envelope reports
matches_nandmatch_level.match_levelisexact,stable, orreidentified— the CLI already rescued moderate DOM drift for you, but the level tells you how confident to be. - Compact output first, full payload on demand.
stateis a budget-aware snapshot;get html --as jsonsupports--depth/--children-max/--text-max;networkreturns shape previews and you re-fetch a single body with--detail <key>. If you emit a giant payload you are burning context you did not need to burn. - Structured errors are machine-readable. On failure the CLI emits
{error: {code, message, hint?, candidates?}}. Branch oncode, not on message strings.
Critical rules
- Always inspect before you act. Run
stateorfindfirst. Never hard-code a ref or selector from memory across sessions — indices are per-snapshot. - Prefer site adapters before raw browser driving. If
opencli <site> <command>already covers the task, use that adapter command first (opencli facebook notifications,opencli reddit read, etc.). Useopencli browser ...only for gaps, debugging, or one-off UI flows the adapter does not expose. - Prefer numeric ref over CSS once you have it. Numeric refs survive mild DOM shifts because the CLI fingerprints each tagged element. A CSS selector written by hand will break the first time the site re-renders.
- Read
match_levelafter every write.exact= all good.stable= the element is the same but some soft attrs drifted — your action still applied.reidentified= the original ref was gone and the CLI found a unique replacement; double-check you hit the right element. - Use the
compoundfield for form controls. Do not regex-guess a date format, do notstatetwice to get the full<select>options list. The compound envelope has the format string, full option list up to 50,options_totalfor overflow, andaccept/multiplefor<input type=file>. - Verify writes that matter. After
type <target> <text>, runget value <target>. Afterselect, runget value. Autocomplete widgets, React controlled inputs, and masked fields all silently eat characters. The CLI cannot detect this for you. state→ action →stateafter a page change. Navigations, form submits, and SPA route changes invalidate refs. Take a fresh snapshot. Do not reuse refs from before the transition.- Chain with
&&when reusing freshly parsed refs. A chained sequence runs in one shell so the ref you just read from output can be passed directly to the next command. Separate shell invocations keep the named browser session, but any shell-local variables or copied refs from the previous command can go stale after page changes. evalis read-only. Wrap the JS in an IIFE and return JSON. If you need to change the page, use the structuredclick/type/select/keyscommands instead — they produce structured output and fingerprints,evaldoes not.- Prefer
networkto screen-scraping. If a page you care about fetches its data from a JSON API, the API is almost always more reliable than scraping the rendered DOM. Capture once, inspect the shape, then--detail <key>the body you need.
Target contract (<target> for click / type / select / get text|value|attributes)
<target> ::= <numeric-ref> | <css-selector>
- Numeric ref — the
[N]index fromstateorfind. Cheap, resilient to soft DOM drift. - CSS selector — anything
querySelectorAllaccepts. Must be unambiguous on write ops, or pair with--nth <n>.
Envelope on success
{ "clicked": true, "target": "3", "matches_n": 1, "match_level": "exact" }
{ "value": "kalevin@example.com", "matches_n": 1, "match_level": "stable" }
match_level
| level | meaning | you should |
|---|---|---|
exact | Fingerprint agreed on tag + strong IDs with at most one soft drift | Proceed. |
stable | Tag + strong IDs still agree, soft signals (aria-label, role, text) drifted | Proceed, but if what you typed/clicked matters, re-check with get value or state. |
reidentified | Original ref was gone; a unique live element matched the fingerprint and was re-tagged with the old ref | Double-check you hit the right element before chaining more writes. |
Structured error codes
Branch on these, not on the human message:
| code | meaning |
|---|---|
not_found | Numeric ref is no longer in the DOM. Re-state. |
stale_ref | Ref exists but the element at that ref changed identity. Re-state. |
invalid_selector | CSS was rejected by querySelectorAll. Fix the selector. |
selector_not_found | CSS matches 0 elements. Try find with a looser selector. |
selector_ambiguous | CSS matches >1 and no --nth. Add --nth or narrow the selector. |
selector_nth_out_of_range | --nth beyond match count. |
option_not_found | select couldn't find an option matching that label/value. Error envelope includes available: string[] of the real option labels. |
not_a_select | select was called on a non-<select> element. |
Error envelope always includes `