Firecrawl CLI
Search, scrape, and interact with the web. Returns clean markdown optimized for LLM context windows.
Run firecrawl --help or firecrawl <command> --help for full option details.
If the task is to integrate Firecrawl into an application, add FIRECRAWL_API_KEY to a project, or choose endpoint usage in product code, use the firecrawl-build skills. If the task is an outcome workflow such as deep research, SEO audit, QA, lead generation, knowledge-base creation, dashboard reporting, shopping research, or website design-system extraction, use the firecrawl-workflows skills. They are already installed alongside this CLI skill when you run firecrawl init.
Prerequisites
Must be installed and authenticated. Check with firecrawl --status.
🔥 firecrawl cli v1.8.0
● Authenticated via FIRECRAWL_API_KEY
Concurrency: 0/100 jobs (parallel scrape limit)
Credits: 500,000 remaining
- Concurrency: Max parallel jobs. Run parallel operations up to this limit.
- Credits: Remaining API credits. Each operation consumes credits.
If not ready, see rules/install.md. For output handling guidelines, see rules/security.md.
Before doing real work, verify the setup with one small request:
mkdir -p .firecrawl
firecrawl scrape "https://firecrawl.dev" -o .firecrawl/install-check.md
firecrawl search "query" --scrape --limit 3
Workflow
Follow this escalation pattern:
- Search - No specific URL yet. Find pages, answer questions, discover sources.
- Scrape - Have a URL. Extract its content directly.
- Map + Scrape - Large site or need a specific subpage. Use
map --searchto find the right URL, then scrape it. - Crawl - Need bulk content from an entire site section (e.g., all /docs/).
- Interact - Scrape first, then interact with the page (pagination, modals, form submissions, multi-step navigation).
| Need | Command | When |
|---|---|---|
| Find pages on a topic | search | No specific URL yet |
| Get a page's content | scrape | Have a URL, page is static or JS-rendered |
| Find URLs within a site | map | Need to locate a specific subpage |
| Bulk extract a site section | crawl | Need many pages (e.g., all /docs/) |
| AI-powered data extraction | agent | Need structured data from complex sites |
| Interact with a page | scrape + interact | Content requires clicks, form fills, pagination, or login |
| Download a site to files | download | Save an entire site as local files |
| Parse a local file | parse | File on disk (PDF, DOCX, XLSX, etc.) — not a URL |
| Watch pages for changes | monitor | Schedule recurring scrapes/crawls, diff against snapshots |
For detailed command reference, run firecrawl <command> --help.
Scrape vs interact:
- Use
scrapefirst. It handles static pages and JS-rendered SPAs. - Use
scrape+interactwhen you need to interact with a page, such as clicking buttons, filling out forms, navigating through a complex site, infinite scroll, or when scrape fails to grab all the content you need. - Never use interact for web searches - use
searchinstead.
Monitor: Schedule recurring scrapes or crawls and diff each result against the last retained snapshot. Use for product pages, docs, blogs, changelogs, competitor sites — any page where changes matter. Each check labels pages as same, new, changed, removed, or error, with webhook and email notification options.
Subcommands: create | list | get | update | delete | run | checks | check.
# create from flags
firecrawl monitor create --name "Blog" --schedule "every 30 minutes" \
--scrape-urls https://example.com/blog --email alerts@example.com
# or from JSON (positional file, or piped stdin)
firecrawl monitor create monitor.json
cat monitor.json | firecrawl monitor create
firecrawl monitor list --limit 20
firecrawl monitor run <monitorId> # trigger a check now
firecrawl monitor checks <monitorId> # list checks
firecrawl monitor check <monitorId> <checkId> --page-status changed
firecrawl monitor update <monitorId> --state paused
firecrawl monitor delete <monitorId>
Schedules accept cron (--cron "*/30 * * * *") or natural language (--schedule "every 30 minutes"). Minimum interval is 15 minutes. Targets are either --scrape-urls a,b,c (scrape) or --crawl-url <url> (crawl whole site each check). Note: --state (not --status) sets active/paused; --page-status (not --status) filters page results on check — avoids collision with the global --status flag. Monitoring is not available for zero-data-retention teams.
JSON-mode change tracking: By default monitors diff each page's markdown and you get a unified text diff back. When you care about specific structured fields (price, headline, in-stock flag, items in a list) instead of the whole page, add a changeTracking format with modes: ["json"] and a JSON schema to the target's scrapeOptions.formats. The flag-based form doesn't cover this — pass a JSON body via file or stdin:
cat > pricing-monitor.json <<'EOF'
{
"name": "Pricing watch",
"schedule": { "text": "hourly", "timezone": "UTC" },
"targets": [{
"type": "scrape",
"urls": ["https://example.com/pricing"],
"scrapeOptions": {
"formats": [{
"type": "changeTracking",
"modes": ["json"],
"prompt": "Extract pricing tiers and headline features for each plan.",
"schema": {
"type": "object",
"properties": {
"plans": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"price": { "type": "string" },
"features": { "type": "array", "items": { "type": "string" } }
}
}
}
}
}
}]
}
}]
}
EOF
firecrawl monitor create pricing-monitor.json
The check response then carries a per-field diff (paths like plans[0].price) and the full extraction at this run, instead of (or in addition to) a markdown diff. Each changed page in pages[] looks like:
{
"url": "https://example.com/pricing",
"status": "changed",
"diff": {
"json": {
"plans[0].price": { "previous": "$19/mo", "current": "$24/mo" },
"plans[1].features[2]": {
"previous": "10 GB storage",
"current": "25 GB storage"
}
}
},
"snapshot": {
"json": {
"plans": [
/* current full extraction */
]
}
}
}
Use modes: ["json", "git-diff"] for mixed mode: you get both diff.json (per-field) and diff.text (markdown sidecar), and the page is marked changed whenever either surface changed. For markdown-only monitors, diff.text holds the unified diff and diff.json is a parse-diff AST ({ files: [...] }); there is no snapshot.
Avoid redundant fetches:
search --scrapealready fetches full page content. Don't re-scrape those URLs.- Check
.firecrawl/for existing data before fetching again.
When to Load References
- Searching the web or finding sources first -> firecrawl-search
- Scraping a known URL -> firecrawl-scrape
- Finding URLs on a known site -> firecrawl-map
- **Bulk extraction from a d