Firecrawl CLI

Search, scrape, and interact with the web. Returns clean markdown optimized for LLM context windows.

Run firecrawl --help or firecrawl <command> --help for full option details.

If the task is to integrate Firecrawl into an application, add FIRECRAWL_API_KEY to a project, or choose endpoint usage in product code, use the firecrawl-build skills. If the task is an outcome workflow such as deep research, SEO audit, QA, lead generation, knowledge-base creation, dashboard reporting, shopping research, or website design-system extraction, use the firecrawl-workflows skills. Both skill sets are installed alongside this CLI skill when you run firecrawl init.

Prerequisites

The CLI must be installed and authenticated. Check with firecrawl --status.

  🔥 firecrawl cli v1.8.0

  ● Authenticated via FIRECRAWL_API_KEY
  Concurrency: 0/100 jobs (parallel scrape limit)
  Credits: 500,000 remaining

Concurrency: Max parallel jobs. Run parallel operations up to this limit.
Credits: Remaining API credits. Each operation consumes credits.
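
When scripting around these two numbers, it can help to read them out of the status text. The following is a minimal sketch that assumes the status output keeps the exact line shapes shown in the sample above (and carries no color codes when piped); status_concurrency_limit and status_credits are hypothetical helper names, not part of the CLI.

```shell
# Hypothetical helpers (not part of firecrawl): read status text on stdin.
# Assumes lines shaped like "Concurrency: 0/100 jobs" and
# "Credits: 500,000 remaining", as in the sample output above.

# Print the maximum number of parallel jobs (the number after the slash).
status_concurrency_limit() {
  sed -n 's|.*Concurrency: [0-9]*/\([0-9]*\) jobs.*|\1|p'
}

# Print the remaining credits with thousands separators removed.
status_credits() {
  sed -n 's|.*Credits: \([0-9,]*\) remaining.*|\1|p' | tr -d ','
}

# Usage (requires an installed, authenticated CLI):
#   limit=$(firecrawl --status | status_concurrency_limit)
#   credits=$(firecrawl --status | status_credits)
```

The parsed limit can then cap how many scrapes a script starts in parallel, and the credit count can gate a large job before it begins.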

If not ready, see rules/install.md. For output handling guidelines, see rules/security.md.

Before doing real work, verify the setup with one small request:

mkdir -p .firecrawl
firecrawl scrape "https://firecrawl.dev" -o .firecrawl/install-check.md

firecrawl search "query" --scrape --limit 3

Workflow

Follow this escalation pattern:

Search - No specific URL yet. Find pages, answer questions, discover sources.
Scrape - Have a URL. Extract its content directly.
Map + Scrape - Large site or need a specific subpage. Use map --search to find the right URL, then scrape it.
Crawl - Need bulk content from an entire site section (e.g., all /docs/).
Interact - Scrape first, then interact with the page (pagination, modals, form submissions, multi-step navigation).

Need	Command	When
Find pages on a topic	`search`	No specific URL yet
Get a page's content	`scrape`	Have a URL, page is static or JS-rendered
Find URLs within a site	`map`	Need to locate a specific subpage
Bulk extract a site section	`crawl`	Need many pages (e.g., all /docs/)
AI-powered data extraction	`agent`	Need structured data from complex sites
Interact with a page	`scrape` + `interact`	Content requires clicks, form fills, pagination, or login
Download a site to files	`download`	Save an entire site as local files
Parse a local file	`parse`	File on disk (PDF, DOCX, XLSX, etc.) — not a URL
Watch pages for changes	`monitor`	Schedule recurring scrapes/crawls, diff against snapshots
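
The first rows of the table can be sketched as a small dispatcher. This is only an illustration of the escalation order, assuming three input kinds (a file on disk, a URL, or a free-text query); first_command_for is a hypothetical helper name and it does not cover map, crawl, agent, interact, download, or monitor.

```shell
# Hypothetical helper (not part of firecrawl): suggest the first command to
# try for an input, following the escalation order described above.
first_command_for() {
  if [ -f "$1" ]; then
    echo "parse"      # file on disk (PDF, DOCX, XLSX, etc.), not a URL
    return 0
  fi
  case "$1" in
    http://*|https://*) echo "scrape" ;;  # have a URL: extract it directly
    *)                  echo "search" ;;  # no specific URL yet: find sources
  esac
}

# Usage:
#   first_command_for "https://firecrawl.dev"    # prints: scrape
#   first_command_for "firecrawl pricing plans"  # prints: search
```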

For detailed command reference, run firecrawl <command> --help.

Scrape vs interact:

Use scrape first. It handles static pages and JS-rendered SPAs.
Use scrape + interact when the page needs interaction (clicking buttons, filling out forms, navigating through a complex site, loading infinite-scroll content) or when scrape alone fails to capture all the content you need.
Never use interact for web searches - use search instead.

Monitor: Schedule recurring scrapes or crawls and diff each result against the last retained snapshot. Use for product pages, docs, blogs, changelogs, competitor sites — any page where changes matter. Each check labels pages as same, new, changed, removed, or error, with webhook and email notification options.

# create from flags
firecrawl monitor create --name "Blog" --schedule "every 30 minutes" \
  --scrape-urls https://example.com/blog --email alerts@example.com

# or from JSON (positional file, or piped stdin)
firecrawl monitor create monitor.json
cat monitor.json | firecrawl monitor create

firecrawl monitor list --limit 20
firecrawl monitor run <monitorId>             # trigger a check now
firecrawl monitor checks <monitorId>          # list checks
firecrawl monitor check <monitorId> <checkId> --page-status changed
firecrawl monitor update <monitorId> --state paused
firecrawl monitor delete <monitorId>

Schedules accept cron (--cron "*/30 * * * *") or natural language (--schedule "every 30 minutes"). The minimum interval is 15 minutes. Targets are either --scrape-urls a,b,c (scrape each listed URL) or --crawl-url <url> (crawl the whole site on each check). Note: --state (not --status) sets active/paused, and --page-status (not --status) filters page results on check; both names avoid a collision with the global --status flag. Monitoring is not available for zero-data-retention teams.
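
The 15-minute minimum can be checked before a monitor is ever submitted. The sketch below only prints the command it would run. It assumes that the natural-language form generalizes from "every 30 minutes" to "every N minutes" for other values of N, which the text above does not confirm; monitor_cmd_every is a hypothetical helper name.

```shell
# Hypothetical helper (not part of firecrawl): print, but do not run, a
# monitor-create command for an interval given in whole minutes.
# Assumption: "every N minutes" is accepted for any N >= 15.
monitor_cmd_every() {
  name=$1
  minutes=$2
  url=$3
  if [ "$minutes" -lt 15 ]; then
    echo "interval must be at least 15 minutes" >&2
    return 1
  fi
  printf 'firecrawl monitor create --name "%s" --schedule "every %s minutes" --scrape-urls %s\n' \
    "$name" "$minutes" "$url"
}

# Usage:
#   monitor_cmd_every "Blog" 30 https://example.com/blog
```

Printing rather than executing keeps the check side-effect free; review the line, then paste it into a shell to create the monitor.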

JSON-mode change tracking: By default monitors diff each page's markdown and you get a unified text diff back. When you care about specific structured fields (price, headline, in-stock flag, items in a list) instead of the whole page, add a changeTracking format with modes: ["json"] and a JSON schema to the target's scrapeOptions.formats. The flag-based form doesn't cover this — pass a JSON body via file or stdin:

cat > pricing-monitor.json <<'EOF'
{
  "name": "Pricing watch",
  "schedule": { "text": "hourly", "timezone": "UTC" },
  "targets": [{
    "type": "scrape",
    "urls": ["https://example.com/pricing"],
    "scrapeOptions": {
      "formats": [{
        "type": "changeTracking",
        "modes": ["json"],
        "prompt": "Extract pricing tiers and headline features for each plan.",
        "schema": {
          "type": "object",
          "properties": {
            "plans": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name":     { "type": "string" },
                  "price":    { "type": "string" },
                  "features": { "type": "array", "items": { "type": "string" } }
                }
              }
            }
          }
        }
      }]
    }
  }]
}
EOF
firecrawl monitor create pricing-monitor.json

The check response then carries a per-field diff (paths like plans[0].price) and the full extraction from this run, instead of (or in addition to) a markdown diff. Each changed page in pages[] looks like:

{
  "url": "https://example.com/pricing",
  "status": "changed",
  "diff": {
    "json": {
      "plans[0].price": { "previous": "$19/mo", "current": "$24/mo" },
      "plans[1].features[2]": {
        "previous": "10 GB storage",
        "current": "25 GB storage"
      }
    }
  },
  "snapshot": {
    "json": {
      "plans": [
        /* current full extraction */
      ]
    }
  }
}
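
A per-field diff like the one above is easy to flatten into one line per change. This is a rough sketch, assuming jq is installed and that a single JSON-mode page object shaped like the example has already been saved or piped in as valid JSON; changed_fields is a hypothetical helper name, and how the check response reaches stdin is left open here.

```shell
# Hypothetical helper (not part of firecrawl): print one
# "path: previous -> current" line per changed field.
# Reads a single page object, shaped like the example above, on stdin.
# Requires jq. Assumes diff.json maps field paths to {previous, current}.
changed_fields() {
  jq -r '.diff.json
         | to_entries[]
         | "\(.key): \(.value.previous) -> \(.value.current)"'
}

# Usage, with page.json holding one entry from pages[]:
#   changed_fields < page.json
```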

Use modes: ["json", "git-diff"] for mixed mode: you get both diff.json (per-field) and diff.text (markdown sidecar), and the page is marked changed whenever either surface changed. For markdown-only monitors, diff.text holds the unified diff and diff.json is a parse-diff AST ({ files: [...] }); there is no snapshot.

Avoid redundant fetches:

search --scrape already fetches full page content. Don't re-scrape those URLs.
Check .firecrawl/ for existing data before fetching again.
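
The second rule can be sketched as a small wrapper that skips the fetch when the output file already exists. scrape_once is a hypothetical helper name; the only CLI call it makes is the scrape form with -o used earlier in this document, and treating any non-empty file as fresh is a simplifying assumption.

```shell
# Hypothetical helper (not part of firecrawl): scrape a URL only when the
# output file is missing or empty.
# Assumption: any non-empty file at the path counts as up to date.
scrape_once() {
  url=$1
  out=$2
  if [ -s "$out" ]; then
    echo "cached: $out"
    return 0
  fi
  mkdir -p "$(dirname "$out")"
  firecrawl scrape "$url" -o "$out"
}

# Usage:
#   scrape_once "https://firecrawl.dev" .firecrawl/firecrawl-dev.md
```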

When to Load References

Searching the web or finding sources first -> firecrawl-search
Scraping a known URL -> firecrawl-scrape
Finding URLs on a known site -> firecrawl-map
**Bulk extraction from a d
