E-commerce — Product Detail
Product URL / keyword / SKU → complete product data (name, price, brand, images, identifiers, availability, rating, variants)
Language
All process output to user (progress updates, process notifications) follows the user's language.
Objective
Extract complete product information from any publicly accessible e-commerce product page using a universal multi-layer extraction strategy (JSON-LD → platform-specific DOM → OG meta → microdata).
Prerequisites
- Target browser is open and connected
- No login required for public product pages
Pre-execution Checks
1. Tool Readiness
If browser-act has been confirmed available in the current session → skip this step.
Invoke browser-act via Skill tool to load usage. If installation or configuration issues arise, follow its guidance to resolve then retry.
Capability Components
This Skill's operational boundary = what the user can manually do in their browser. It only reads data already displayed to the user on the page, never bypassing authentication or access controls. JS code is encapsulated in Python files under the
scripts/directory, invoked viaeval "$(python scripts/xxx.py {params})". Use the bash tool for execution.
DOM: Extract product data from current product page
Navigate to the product URL first, then extract:
eval "$(python scripts/extract-product.py)"
Output example:
{
"url": "https://www.amazon.com/dp/B09WNK39JN",
"name": "Amazon Echo Pop",
"price": 39.99,
"price_currency": "USD",
"brand": "Amazon",
"image": "https://m.media-amazon.com/images/I/61bTwy0ooPL.jpg",
"images": ["https://...jpg", "https://...jpg"],
"description": "Compact smart speaker with Alexa...",
"category": ["Electronics", "Smart Speakers"],
"sku": "B09WNK39JN",
"gtin": null,
"mpn": null,
"availability": "InStock",
"rating": 4.7,
"review_count": 103789,
"variants": [{"name": "Charcoal", "sku": "B09WNK39JN", "price": 39.99}],
"seller": "Amazon",
"identifiers": {"ASIN": "B09WNK39JN", "Best Sellers Rank": "#1 in Smart Speakers"},
"_platform": "amazon",
"_source": "json-ld"
}
Composite: Keyword or SKU → product detail
When input is a keyword, ASIN/SKU, or EAN/UPC rather than a direct product URL:
Step 1 — Navigate to search URL based on input type:
| Input type | Target site | URL pattern |
|---|---|---|
| ASIN (10-char alphanumeric) | Amazon | https://www.amazon.com/dp/{ASIN} |
| Keyword | Amazon | https://www.amazon.com/s?k={keyword_urlencoded} |
| Keyword | eBay | https://www.ebay.com/sch/i.html?_nkw={keyword_urlencoded} |
| Keyword | Walmart | https://www.walmart.com/search?q={keyword_urlencoded} |
Keyword + --site specified | Any site | https://{site}/search?q={keyword_urlencoded} |
| Keyword (no site) | Cross-site | https://www.google.com/search?tbm=shop&q={keyword_urlencoded} |
| EAN / UPC / GTIN | Cross-site | https://www.google.com/search?tbm=shop&q={identifier} |
Step 2 — If landed on a search/listing page (multiple results):
wait stableeval "$(python scripts/extract-listing.py --max-results 3)"— get top 3 results- Pick the most relevant product URL from
items[0].url navigate {product_url}→wait stable
Step 3 — Extract product data:
eval "$(python scripts/extract-product.py)"
Note: scripts/extract-listing.py is located in ../ecommerce-listing/scripts/extract-listing.py if used as a standalone Skill install; otherwise reference the listing Skill.
Success Criteria
result.name != null AND (result.price != null OR result.availability != null)
Known Limitations
- Amazon bot detection: direct navigation to a product URL may redirect to a CAPTCHA or bot-check page on fresh sessions. Navigate from
https://www.amazon.comfirst to establish session cookies, then navigate to the product page - eBay product pages may require navigating from
https://www.ebay.comfirst; usesolve-captchaif a challenge appears - Some sites render product data entirely via client-side JavaScript; always use
wait stablebefore extracting - Price may be null for out-of-stock items or when login is required to view pricing
- Variant data completeness depends on whether the site includes full variant markup in JSON-LD
Execution Efficiency
- Batch orchestration: Write a bash script to loop through product URLs serially within a single session; add 1–2 second intervals between requests to avoid triggering anti-scraping restrictions
- Test before batch execution: Test with 1–2 URLs before running the full batch
- Error resumption: Save results item by item; on failure, resume from the breakpoint rather than starting over
Experience Notes
Path: {working-directory}/browser-act-skill-forge-memories/ecommerce-scraper-ecommerce-product-detail.memory.md
Before execution: If the file exists, read it first — it records unexpected situations encountered during past executions (e.g., a strategy has become ineffective); adjust strategy order accordingly.
After execution: If an unexpected situation is encountered (strategy became ineffective, page redesigned, anti-scraping upgraded, better path discovered), append a line:
{YYYY-MM-DD}: {what happened} → {conclusion}