find-service-providers
Drive the ServiceGraph API (https://api.servicegraph.co) to find,
shortlist, and enrich US professional-services firms.
The API hosts several datasets behind a uniform per-dataset URL
shape (/v1/datasets/:id/…). This skill is for the agencies dataset —
dataset id pro_services — which holds 100k+ B2B service firms
classified across 22 industries with multi-tag service taxonomies,
location, size, and third-party rating signals.
Any HTTP client works (curl, fetch, requests). Examples below use curl.
MCP server (preferred for authed calls)
If your agent harness has the ServiceGraph MCP server loaded
(https://mcp.servicegraph.co) — recognizable by tool names
containing servicegraph — prefer those tools over raw HTTP. The
MCP server uses OAuth 2.1 + PKCE so the harness handles credentials
in its own sandbox and no token value ever enters the LLM context.
Otherwise, fall through to the REST flow described below.
API surface
Every endpoint requires a bearer token (Authorization: Bearer vk_…).
There is no anonymous tier.
| Endpoint | Cost | Use it for |
|---|---|---|
| GET /v1/datasets | free | Discover available datasets. |
| GET /v1/datasets/pro_services | free | Full schema for this dataset (brief vs detail fields, allowed filters, unlock price, TTL). |
| GET /v1/datasets/pro_services/fields[?include_values=1&q=] | free | Filter-field catalog + DSL grammar. Call this first per session. |
| GET /v1/datasets/pro_services/values/:field[?q=&limit=&offset=] | free | Enumerate values for one field (e.g. legal industry / state / service_provided values). |
| GET /v1/datasets/pro_services/check?filter=… | free | Validate a filter string. Returns {valid, normalized} or {valid:false, error}. |
| POST /v1/datasets/pro_services/translate-intent | free | Body {intent, model?}. LLM-translates plain English → DSL filter + sanity-check row count. |
| GET /v1/datasets/pro_services/search?filter=…&limit=&offset= | free | Brief firm cards + per-row unlock hint. No url, no phone, no email. |
| GET /v1/datasets/pro_services/:apex | free | One row: always brief; detail block only if caller has an active unlock for (user, dataset, apex). Idempotent, never charges. |
| POST /v1/datasets/pro_services/unlocks | 10 credits / firm | Body {apexes: [...]}, max 100. Atomic batch: either all uncached apexes unlock, or none do (402 if balance short). Already-unlocked rows return was_cached:true with no extra charge. Detail TTL: 30 days. Returns brief + detail + per-item billing. |
| GET /v1/me/credits | free | Current credit balance. |
| GET /v1/me/credits/transactions[?limit=&offset=] | free | Spend history; unlock rows carry (dataset_id, apex, expires_at). |
Cost model in one paragraph. Discovery, validation, search, and
brief reads are free. Detail data (apex, full url, phone, email,
social, address, legal name, platforms map) costs 10 credits per
firm and lasts 30 days. Re-fetching an unlocked firm within the
TTL is free — both the detail GET and the unlock POST honor the
cache. Charges are atomic per POST /unlocks call: a 402 leaves
balance untouched.
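
As a worked example of the arithmetic above, the sketch below estimates what a shortlist will cost before any unlock is sent. The function name unlock_plan is made up for this illustration; the 10-credit price and the 100-apex cap per call come from the endpoint table.

```shell
# unlock_plan TOTAL [CACHED]: credits and POST /unlocks calls for a shortlist.
# TOTAL  = firms on the shortlist
# CACHED = firms already unlocked within the 30-day TTL (default 0)
unlock_plan() (
  total="$1"
  cached="${2:-0}"
  uncached=$(( total - cached ))
  credits=$(( uncached * 10 ))       # 10 credits per newly unlocked firm
  calls=$(( (total + 99) / 100 ))    # at most 100 apexes per POST /unlocks
  printf 'uncached=%d credits=%d calls=%d\n' "$uncached" "$credits" "$calls"
)

unlock_plan 250 40
# uncached=210 credits=2100 calls=3
```

Because atomicity is stated per POST /unlocks call, a shortlist that spans several calls is presumably not atomic as a whole. Comparing the estimate against GET /v1/me/credits before the first batch avoids stopping halfway.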
Auth
Tokens are vk_* API keys minted in the dashboard. The user creates
them themselves; this skill never sees raw email/password.
Security model — keep the token out of the LLM context.
- Never read .env, .env.local, or any other credential file into your context. The token's literal value must never appear in the conversation.
- Every authed call goes through a shell wrapper so the token flows from the user's environment / dotenv file into the Authorization header without round-tripping through the LLM.
First-call resolution:
- Just try the call. Dispatch via shell, sourcing .env.local if present:

      ( set -a; [ -f .env.local ] && . ./.env.local; set +a;
        curl -sS -H "Authorization: Bearer $SERVICEGRAPH_API_KEY" \
          'https://api.servicegraph.co/v1/datasets/pro_services/fields' )

- On 401 unauthorized, prompt the user (don't accept the key in chat):

  "I need a ServiceGraph API key. Open https://servicegraph.co/profile/api-keys, sign in, click Create key, and copy the vk_… value.

  Then either export it in your shell (export SERVICEGRAPH_API_KEY=vk_…) or add the line SERVICEGRAPH_API_KEY=vk_… to .env.local in this directory. Tell me when done and I'll retry. Please don't paste the key into chat; keep it out of the LLM context."

- After the user signals ready, re-dispatch the same call. If a later call returns 401, the key was revoked or rotated, so re-prompt.
For the user's convenience: if SERVICEGRAPH_API_KEY is already set
or already in .env.local, the very first call will succeed and the
prompt step never happens.
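
The first-call flow above can be wrapped once and reused. This is a minimal sketch, not an official client: sg_curl, sg_status, and SG_BASE are names invented here. The key is expanded only inside the subshell, so it never has to be printed or pasted.

```shell
SG_BASE='https://api.servicegraph.co'   # assumed name for this sketch

# sg_curl PATH [extra curl args...]: authed request to PATH.
# Runs in a subshell so sourced variables do not leak into the caller.
sg_curl() (
  path="$1"; shift
  set -a                                 # export whatever the dotenv file defines
  [ -f .env.local ] && . ./.env.local
  set +a
  curl -sS -H "Authorization: Bearer ${SERVICEGRAPH_API_KEY:-}" "$@" "${SG_BASE}${path}"
)

# sg_status PATH: print only the HTTP status code, discarding the body.
sg_status() {
  sg_curl "$1" -o /dev/null -w '%{http_code}'
}
```

A caller can then branch on the status without touching the key, for example `[ "$(sg_status /v1/me/credits)" = 401 ]` before showing the prompt from the second step.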
Filter DSL
One query parameter, GitHub-search-style.
filter := orExpr
orExpr := andExpr ("OR" andExpr)*
andExpr := notExpr (("AND")? notExpr)* # whitespace = implicit AND
notExpr := ("NOT" | "-") notExpr | atom
atom := "(" filter ")" | predicate
predicate:= IDENT op valueOrList | bareword
op := ":" | "=" | ">=" | "<=" | ">" | "<"
valueOrList := value ("," value)*
value := IDENT | NUMBER | tagAtEvidence
tagAtEvidence := IDENT "@" ("low"|"medium"|"high")
bareword := IDENT | NUMBER # → keyword:<bareword>
Four rules that bite:
- AND binds tighter than OR. a OR b c parses as a OR (b AND c). Use parens.
- Comma list = OR within one predicate. state:CA,NY,TX matches any of the three.
- Negation is -x or NOT x. Negative literals inside a comma list are not allowed: state:CA,-NY is rejected. Use state:CA -state:NY.
- Bareword = keyword search. Any IDENT or NUMBER not followed by an operator becomes a free-text substring across name / brand / title / meta / legal_name. Multiple barewords AND. Wrap multi-word phrases in double quotes: keyword:"foo bar". Punctuation (& ' . ; ! ? * / etc.) is silently dropped outside quotes, and stray commas are treated as ANDs, so paste-friendly inputs like Cox, Castle & Nicholson work without quoting.
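
The negation rule is the one most often tripped by user-supplied lists such as "CA but not NY". The helper below is a sketch (split_negatives is a hypothetical name) that rewrites a mixed list into the accepted form: one positive predicate plus one negated predicate. It relies only on the grammar above, where -field:a,b negates a whole comma list; confirm the result with /check.

```shell
# split_negatives FIELD VALUES: move "-value" entries out of a comma list.
# Subshell body keeps the IFS and noglob changes local to the function.
split_negatives() (
  field="$1"
  pos=""; neg=""
  set -f                                   # no filename expansion while splitting
  IFS=','
  for v in $2; do
    case "$v" in
      -*) neg="${neg:+$neg,}${v#-}" ;;     # strip the leading "-"
      *)  pos="${pos:+$pos,}$v" ;;
    esac
  done
  out=""
  [ -n "$pos" ] && out="${field}:${pos}"
  [ -n "$neg" ] && out="${out:+$out }-${field}:${neg}"
  printf '%s\n' "$out"
)

split_negatives state 'CA,-NY'
# state:CA -state:NY
```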
Field kinds you'll use most:
- categorical: industry, state, service_model, geography_served, company_size_signal, pricing_model. Only the : operator applies.
- tag_set_with_evidence: service_provided, a Map<tag, evidence∈{low,medium,high}>. Uses the : operator with an optional @evidence suffix.
- numeric: rating, review_count_total, founded_year, linkedin_employees, etc. Operators =, >=, <=, >, <.
- presence: has:phone, has:clutch, has:rating, has:linkedin_company, etc.
- keyword: any bareword in the filter becomes a free-text substring across name / brand / title / meta / legal_name / linkedin company text.
Examples (validate yours with /check):
industry:marketing_agency service_provided:seo
dental industry:marketing_agency
industry:legal state:CA,NY -company_size_signal:solo
industry:management_consulting (service_provided:strategy-consulting@high OR service_provided:operations-consulting@high)
state:CA has:phone has:email
rating>=4 review_count_total>=20 has:clutch
industry:it_services NOT (service_provided:web-development OR service_provided:hosting)
"Cox, Castle & Nicholson"
Don't put kind: in the filter — the dataset URL is authoritative and
the API will reject it. Don't use fields outside this dataset's allowed
list either; /check will tell you which ones.
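
Filters contain spaces, quotes, and comparison operators, so they should be URL-encoded on the way into the query string. The sketch below lets curl do that with -G and --data-urlencode (both standard curl options). The helper names are invented here; the endpoints and parameter names are the ones in the API surface table.

```shell
SG_DATASET_URL='https://api.servicegraph.co/v1/datasets/pro_services'

# _sg_get_q ENDPOINT [curl args...]: authed GET with query parameters.
# The key comes from the environment or ./.env.local and is never echoed.
_sg_get_q() (
  endpoint="$1"; shift
  set -a; [ -f .env.local ] && . ./.env.local; set +a
  curl -sS -G -H "Authorization: Bearer ${SERVICEGRAPH_API_KEY:-}" \
    "$@" "${SG_DATASET_URL}/${endpoint}"
)

# sg_check FILTER: validate a filter string (free).
sg_check() { _sg_get_q check --data-urlencode "filter=$1"; }

# sg_search FILTER [LIMIT] [OFFSET]: brief firm cards (free).
# limit and offset are sent only when given, so server defaults still apply.
sg_search() (
  filter="$1"; limit="${2:-}"; offset="${3:-}"
  set -- --data-urlencode "filter=$filter"
  [ -n "$limit" ]  && set -- "$@" --data-urlencode "limit=$limit"
  [ -n "$offset" ] && set -- "$@" --data-urlencode "offset=$offset"
  _sg_get_q search "$@"
)
```

Typical use is `sg_check 'rating>=4 review_count_total>=20 has:clutch'` first, then `sg_search` with the same string once /check reports it valid.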
Identifying firms — apex
Firms are identified by their apex domain (registered domain only:
mckinsey.com, not www.mckinsey.com/about). When the user gives you
URLs, strip to the apex before calling /datasets/pro_services/:apex
or POST /unlocks. The endpoint accepts any lowercase host-shaped
string; a 404 means the firm isn't in this dataset (no charge).
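
A rough sketch of the URL-to-apex step, under a stated limitation: to_host only lowercases and strips the scheme, userinfo, port, path, and a leading www. It does not reduce other subdomains (blog.example.com stays as is), which would need the Public Suffix List. Both function names are hypothetical; the {apexes: [...]} body shape is from the endpoint table.

```shell
# to_host URL: lowercase host with scheme, path, userinfo, port, and "www." removed.
to_host() {
  printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' |
    sed -e 's|^[a-z][a-z0-9+.-]*://||' \
        -e 's|[/?#].*$||' \
        -e 's|^.*@||' \
        -e 's|:[0-9]*$||' \
        -e 's|^www\.||'
}

# unlock_body: read one host per line on stdin, dedupe, and emit the JSON body
# for POST /unlocks. No escaping is done; input is assumed to be host-shaped.
# The caller is responsible for keeping each batch at 100 apexes or fewer.
unlock_body() {
  sort -u | awk 'BEGIN { printf "{\"apexes\":[" }
                 NF    { printf "%s\"%s\"", (n++ ? "," : ""), $0 }
                 END   { print "]}" }'
}

to_host 'https://www.McKinsey.com/about'
# mckinsey.com
```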