find-ai-consultancy
Drive the ServiceGraph API (https://api.servicegraph.co) to find,
shortlist, and enrich US AI/ML and data consultancies via the
pro_services dataset. The catalog tags firms with
industry:data_ai_consulting and a 4-tag service sub-taxonomy:
ai-ml-development (the largest at ~12k firms), data-analytics,
cloud-services, and api-integration. There is no
data-engineering or business-intelligence sub-tag —
data-analytics covers both. Confirm exact tag names via
/v1/datasets/pro_services/fields?include_values=1.
Always pin industry:data_ai_consulting. This skill exists to
do that automatically — the user shouldn't have to think about catalog
taxonomy.
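One way to make the pin automatic is a small wrapper applied to every filter before it is sent. The sketch below assumes the harness scripts its calls in Python; `pin_industry` is a hypothetical helper, not part of the ServiceGraph API. It parenthesizes the caller's filter because AND binds tighter than OR in the filter DSL, so an unparenthesized OR would escape the pin.

```python
INDUSTRY_PIN = "industry:data_ai_consulting"


def pin_industry(user_filter: str = "") -> str:
    """Prefix the industry pin and parenthesize the rest of the filter.

    Hypothetical helper: the parentheses keep a top-level OR in the
    caller's filter from escaping the pin.
    """
    user_filter = user_filter.strip()
    if not user_filter:
        return INDUSTRY_PIN
    return f"{INDUSTRY_PIN} ({user_filter})"


if __name__ == "__main__":
    print(pin_industry("rag OR llm"))
```

A redundant pin inside the parentheses is harmless under AND, so the helper does not try to detect whether the caller already pinned the industry.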
Any HTTP client works (curl, fetch, requests). Examples below use curl.
Sibling skills — defer when scope is different
- General application or backend dev that uses AI as a feature
  (e.g. "build us a SaaS with an AI chatbot tab") →
  find-software-developer.
- Web/site projects that include some AI → find-web-developer.
- AI-related marketing or content → find-marketing-agency.
This skill is for engagements where the AI/ML/data work IS the deliverable.
When NOT to use this skill
- Consumer AI courses or learning ("find me an online course to learn ML").
- AI/LLM product comparisons ("ChatGPT vs Claude vs Gemini", "Cursor vs Copilot").
- DIY/code tasks ("how do I fine-tune Llama", "review this PyTorch loop").
- In-house ML/data hires (Machine Learning Engineer, Data Scientist).
- Generic AI knowledge questions.
- Non-US firms / individual freelance ML engineers.
MCP server (preferred for authed calls)
If your harness has the ServiceGraph MCP server loaded (tools
containing servicegraph), prefer those — OAuth 2.1 + PKCE keeps the
token in the harness sandbox. Otherwise use the REST flow below.
API surface (dataset id: pro_services)
Every endpoint requires a bearer token (Authorization: Bearer vk_…).
There is no anonymous tier.
| Endpoint | Cost | Use it for |
|---|---|---|
| GET /v1/datasets/pro_services/fields[?include_values=1] | free | Confirm data_ai_consulting industry value and sub-tag names. |
| GET /v1/datasets/pro_services/check?filter=… | free | Validate filter. |
| POST /v1/datasets/pro_services/translate-intent | free | {intent} → DSL filter + sanity count. |
| GET /v1/datasets/pro_services/search?filter=…&limit= | free | Brief firm cards + per-row unlock hint + total. |
| GET /v1/datasets/pro_services/:apex | free | One row brief; detail only if unlocked. |
| POST /v1/datasets/pro_services/unlocks | 10 credits / firm | {apexes:[...]} ≤100; atomic; 30-day TTL on detail. |
| GET /v1/me/credits | free | Balance. |
Cost model. Discovery / validation / search / brief reads are
free. Detail (url, phone, email, social, address, full platforms
map) costs 10 credits per firm and lasts 30 days.
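The cost arithmetic is simple enough to check before every unlock. The sketch below uses the two numbers stated above (10 credits per firm, at most 100 apexes per request); the function names are hypothetical, and the balance would come from GET /v1/me/credits, whose response shape is not assumed here.

```python
CREDITS_PER_FIRM = 10        # from the cost model above
MAX_APEXES_PER_UNLOCK = 100  # per-request cap from the endpoint table


def unlock_cost(n_firms: int) -> int:
    """Credits needed to unlock detail for n_firms firms."""
    return n_firms * CREDITS_PER_FIRM


def unlock_batches(apexes):
    """Split a list of apexes into request-sized chunks of at most 100."""
    return [
        apexes[i:i + MAX_APEXES_PER_UNLOCK]
        for i in range(0, len(apexes), MAX_APEXES_PER_UNLOCK)
    ]


def can_afford(n_firms: int, balance: int) -> bool:
    """True if the balance covers unlocking n_firms firms."""
    return unlock_cost(n_firms) <= balance


if __name__ == "__main__":
    print(unlock_cost(3))  # 30
```

If a shortlist exceeds 100 firms and is split into batches, the "atomic" guarantee presumably applies to each request separately, so it is worth checking the balance against the full total first.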
Auth
API keys (vk_*) are minted in the dashboard. Keep the token out of
the LLM context: never read .env* into your context; dispatch via
shell.
- Try the call first through a shell wrapper that sources .env.local:

    ( set -a; [ -f .env.local ] && . ./.env.local; set +a;
      curl -sS -H "Authorization: Bearer $SERVICEGRAPH_API_KEY" \
        'https://api.servicegraph.co/v1/datasets/pro_services/fields' )

- On 401, prompt the user (don't accept the key in chat): "Open
  https://servicegraph.co/profile/api-keys, create a key, and add
  SERVICEGRAPH_API_KEY=vk_… to .env.local here (or export it). Tell me
  when done. Please don't paste the key into chat."
- Retry after the user signals ready.
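For harnesses that dispatch a Python script from the shell instead of curl, the same flow can be sketched as follows. This is an illustration under assumptions: `auth_headers`, `MissingKeyError`, and `redact` are hypothetical names, and the script is expected to run as a subprocess with SERVICEGRAPH_API_KEY already exported, so the key never enters the LLM context.

```python
import os

KEY_ENV_VAR = "SERVICEGRAPH_API_KEY"

SETUP_PROMPT = (
    "Open https://servicegraph.co/profile/api-keys, create a key, and add "
    "SERVICEGRAPH_API_KEY=vk_… to .env.local here (or export it). "
    "Tell me when done. Please don't paste the key into chat."
)


class MissingKeyError(RuntimeError):
    """Raised when no usable key is present in the environment."""


def auth_headers(environ=None):
    """Build the Authorization header from the environment only."""
    env = os.environ if environ is None else environ
    key = env.get(KEY_ENV_VAR, "")
    if not key.startswith("vk_"):
        # Surface the setup prompt instead of asking for the key in chat.
        raise MissingKeyError(SETUP_PROMPT)
    return {"Authorization": f"Bearer {key}"}


def redact(headers):
    """Copy of headers that is safe to log: the bearer value is masked."""
    return {
        name: ("Bearer vk_***" if name == "Authorization" else value)
        for name, value in headers.items()
    }
```

Logging only `redact(headers)` keeps the token out of any output the model might later read.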
Filter DSL
The filter syntax is GitHub-search-style:
filter := orExpr
orExpr := andExpr ("OR" andExpr)*
andExpr := notExpr (("AND")? notExpr)* # whitespace = implicit AND
notExpr := ("NOT" | "-") notExpr | atom
atom := "(" filter ")" | predicate
predicate:= IDENT op valueOrList | bareword
op := ":" | "=" | ">=" | "<=" | ">" | "<"
valueOrList := value ("," value)*
value := IDENT | NUMBER | tagAtEvidence
tagAtEvidence := IDENT "@" ("low"|"medium"|"high")
bareword := IDENT | NUMBER # → keyword:<bareword>
Four rules that bite: AND binds tighter than OR (use parens);
comma list = OR within one predicate; negation is -x or NOT x;
bareword = keyword search (quote multi-word phrases).
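To see the precedence rule in action, the grammar above can be turned into a small recursive-descent parser. This is a simplified sketch, not the server's parser: it treats each whitespace-delimited predicate or quoted phrase as an opaque atom and only models OR, AND (explicit or implicit), NOT, and parentheses. For the authoritative answer, use /check.

```python
import re

# Tokens: parentheses, double-quoted phrases, or runs of non-space characters.
_TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')


def parse(filter_text):
    """Parse into nested tuples: ("OR", a, b), ("AND", a, b), ("NOT", a)."""
    tokens = _TOKEN.findall(filter_text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of filter")
        tok = tokens[pos]
        pos += 1
        return tok

    def or_expr():
        node = and_expr()
        while peek() == "OR":
            take()
            node = ("OR", node, and_expr())
        return node

    def and_expr():
        node = not_expr()
        # Anything that is not OR, ")" or end-of-input continues the AND chain.
        while peek() not in (None, ")", "OR"):
            if peek() == "AND":
                take()
            node = ("AND", node, not_expr())
        return node

    def not_expr():
        tok = peek()
        if tok in ("NOT", "-"):
            take()
            return ("NOT", not_expr())
        if tok is not None and len(tok) > 1 and tok.startswith("-"):
            take()
            return ("NOT", tok[1:])
        return atom()

    def atom():
        tok = take()
        if tok == "(":
            node = or_expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok in (")", "OR", "AND"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok

    tree = or_expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


if __name__ == "__main__":
    print(parse("industry:data_ai_consulting rag OR llm"))
    print(parse("industry:data_ai_consulting (rag OR llm)"))
```

The first filter parses as (pin AND rag) OR llm, so the llm branch is no longer restricted to the pinned industry; the parenthesized form keeps the pin on both branches.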
AI-flavored examples (validate yours with /check):
industry:data_ai_consulting service_provided:ai-ml-development
industry:data_ai_consulting service_provided:ai-ml-development@high state:CA
industry:data_ai_consulting service_provided:data-analytics pipelines
industry:data_ai_consulting llm rag
industry:data_ai_consulting "computer vision" healthcare
industry:data_ai_consulting mlops
industry:data_ai_consulting (service_provided:ai-ml-development OR service_provided:data-analytics)
industry:data_ai_consulting service_provided:ai-ml-development@high rating>=4 has:clutch
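Filters like the ones above can be assembled from parts instead of concatenated by hand. The helpers below are hypothetical and follow only what the grammar states: a comma list is OR within one predicate, an evidence level is appended as @low, @medium, or @high, and multi-word keywords are quoted. Validate the result with /check before searching.

```python
EVIDENCE_LEVELS = ("low", "medium", "high")


def tag_predicate(field, values, evidence=None):
    """Build field:v1,v2 and optionally append @evidence to each value."""
    if evidence is not None and evidence not in EVIDENCE_LEVELS:
        raise ValueError(f"evidence must be one of {EVIDENCE_LEVELS}")
    suffix = f"@{evidence}" if evidence else ""
    return f"{field}:" + ",".join(f"{value}{suffix}" for value in values)


def keyword(phrase):
    """Quote multi-word phrases so they stay a single keyword."""
    phrase = phrase.strip()
    return f'"{phrase}"' if " " in phrase else phrase


def build_filter(*parts):
    """Join parts with implicit AND, always leading with the industry pin."""
    return " ".join(["industry:data_ai_consulting", *parts])


if __name__ == "__main__":
    print(build_filter(
        tag_predicate("service_provided", ["ai-ml-development"], "high"),
        "state:CA",
    ))
    print(build_filter(keyword("computer vision"), keyword("healthcare")))
```

Parts that contain a top-level OR should be parenthesized by the caller before being passed to `build_filter`.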
Sub-niche → keyword/tag mapping:
| User asks for | Use |
|---|---|
| AI/ML model building | service_provided:ai-ml-development |
| Data engineering / pipelines | service_provided:data-analytics + keywords pipelines / engineering (no data-engineering tag) |
| BI / analytics | service_provided:data-analytics (covers BI too — no separate business-intelligence tag) |
| Cloud architecture for data/ML | service_provided:cloud-services |
| API / data integration | service_provided:api-integration |
| LLM apps / RAG / agents | llm, rag, agent (keywords) |
| Generative AI | "generative ai", genai |
| Computer vision | "computer vision", cv |
| NLP / IDP / document understanding | nlp, idp, "document understanding" |
| MLOps / model deployment | mlops, deployment |
| Recommendation systems | recommendation, recsys |
| Predictive analytics / churn / forecasting | predictive, forecasting, churn |
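The mapping table lends itself to a plain lookup. The sketch below encodes a few rows as a dictionary; the keys and the `fragment_for` helper are hypothetical, and combining a row's keywords with OR is a judgment call (the table lists the keywords without saying how to combine them).

```python
# A few rows of the table above; keys are normalized, illustrative labels.
SUB_NICHE_FRAGMENTS = {
    "ai/ml model building": "service_provided:ai-ml-development",
    # No data-engineering tag exists, so fall back to data-analytics + keywords.
    "data engineering": "service_provided:data-analytics (pipelines OR engineering)",
    "bi": "service_provided:data-analytics",
    "cloud architecture": "service_provided:cloud-services",
    "api integration": "service_provided:api-integration",
    "llm apps": "(llm OR rag OR agent)",
    "computer vision": '("computer vision" OR cv)',
    "mlops": "(mlops OR deployment)",
}


def fragment_for(sub_niche):
    """Return the filter fragment for a known sub-niche, else None."""
    return SUB_NICHE_FRAGMENTS.get(sub_niche.strip().lower())


if __name__ == "__main__":
    print(fragment_for("Data engineering"))
```

When the lookup returns None, the translate-intent endpoint is the natural fallback.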
Identifying firms — apex
Firms are identified by their apex domain (scaleai.com, not
www.scaleai.com/about).
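When a user pastes a full URL, it needs reducing to a bare host before it can be used as an identifier. The sketch below is a rough approximation: `to_apex` is a hypothetical helper that lowercases the host and strips the scheme, path, and a leading www. A true apex extraction would need the Public Suffix List to handle other subdomains, which this does not attempt.

```python
from urllib.parse import urlsplit


def to_apex(raw):
    """Reduce a URL or host string to a bare lowercase host without 'www.'."""
    raw = raw.strip()
    if "://" not in raw:
        # urlsplit only recognizes a host when the string starts with '//'.
        raw = "//" + raw
    host = urlsplit(raw).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host


if __name__ == "__main__":
    print(to_apex("https://www.scaleai.com/about"))  # scaleai.com
```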
Recipes
A. AI/ML consultancy for a recommendation engine
User: "AI/ML consultancy to build our recommendation engine for an ecommerce site."
GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+service_provided:ai-ml-development+recommendation+ecommerce&limit=10
# Present the results and let the user pick 3. Before unlocking, confirm the cost: "Unlocking 3 = 30 credits, 30-day TTL."
POST /v1/datasets/pro_services/unlocks
{ "apexes": ["firm-a.com", "firm-b.com", "firm-c.com"] }
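In a script, the search URL and unlock body for this recipe are best built with proper encoding, since filters contain spaces, colons, quotes, and parentheses. The sketch below only constructs the request pieces and sends nothing; `search_url` and `unlock_body` are hypothetical helpers.

```python
import json
from urllib.parse import urlencode

BASE = "https://api.servicegraph.co/v1/datasets/pro_services"


def search_url(filter_text, limit=10):
    """Build the search URL with the filter safely percent-encoded."""
    return f"{BASE}/search?" + urlencode({"filter": filter_text, "limit": limit})


def unlock_body(apexes):
    """JSON body for POST /unlocks; the endpoint accepts at most 100 apexes."""
    apexes = list(apexes)
    if not 1 <= len(apexes) <= 100:
        raise ValueError("unlock between 1 and 100 apexes per request")
    return json.dumps({"apexes": apexes})


if __name__ == "__main__":
    print(search_url(
        "industry:data_ai_consulting service_provided:ai-ml-development "
        "recommendation ecommerce"
    ))
    print(unlock_body(["firm-a.com", "firm-b.com", "firm-c.com"]))
```

`urlencode` writes spaces as `+`, matching the convention used in the request lines of these recipes.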
B. RAG / LLM consultancies for a chatbot
User: "Three RAG/LLM consultancies for an enterprise chatbot."
GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+(rag+OR+llm)+chatbot+enterprise&limit=10
If the results are thin, drop the enterprise keyword and surface
client-tier signals from the unlocked detail later.
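The "drop a keyword if thin" step can be sketched as a ladder of filters tried from strictest to loosest. In this illustration `first_adequate` is a hypothetical helper and `count` is any callable that returns the total for a filter (for example, read from the search response); the threshold of 3 mirrors the user's request for three firms.

```python
def first_adequate(filters, count, minimum=3):
    """Return the first filter whose result total reaches minimum, else None."""
    for candidate in filters:
        if count(candidate) >= minimum:
            return candidate
    return None


PIN = "industry:data_ai_consulting"
LADDER = [
    f"{PIN} (rag OR llm) chatbot enterprise",  # strictest
    f"{PIN} (rag OR llm) chatbot",             # enterprise dropped
]

if __name__ == "__main__":
    # Made-up totals for illustration only; real totals come from the API.
    fake_totals = {LADDER[0]: 1, LADDER[1]: 14}
    print(first_adequate(LADDER, fake_totals.get))
```

Because search is free, walking the ladder costs no credits; only the final unlock does.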
C. Data engineering partner
User: "Data-engineering partner to build our analytics pipelines."
No data-engineering tag — data-analytics is the closest and
covers both BI and engineering. Pin the tag plus keyword:
GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+service_provided:data-analytics+(pipelines+OR+engineering)&limit=10
D. MLOps for model deployment
GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+mlops&limit=10
E. Indirect intent — "use AI to predict customer churn"
User: "We want to use AI to predict customer churn — who can help us build that?"
GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+service_provided:ai-ml-development+(churn+OR+predictive)&limit=10
Or let the translator do the mapping:
POST /v1/datasets/pro_services/translate-intent
{ "intent": "AI consultancy to build customer churn prediction" }
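The translate-intent call can be prepared with the standard library as shown below. The sketch builds the request object without sending it. The Content-Type header is an assumption (the source does not state it), `translate_intent_request` is a hypothetical helper, and the response fields are deliberately not modeled because their names are not documented here.

```python
import json
import urllib.request

TRANSLATE_URL = (
    "https://api.servicegraph.co/v1/datasets/pro_services/translate-intent"
)


def translate_intent_request(intent, api_key):
    """Prepare (but do not send) the POST for the translate-intent endpoint."""
    return urllib.request.Request(
        TRANSLATE_URL,
        data=json.dumps({"intent": intent}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, not confirmed
        },
        method="POST",
    )


if __name__ == "__main__":
    req = translate_intent_request(
        "AI consultancy to build customer churn prediction", "vk_placeholder"
    )
    print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; whatever filter comes back should still carry the industry pin before it is used for search.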
F. Computer vision + healthcare vertical
GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+"com