People Sourcer

A real recruiter, BD person, or research lead doesn't open a CRM first. They start with a question: who specifically am I trying to reach, and why? Then they hunt — across whichever platform that tribe actually lives on — and they keep notes. This skill makes Claude work that way, end-to-end, into a spreadsheet the user can act on.

Core premise

Spam happens when you compile names without context. A list with 200 anonymous LinkedIn URLs is worse than 30 rows where each one has a real signal — this person posted last week about exactly your problem, here's how to enter their world.

So the rule is: never source from zero. Always source from signal. Find the place where the right people are already self-identifying, scrape that signal, then enrich and personalize. The personalization is what makes the difference between a useful list and noise.

When to use this skill vs. just Google

Use this skill when the deliverable is a list of named individuals with structured fields. Don't use it for:

Aggregate research ("how big is the X market") — use web search.
Finding a single specific named person — just web search + verify.
Company lists without people attached — that's account research.

When in doubt: if the user wants rows in a spreadsheet with names and an outreach angle, this is the skill.

The workflow

Six phases, preceded by a Phase 0 scratchpad setup. Each phase writes to the scratchpad so context survives long runs.

1. Intake          → pin down WHO and WHY
2. Source strategy → pick platforms + queries
3. Discovery       → iterative BrightData scraping
4. Enrichment      → per-person profile + contact pull
5. Personalization → worldbuilder commentary per row
6. Output          → multi-sheet xlsx

Skip phases that are already done. If the user hands you a list of profile URLs and just wants enrichment + commentary, jump to Phase 4.

Phase 0: Scratchpad first

Before scraping anything, create a scratchpad so you don't lose the thread mid-run.

mkdir -p /home/claude/sourcing-work/<project-slug>/raw
touch /home/claude/sourcing-work/<project-slug>/brief.md
touch /home/claude/sourcing-work/<project-slug>/candidates.jsonl

brief.md — the persona, query plan, and audience model. See references/scratchpad-template.md.
candidates.jsonl — one JSON line per candidate, appended as you find them. JSONL because you'll be writing as you scrape, and a corruption in one line doesn't kill the whole file.
raw/ — raw scrapes by source URL, named like linkedin-search-1.json, reddit-r-netsec-1.md, etc.

Why JSONL for candidates: you'll likely process 30–500 people across multiple rounds. Mid-run failures shouldn't lose progress. Append-only is the right shape.
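The append-only idea can be sketched in a few lines of Python. This is a minimal illustration, not a required implementation: the helper names append_candidate and load_candidates, and the example fields (name, source, signal), are assumptions made up for this sketch.

```python
import json
from pathlib import Path


def append_candidate(path, candidate):
    """Append one candidate dict as a single JSON line (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        # json.dumps without indent never emits a raw newline,
        # so each candidate stays on exactly one line.
        f.write(json.dumps(candidate, ensure_ascii=False) + "\n")


def load_candidates(path):
    """Return (parsed_rows, corrupted_line_count); bad lines are skipped."""
    good, bad = [], 0
    path = Path(path)
    if not path.exists():
        return good, bad
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                good.append(json.loads(line))
            except json.JSONDecodeError:
                bad += 1  # one damaged line does not invalidate the rest
    return good, bad


if __name__ == "__main__":
    # Illustrative row only; the field names are not prescribed by this skill.
    append_candidate(
        "candidates.jsonl",
        {"name": "Example Person", "source": "reddit", "signal": "posted about RAG"},
    )
    rows, corrupted = load_candidates("candidates.jsonl")
    print(len(rows), "rows,", corrupted, "corrupted lines")
```

Because every write is a single appended line, a crash mid-run costs at most the line being written, and the loader can report how many lines were damaged instead of failing outright.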

Phase 1: Intake

Pin down the brief in brief.md before doing anything else. The single most expensive mistake in sourcing is scraping the wrong audience well.

Required:

Persona — role/title, seniority, function. Be precise: "senior backend engineer with Rust experience" not "good engineer."
Signals — what publicly-visible behavior identifies them? They contributed to repo X. They posted about Y last quarter. They list Z certification. They lead a meetup on W. Without signals, you're guessing.
N — how many do they want? 20 ≠ 200 in workflow shape.
Purpose — recruiting? sales? podcast guests? user research? This determines the "outreach angle" column entirely.
Geography / language — global? specific country/city? English-only?
Custom fields — anything beyond defaults the user wants captured.
Output preference — xlsx (default), Google Sheet via Drive (if connected), or CSV.
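One way to make the checklist above mechanical is to hold the brief in a small structure and list what is still missing before scraping. The sketch below is an assumption-laden illustration: the Brief class, its field names, and missing_fields are hypothetical, and treating custom fields and output preference as optional is a choice made here, not something the skill prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical container mirroring the intake checklist."""
    persona: str = ""
    signals: list = field(default_factory=list)
    n: int = 0
    purpose: str = ""
    geography: str = ""
    custom_fields: list = field(default_factory=list)  # assumed optional
    output: str = "xlsx"  # xlsx is the stated default


def missing_fields(brief):
    """Return the names of intake fields that still need an answer."""
    gaps = []
    if not brief.persona.strip():
        gaps.append("persona")
    if not brief.signals:
        gaps.append("signals")
    if brief.n <= 0:
        gaps.append("n")
    if not brief.purpose.strip():
        gaps.append("purpose")
    if not brief.geography.strip():
        gaps.append("geography")
    return gaps


if __name__ == "__main__":
    brief = Brief(
        persona="senior MLE",
        signals=["posts about RAG"],
        n=50,
        geography="Bangalore",
    )
    print(missing_fields(brief))  # the gaps are what is worth asking about
```

An empty result means the brief is complete enough to start; a short list of gaps maps directly onto the one or two questions worth asking.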

If the brief is vague ("find me ML people"), ask 1–2 sharp questions before scraping. Don't ask a wall — ask the ones that actually change the search:

"Are you looking to hire them, sell to them, or interview them? It changes who I prioritize."
"Any specific signal — open-source contribs, conference talks, recent job changes — that should weight my search?"

If the user is decisive ("just find me 50 senior MLEs in Bangalore who post about RAG"), skip the questions and go.

Phase 2: Source strategy

Pick platforms based on where the persona actually lives. See references/source-matrix.md for the full decision table; the short version:

Persona	Primary platform	Why
B2B/SaaS buyers, execs, recruiters' candidates	LinkedIn	Self-identified work history, public posts
Devs / technical talent	GitHub + Reddit + X (formerly Twitter) + LinkedIn	Code is the signal; posts are the noise
Indie hackers / founders	X, IndieHackers forum, ProductHunt, LinkedIn	Where they ship and gripe
Security / pentesting	Reddit (r/netsec, r/AskNetsec, r/oscp), X infosec, ctftime, conference speaker pages	Tribe is small, vocal, identifiable
Researchers / academics	Google Scholar, arXiv, ResearchGate, university pages, X	Citations + author pages
Creators / influencers	YouTube, TikTok, Instagram, X	Platform IS the work
Local community / event attendees	Facebook events, Meetup, local subreddits, Eventbrite	Hyperlocal
Journalists / writers	X, Muck Rack, bylines on outlet sites	Bylines = identity

Plan your queries in brief.md before firing them. Write them out as a numbered list so you can reuse and iterate.
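As a rough sketch of that numbered list, the helper below renders (platform, query) pairs as plain-text lines you could paste into brief.md. The function name format_query_plan, the bracketed platform tag, and the duplicate-skipping step are all assumptions added for illustration.

```python
def format_query_plan(queries):
    """Render (platform, query) pairs as a numbered plain-text list."""
    seen = set()
    lines = []
    for platform, query in queries:
        # Normalize case and whitespace so trivially repeated queries collapse.
        key = (platform.lower(), " ".join(query.split()).lower())
        if key in seen:
            continue  # skip repeats so each query is fired once
        seen.add(key)
        lines.append(f"{len(lines) + 1}. [{platform}] {query.strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    plan = format_query_plan([
        ("LinkedIn", '"senior rust" engineer site:linkedin.com europe'),
        ("Reddit", "site:reddit.com/r/rust experience hiring"),
    ])
    print(plan)
```

Keeping the plan as numbered text means a later round can refer to "query 2" when noting what worked and what needs rewording.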

Tools

These are the BrightData tools you'll lean on — they're deferred, so call tool_search first to load them.

Goal	Tool
Find LinkedIn profiles by query	`bd:web_data_linkedin_people_search`
Pull a single LinkedIn profile (full data)	`bd:web_data_linkedin_person_profile`
Pull LinkedIn posts	`bd:web_data_linkedin_posts`
Reddit post + comments	`bd:web_data_reddit_posts`
X (Twitter) posts	`bd:web_data_x_posts`
Instagram profile / posts / reels	`bd:web_data_instagram_profiles` / `_posts` / `_reels`
TikTok profile / posts	`bd:web_data_tiktok_profiles` / `_posts`
YouTube profile / videos / comments	`bd:web_data_youtube_profiles` / `_videos` / `_comments`
Facebook posts / events	`bd:web_data_facebook_posts` / `_events`
Discovery (which subs, which writers, etc.)	`bd:search_engine`, `bd:search_engine_batch`, `bd:discover`
GitHub profiles, personal sites, niche forums, anything else	`bd:scrape_as_markdown` (or `bd:scrape_batch` for ≤10 URLs)

See references/bd-tool-cheatsheet.md for parameter examples.
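Since the table caps bd:scrape_batch at 10 URLs, longer URL lists need to be split before each call. The sketch below covers only the splitting step; chunk_urls is a hypothetical helper, and how the batch tool itself is invoked is left to the cheatsheet.

```python
def chunk_urls(urls, size=10):
    """Split a list of URLs into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    urls = [f"https://example.com/profile/{i}" for i in range(23)]
    for batch in chunk_urls(urls):
        print(len(batch), "URLs in this batch")
```

Order is preserved, so results from each batch can be matched back to the original list by position.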

Phase 3: Discovery — iterative scraping

Sourcing is not "one search and done." It's a loop where each round narrows from where-they-are to who-specifically-they-are.

Round 1 — Locate the watering holes

For each platform you picked, run a discovery query to find the places the persona congregates. Use bd:search_engine_batch to fire several at once.

Example for "senior Rust backend engineers in EU":

"senior rust" engineer site:linkedin.com europe
rust backend site:github.com followers:>200
site:reddit.com/r/rust experience hiring
rusty-days OR rustconf speaker

Don't scrape candidates yet. Just identify where they cluster — the active subreddits, the LinkedIn groups, the conference speaker pages, the GitHub orgs.

Write findings under "Round 1 — Discovery" in brief.md.

Round 2 — Pull candidates from the watering holes

Now actually pull people. Choose the right tool per source:

LinkedIn search results → bd:web_data_linkedin_people_search with structured filters (role, location, current company keywords).
A subreddit thread of "who's hiring" or "who wants a job" → bd:web_data_reddit_posts to get post + commenter usernames + their text.
A conference speaker page → bd:scrape_as_markdown on the speaker URL, then for each speaker name, search their LinkedIn / X.
A GitHub org or repo's contributors page → bd:scrape_as_markdown on /graphs/contributors.
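The source-to-tool choices above can be written down as a small lookup so every round makes the same pick. This is a sketch under assumptions: the source-kind labels (linkedin_search, reddit_thread, and so on) and both helper names are invented here; only the tool names and the /graphs/contributors path come from the list above.

```python
# Hypothetical source-kind labels mapped to the tools named in the list above.
TOOL_BY_SOURCE = {
    "linkedin_search": "bd:web_data_linkedin_people_search",
    "reddit_thread": "bd:web_data_reddit_posts",
    "speaker_page": "bd:scrape_as_markdown",
    "github_contributors": "bd:scrape_as_markdown",
}


def pick_tool(source_kind):
    """Return the tool name for a source kind, or fail loudly if unknown."""
    try:
        return TOOL_BY_SOURCE[source_kind]
    except KeyError:
        raise ValueError(f"no tool mapped for source kind: {source_kind!r}")


def contributors_url(repo_url):
    """Build the contributors page URL for a GitHub repository URL."""
    return repo_url.rstrip("/") + "/graphs/contributors"


if __name__ == "__main__":
    print(pick_tool("reddit_thread"))
    print(contributors_url("https://github.com/rust-lang/rust/"))
```

Failing on an unknown source kind is deliberate: a new kind of watering hole should prompt a conscious tool choice rather than a silent default.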

Append each candidate as a JSONL line to candidates.jsonl immediately.
