OSINT Methodology — External Red-Team Edition
0. When to use this skill / When NOT
Use this skill when:
- Planning or executing external reconnaissance against an authorized target (red team, bug bounty in-scope, ASM engagement).
- Mapping an organization's external attack surface end-to-end (subdomains → assets → exposure → attack paths).
- Investigating a person, entity, or threat actor where evidence discipline matters.
- Tracing cryptocurrency flows, geolocating media, performing image/video forensics, or chronolocating events.
- Building a structured OSINT campaign that needs reproducibility, severity grading, and clean handoffs.
- Producing client-facing deliverables (exec summaries, technical reports, reproduction packages) from offensive engagements.
Do NOT use this skill when:
- The user is asking for active exploitation, post-exploitation, lateral movement, AD privilege escalation, malware development, or anything beyond reconnaissance — those are out of scope.
- The user is asking for blue-team / defensive content (SIEM rules, detection engineering) — different domain.
- The target's authorization is unclear and the user is asking you to act against a third-party asset they don't own — see §1 below; gently surface the scope question before proceeding.
1. Authorization & Legal Posture
This skill is intended for assets the operator owns or has written authorization to assess (red-team rules of engagement, bug-bounty in-scope assets, ASM contracts).
Soft scope check: when a user asks you to act against a target whose authorization isn't established earlier in the conversation, ask once before proceeding:
"Quick scope check: is this a target you own or have written authorization to assess (e.g., a red-team engagement, in-scope bug-bounty asset, or your own infrastructure)? I want to make sure we stay on the right side of the engagement boundary."
Once authorization is asserted, proceed without re-asking. If the user explicitly states the engagement type (e.g., "this is for our pentest of acme.com under contract"), you don't need to ask again.
Always-on guardrails (regardless of authorization):
- Never weaken auth, rate limits, banners, or any safety control that enforces scope on the target side.
- Never run destructive probes (true SYN scans on production, masscan at line rate, fuzzing/brute-force) outside an explicit DEEP / --aggressive mode.
- Never paste real PII, valid credentials, session tokens, API keys, or other secrets into cloud-hosted LLMs or third-party services.
- Never take action against assets outside the documented scope, even if "obviously related" (subsidiaries, vendors, employees' personal accounts, etc.).
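The out-of-scope guardrail can be enforced mechanically before any tool touches a host. The following is a minimal sketch, not a prescribed implementation: the `in_scope` helper and the `SCOPE` set are hypothetical names, and it assumes the scope document lists apex domains whose subdomains are also in scope (adjust if your rules of engagement say otherwise).

```python
def in_scope(host: str, scope: set) -> bool:
    """Return True only if host equals a scoped domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    for entry in scope:
        entry = entry.lower().rstrip(".")
        # Match on a label boundary so "notexample.com" never matches "example.com".
        if host == entry or host.endswith("." + entry):
            return True
    return False


# Hypothetical scope list, for illustration only.
SCOPE = {"example.com", "example.net"}

print(in_scope("api.example.com", SCOPE))       # subdomain of a scoped apex
print(in_scope("example.com.evil.org", SCOPE))  # look-alike, must be rejected
```

Matching on the `"." + entry` suffix rather than a bare substring is the design choice that matters here: it keeps look-alike and "obviously related" hosts out unless they are listed explicitly.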
2. Confidence Levels
Every assertion you make during an engagement should carry a confidence level. Three levels:
| Level | Meaning | Examples |
|---|---|---|
| TENTATIVE | Plausible based on indirect evidence; unverified. | Snippet-only Google dork match; email pattern inferred from name; subdomain returned by one passive source only; favicon-hash overlap (two hosts share a favicon — could be shared infra, could be a coincidence). |
| FIRM | Directly observed but uncorroborated. | Subdomain that resolves to an IP; HEAD-confirmed bucket exists (private); CT-log entry shows certificate; Shodan banner returned. |
| CONFIRMED | Multiple independent corroborations OR directly verified. | Live-validated PMAK token (read-only /me returned 200); breach corpus + crt.sh + DNS all agree; bucket listable AND files retrievable; user enumerated AND password reset flow returns valid hint. |
Rule of three for attribution: require three independent weak signals, OR one strong signal plus one weak signal, before asserting linkage. Never attribute from a single source.
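As a rough illustration of the rule of three, the check below counts signals per source. It is a sketch under stated assumptions: the `Signal` class and `linkage_supported` function are hypothetical, "independent" is approximated by distinct source names, and a strong signal is assumed to count at least as much as a weak one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str    # where the signal came from, e.g. "crt.sh"
    strength: str  # "weak" or "strong"


def linkage_supported(signals) -> bool:
    """Apply the rule of three: 3 weak signals, or 1 strong plus 1 more."""
    # Keep only the strongest signal per source so that repeated hits
    # from one source cannot inflate the count.
    best = {}
    for s in signals:
        if s.strength == "strong" or s.source not in best:
            best[s.source] = s.strength
    strong = sum(1 for v in best.values() if v == "strong")
    total = len(best)
    return total >= 3 or (strong >= 1 and total >= 2)


signals = [
    Signal("favicon-hash", "weak"),
    Signal("shared-analytics-id", "weak"),
]
print(linkage_supported(signals))  # two weak signals alone are not enough
```

Deduplicating by source is deliberate: three hits from the same passive feed are still one signal.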
2.1 Confidence Upgrade Workflows
Confidence isn't static — every TENTATIVE asset should have a documented path to FIRM and to CONFIRMED. Use these per-asset-type rules.
| Asset type | TENTATIVE → FIRM | FIRM → CONFIRMED |
|---|---|---|
| Subdomain | Returned by ≥2 independent passive sources, OR DNS A/AAAA/CNAME resolves successfully. | Serves on a standard port (80/443/22/etc.) AND HTTP banner / TLS cert / SSH banner returned. |
| IP | Discovered via ≥2 sources (passive DNS, ASN lookup, Shodan). | Active probe responds (TCP SYN-ACK on at least one port, or ICMP echo reply). |
| WebApp | URL extracted from JS / API / archive but not yet hit. | HTTP request returns 2xx/3xx/4xx (any non-network-error response) AND content-length > 0. |
| Email | Generated from a name pattern OR returned by snippet-only dork. | Listed in Hunter.io / EmailRep / IntelX / breach corpus, OR MAIL FROM/RCPT TO SMTP probe returns 250 (without delivery; abort at DATA). |
| Bucket (S3/GCS/Azure) | Permutation candidate; no probe yet. | HEAD returns 200, 301, or 403 (existence confirmed). Then CONFIRMED when GET returns object listing or known object retrieval. |
| Endpoint (API / wayback) | Extracted from JS regex / Wayback / Postman. | HTTP request returns non-404 (route exists). Then CONFIRMED when the endpoint's behavior is fingerprinted (auth posture, response shape, rate limits). |
| Credential / secret | Matches catalog regex in captured text. | Read-only validator (/me, auth.test, sts:GetCallerIdentity, /user) returns success. Then CONFIRMED with documented scope + account ID. |
| Person | Name extracted from a single source (LinkedIn / breach / GitHub commit). | Confirmed by a second source (Hunter.io role + LinkedIn profile, or two breach sources with same email). |
| Repo | Name match on org keyword in GitHub search. | Repo metadata shows confirmed org/email/website match. Then CONFIRMED when commit-history shows employee involvement. |
| Mobile app | Name match in app store. | Ownership-confidence score ≥70 (see companion skill §21). Then CONFIRMED when binary metadata (signing cert, package name, dev account) ties back to target. |
| Certificate | Returned by crt.sh once. | CT-log entry confirmed in ≥2 logs. Then CONFIRMED when serving on a discovered host. |
| SSO tenant | Discovery-endpoint returns OIDC metadata. | Tenant GUID extracted AND domain resolves through the tenant's expected MX / autodiscover / SPF record. |
Default reporting posture: never claim CONFIRMED without explicit corroboration. When in doubt, downgrade. Operators trust under-claims more than over-claims.
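The upgrade rules can be encoded as small pure functions so grading stays consistent across operators. The sketch below covers only the Subdomain and Bucket rows; the `Confidence` enum, function names, and parameter names are hypothetical, and the observations are assumed to have been collected already by whatever tooling the engagement uses.

```python
from enum import IntEnum
from typing import Optional


class Confidence(IntEnum):
    TENTATIVE = 1
    FIRM = 2
    CONFIRMED = 3


def grade_subdomain(passive_sources: int, resolves: bool,
                    serves_on_port: bool, banner_seen: bool) -> Confidence:
    """Subdomain row: >=2 passive sources or DNS resolution gives FIRM;
    a listening standard port plus a returned banner/cert gives CONFIRMED."""
    firm = passive_sources >= 2 or resolves
    if firm and serves_on_port and banner_seen:
        return Confidence.CONFIRMED
    return Confidence.FIRM if firm else Confidence.TENTATIVE


def grade_bucket(head_status: Optional[int], listed_or_retrieved: bool) -> Confidence:
    """Bucket row: HEAD 200/301/403 shows existence (FIRM); a listing or a
    known-object retrieval via GET gives CONFIRMED."""
    exists = head_status in (200, 301, 403)
    if exists and listed_or_retrieved:
        return Confidence.CONFIRMED
    return Confidence.FIRM if exists else Confidence.TENTATIVE


print(grade_subdomain(1, False, False, False).name)
print(grade_bucket(403, False).name)
```

Both functions fall back to the lower level whenever a prerequisite is missing, which matches the "when in doubt, downgrade" posture.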
3. Output Format Conventions
When you produce findings during an active session, structure each finding to match the schema below — it drops cleanly into asset-management tools.
    Finding:
      id: <stable hash or UUID>
      module: <which technique discovered it; "manual" if hand-found>
      asset_key: <typed key, e.g. sub:api.example.com or webapp:https://example.com/admin>
      category: <e.g. SECRET_LEAK, MISSING_HSTS, OPEN_GRAPHQL_API, LEAKED_CRED, SSO_EXPOSURE>
      severity: <info|low|medium|high|critical>
      confidence: <tentative|firm|confirmed>
      title: <one-line summary>
      description: <2-5 sentences>
      evidence:
        url: <where it was found>
        timestamp: <UTC ISO8601>
        sha256: <hash of any downloaded artifact>
        raw: <truncated to 2 KiB>
      references:
        - <CVE-ID, advisory URL, vendor doc>
      remediation: <action the asset owner can take>
Always use UTC timestamps. Local time creates correlation bugs across notes/screenshots/logs.
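One way to keep findings consistent with the schema is to build them through a single helper that validates the enumerated fields, stamps a timezone-aware UTC time, and truncates the raw evidence. This is a sketch only: `build_finding` is a hypothetical name, deriving the id from `asset_key` plus `category` is an assumption (the schema allows any stable hash or UUID), and the example values are invented for illustration.

```python
import hashlib
from datetime import datetime, timezone

SEVERITIES = {"info", "low", "medium", "high", "critical"}
CONFIDENCES = {"tentative", "firm", "confirmed"}
MAX_RAW_BYTES = 2048  # 2 KiB cap on the raw evidence field


def build_finding(*, module, asset_key, category, severity, confidence,
                  title, description, url, raw, artifact,
                  references=(), remediation=""):
    """Assemble one finding dict whose keys mirror the schema."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    if confidence not in CONFIDENCES:
        raise ValueError(f"unknown confidence: {confidence!r}")
    # Assumption: a stable id is derived by hashing asset_key + category.
    finding_id = hashlib.sha256(f"{asset_key}|{category}".encode()).hexdigest()[:16]
    return {
        "id": finding_id,
        "module": module,
        "asset_key": asset_key,
        "category": category,
        "severity": severity,
        "confidence": confidence,
        "title": title,
        "description": description,
        "evidence": {
            "url": url,
            # Timezone-aware UTC timestamp in ISO 8601 form.
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "sha256": hashlib.sha256(artifact).hexdigest(),
            # Truncate the bytes first, then decode for storage.
            "raw": raw[:MAX_RAW_BYTES].decode("utf-8", errors="replace"),
        },
        "references": list(references),
        "remediation": remediation,
    }


# Hypothetical example values.
response = b"HTTP/1.1 200 OK\r\nServer: nginx\r\n\r\n"
example = build_finding(
    module="manual",
    asset_key="webapp:https://example.com/admin",
    category="MISSING_HSTS",
    severity="low",
    confidence="firm",
    title="Admin panel served without HSTS",
    description="The response carried no Strict-Transport-Security header.",
    url="https://example.com/admin",
    raw=response,
    artifact=response,
    remediation="Send a Strict-Transport-Security header on all HTTPS responses.",
)
```

Using `datetime.now(timezone.utc)` rather than a naive local time means the offset is written into the string itself, so notes, screenshots, and logs correlate without guesswork.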
4. Source Hygiene & Citations
For every artifact you capture, record: URL + UTC timestamp + SHA-256 hash + tool version + run_id.
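A capture record with those five fields might be written as one JSON object per line, as sketched here. The function names, the `tool` field, and the sample values are hypothetical; an in-memory stream stands in for the log file so the sketch has no side effects.

```python
import hashlib
import io
import json
import uuid
from datetime import datetime, timezone


def evidence_record(url, body, tool, tool_version, run_id):
    """Build one capture record: URL, UTC timestamp, SHA-256, tool version, run_id."""
    return {
        "run_id": run_id,
        "url": url,
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "sha256": hashlib.sha256(body).hexdigest(),
        "tool": tool,
        "tool_version": tool_version,
    }


def append_jsonl(stream, record):
    """Write exactly one JSON object per line."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")


run_id = str(uuid.uuid4())  # one id shared by every event in the engagement run
log = io.StringIO()         # stand-in for an append-only log file
append_jsonl(log, evidence_record("https://example.com/robots.txt",
                                  b"User-agent: *\n", "curl", "8.5.0", run_id))
append_jsonl(log, evidence_record("https://example.com/sitemap.xml",
                                  b"<urlset/>", "curl", "8.5.0", run_id))
lines = log.getvalue().splitlines()
```

Filtering the log on a single `run_id` then yields every captured artifact for that run, in order.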
- Hash all downloaded files with SHA-256.
- Screenshot in PNG (lossless, smaller than full-page WARC for evidence packs).
- Capture raw HTTP requests/responses, capped at 2 KiB body to keep evidence packs small.
- Use JSONL (NDJSON) logs, one line per event, with a run_id so the entire engagement is replayable.
- Separate evidence read-only fro