investigate-repo

Security

Audit a third-party code repository (usually a GitHub URL) for security problems, malicious code, supply-chain risk, and low-quality / dangerous patterns before adopting or running it. Use when the user asks to "investigate this repo", "is this repo safe", "audit this GitHub repo", "look for malicious code in", "vet this dependency", or shares a repo URL and asks whether it's trustworthy.

1 star
Author: zcaceres · License: MIT

Deep security and quality audit of an unfamiliar code repository. Clone locally, walk the tree, and report concrete findings with file:line citations. Goal: give the user a defensible "safe / suspicious / dangerous" verdict, not a vibe check.

When to Use This Skill

Trigger when the user:

  • Shares a GitHub/GitLab/Bitbucket URL and asks whether it's safe, malicious, legit, trustworthy, or worth using
  • Says "investigate this repo", "audit this repo", "look for security problems in", "check for malicious code", "vet this dependency / package / SDK"
  • Asks for a code-quality assessment of an unfamiliar third-party project
  • Is about to install, fork, or run code from a repo they don't already trust

Do NOT use for: the user's own repo, code review of a PR, refactoring tasks, or general "is library X good" questions where no specific repo is named.

Operating Principles

  • Evidence over vibes. Every finding must cite a file path and (where possible) line numbers.
  • Read-only on the clone. Never edit, run, install, or execute code from the target repo. No npm install, no pip install, no bun install, no running scripts. Static analysis only.
  • Clone to a throwaway location under /tmp or $TMPDIR. Never clone into the user's working directory.
  • Bias toward "explain what's actually there", not toward exoneration or condemnation. If something looks bad but is benign on inspection, say so and show why.
  • Time-box. A deep audit on a medium repo should finish in one pass — don't grep the same patterns repeatedly across the tree.

Workflow

1. Resolve the target

If the user gave a URL, extract owner/repo. If they gave only a name, ask which repo (1 question, enumerated).
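
A minimal sketch of the URL-to-owner/repo step, assuming a two-segment GitHub-style path. The function name repo_slug is hypothetical, and nested GitLab groups (group/subgroup/repo) would need extra handling.

```shell
# repo_slug URL -> prints "owner/repo", or returns 1 if none can be extracted.
# Hypothetical helper for illustration; handles https and git@host:owner/repo forms.
repo_slug() {
  url=$1
  url=${url%%\?*}        # drop any query string
  url=${url%%\#*}        # drop any fragment
  url=${url%/}           # drop one trailing slash
  url=${url%.git}        # drop a trailing .git
  case "$url" in
    git@*:*) path=${url#*:} ;;                       # git@github.com:owner/repo
    *://*)   path=${url#*://}; path=${path#*/} ;;    # strip scheme, then host
    *)       path=$url ;;                            # already owner/repo, or a bare name
  esac
  owner=${path%%/*}
  rest=${path#*/}
  repo=${rest%%/*}
  # A bare name has no slash, so owner equals the whole path: reject it.
  [ -n "$owner" ] && [ -n "$repo" ] && [ "$owner" != "$path" ] || return 1
  printf '%s/%s\n' "$owner" "$repo"
}
```

A non-zero return is the cue to ask the user which repo they mean.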

Fetch repo metadata before cloning so you can size the job and spot red flags:

gh repo view <owner>/<repo> --json name,description,createdAt,updatedAt,pushedAt,stargazerCount,forkCount,isArchived,isFork,licenseInfo,defaultBranchRef,primaryLanguage,diskUsage 2>/dev/null

If gh isn't available or the repo is private/inaccessible, fall back to WebFetch on the GitHub repo page.

Flag immediately (but keep going):

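  • Account age < 6 months (createdAt in the metadata above is the repo's age; the owner's account age needs a separate lookup)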
  • Last push > 2 years ago combined with active "use me" framing
  • Very few stars but aggressive marketing language in README
  • No license / unusual license
  • Repo is a fork of a more popular project (possible typosquat)
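
Some of these flags can be checked mechanically once the metadata is in hand. The sketch below assumes the values have already been pulled out of the gh JSON and are passed as plain arguments. months_between and flag_metadata are hypothetical helper names, and createdAt is the repo's creation date, used here only as a stand-in for account age. The marketing-language and typosquat checks still need a human read.

```shell
# Whole months between two dates that both start with YYYY-MM-DD.
# The "( ... )" body runs in a subshell so the variables do not leak.
months_between() (
  y1=$(printf '%s' "$1" | cut -c1-4); m1=$(printf '%s' "$1" | cut -c6-7); d1=$(printf '%s' "$1" | cut -c9-10)
  y2=$(printf '%s' "$2" | cut -c1-4); m2=$(printf '%s' "$2" | cut -c6-7); d2=$(printf '%s' "$2" | cut -c9-10)
  # Strip one leading zero so "08" is not read as invalid octal.
  m1=${m1#0}; d1=${d1#0}; m2=${m2#0}; d2=${d2#0}
  months=$(( (y2 - y1) * 12 + (m2 - m1) ))
  if [ "$d2" -lt "$d1" ]; then months=$(( months - 1 )); fi
  echo "$months"
)

# flag_metadata CREATED_AT PUSHED_AT LICENSE IS_FORK TODAY
# LICENSE is empty when the repo has none; TODAY is YYYY-MM-DD and is passed in
# so the function stays deterministic.
flag_metadata() {
  created=$1 pushed=$2 license=$3 is_fork=$4 today=$5
  [ "$(months_between "$created" "$today")" -lt 6 ] && echo "FLAG: repo created less than 6 months ago"
  [ "$(months_between "$pushed" "$today")" -ge 24 ] && echo "FLAG: last push more than 2 years ago"
  [ -z "$license" ] && echo "FLAG: no license"
  [ "$is_fork" = "true" ] && echo "FLAG: repo is a fork"
  return 0
}
```

No output means none of the mechanical flags fired; it does not mean the repo is safe.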

2. Clone shallow into a temp dir

TARGET_DIR="$(mktemp -d -t investigate-repo-XXXXXX)"
git clone --depth 50 --no-tags https://github.com/<owner>/<repo>.git "$TARGET_DIR/repo"

Record $TARGET_DIR and use it for every subsequent command. Never cd into it for edits.
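
Because the clone is meant to be disposable, it can help to remove it once the audit ends. A minimal sketch, assuming the directory came from the mktemp call above; cleanup_clone is a hypothetical name, and the guard simply refuses anything that does not look like that temp directory.

```shell
# cleanup_clone DIR -> removes DIR only if it looks like our mktemp directory.
cleanup_clone() {
  dir=$1
  tmp_root=${TMPDIR:-/tmp}
  case "$dir" in
    */..|*/../*)
      # Never follow ".." out of the temp area.
      echo "refusing to delete path containing '..': $dir" >&2
      return 1 ;;
    "${tmp_root%/}"/investigate-repo-*|/tmp/investigate-repo-*)
      rm -rf -- "$dir" ;;
    *)
      echo "refusing to delete unexpected path: $dir" >&2
      return 1 ;;
  esac
}
```

One way to wire it in is `trap 'cleanup_clone "$TARGET_DIR"' EXIT` right after the mktemp line.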

3. Map the repo

# Top-level layout
ls -la "$TARGET_DIR/repo"

# File counts by extension (helps spot bundled minified blobs)
find "$TARGET_DIR/repo" -type f -not -path '*/.git/*' \
  | sed -nE 's/.*\.([A-Za-z0-9]+)$/\1/p' | sort | uniq -c | sort -rn | head -30

# Largest files (often where obfuscated payloads hide)
find "$TARGET_DIR/repo" -type f -not -path '*/.git/*' -printf '%s %p\n' 2>/dev/null \
  | sort -rn | head -20

On macOS find -printf is unavailable — use find ... -exec stat -f '%z %N' {} \; instead.
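
The GNU/BSD difference can be hidden behind one helper. A sketch, assuming either GNU find (with -printf) or BSD stat (with -f) is present; largest_files is a hypothetical name.

```shell
# largest_files DIR [N] -> prints "size path" for the N largest files (default 20).
largest_files() {
  dir=$1
  n=${2:-20}
  # Probe for -printf support; BSD find rejects it with a non-zero exit.
  if find "$dir" -maxdepth 0 -printf '' 2>/dev/null; then
    find "$dir" -type f -not -path '*/.git/*' -printf '%s %p\n'
  else
    find "$dir" -type f -not -path '*/.git/*' -exec stat -f '%z %N' {} +
  fi | sort -rn | head -n "$n"
}
```

Usage: `largest_files "$TARGET_DIR/repo" 20`.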

4. Read the headline files first

In this order, read fully (not just skim):

  1. README* — what does it claim to do?
  2. package.json / pyproject.toml / Cargo.toml / go.mod / requirements.txt — declared deps and scripts
  3. .github/workflows/* — CI definitions; these run on GitHub-hosted or self-hosted runners and may have access to repository secrets
  4. Any install*, setup*, postinstall*, preinstall* scripts
  5. Entry point(s) named in the manifest (main, bin, etc.)
  6. LICENSE

Compare README claims against actual code surface area. A "simple browser SDK" with native binaries, install scripts, and outbound HTTP calls is a smell.

5. Run the malicious-pattern sweep

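Use ripgrep (rg) if installed, otherwise grep -rEn. Search the whole tree minus .git, node_modules, dist, build, vendor. ripgrep also skips binary files and anything matched by .gitignore by default, so add --no-ignore when committed-but-ignored files should be searched too. For each hit, open the file and read the context; do not report raw grep lines as findings.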
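
A sketch of the rg-or-grep fallback, assuming patterns are written with POSIX character classes so both tools accept them. sweep is a hypothetical wrapper name, and the --no-ignore flag in the rg branch is a choice made for this sketch rather than part of the commands in this section.

```shell
# sweep DIR PATTERN -> prints path:line:text for each match, using rg when available.
sweep() {
  dir=$1
  pattern=$2
  if command -v rg >/dev/null 2>&1; then
    # --hidden: include dotfiles; --no-ignore: do not honour .gitignore rules.
    rg -n --hidden --no-ignore -g '!.git' -g '!node_modules' -e "$pattern" "$dir"
  else
    grep -rEn --exclude-dir=.git --exclude-dir=node_modules -e "$pattern" "$dir"
  fi
}
```

For example, `sweep "$TARGET_DIR/repo" 'eval[[:space:]]*\('` finds eval calls with either tool. Both exit with status 1 when nothing matches, which is not an error here.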

Code execution / dynamic loading:

rg -n --hidden -g '!.git' -g '!node_modules' -g '!dist' -g '!build' \
  -e '\beval\s*\(' \
  -e 'new\s+Function\s*\(' \
  -e 'child_process' -e 'execSync' -e 'spawnSync' -e 'execFile' \
  -e 'os\.system' -e 'subprocess\.(Popen|call|run)' -e '__import__\s*\(' \
  -e 'pickle\.loads' -e 'marshal\.loads' -e 'yaml\.load\b' \
  -e 'Runtime\.getRuntime\(\)\.exec' \
  "$TARGET_DIR/repo"

Network exfiltration / beacons:

rg -n --hidden -g '!.git' -g '!node_modules' \
  -e 'fetch\s*\(' -e 'axios\.' -e 'XMLHttpRequest' \
  -e 'requests\.(get|post)' -e 'urllib' -e 'http\.client' \
  -e 'net\.Socket' -e 'dgram' -e 'WebSocket\s*\(' \
  -e '\bhttps?://[^"'\'' ]+' \
  "$TARGET_DIR/repo"

Then sort -u the URLs and ask: are these documented? Do they go to the project's own domain, or somewhere unexplained?
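
The sweep above prints whole matching lines, so getting a deduplicated list takes one more step. A sketch using plain grep and sed, assuming hostnames are enough for a first pass; list_url_hosts is a hypothetical name.

```shell
# list_url_hosts DIR -> prints each distinct host that appears in an http(s) URL.
list_url_hosts() {
  # -o: only the matched URL, -h: no file names, -I: skip binary files.
  grep -rEohI --exclude-dir=.git --exclude-dir=node_modules \
    "https?://[^\"' )>]+" "$1" \
    | sed -E 's#^https?://([^/:]+).*#\1#' \
    | sort -u
}
```

Any host that is neither the project's own domain nor a documented service is worth tracing back to the file that references it.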

Credential / token harvesting:

rg -n --hidden -g '!.git' -g '!node_modules' \
  -e 'process\.env\b' -e 'os\.environ' \
  -e '\.npmrc' -e '\.aws/credentials' -e 'id_rsa' -e '\.ssh/' \
  -e 'keychain' -e '[lL]ocalStorage' -e 'document\.cookie' \
  -e 'Authorization\s*[:=]' -e 'Bearer\s+' \
  "$TARGET_DIR/repo"

Obfuscation:

# Long base64-looking strings
rg -n '[A-Za-z0-9+/]{200,}={0,2}' "$TARGET_DIR/repo"
# \x-encoded blobs
rg -n '(\\x[0-9a-fA-F]{2}){20,}' "$TARGET_DIR/repo"
# Hex blobs
rg -n '[0-9a-fA-F]{200,}' "$TARGET_DIR/repo"
# Minified JS shipped as source (extremely long lines)
find "$TARGET_DIR/repo" -name '*.js' -not -path '*/node_modules/*' \
  -exec awk 'length>5000 {print FILENAME":"NR":"length; nextfile}' {} \;

Install-time / build-time hooks (highest risk — run with user privileges on npm install):

# package.json scripts
rg -n -g '*.json' '"(preinstall|install|postinstall|prepare)"\s*:' "$TARGET_DIR/repo"

# Python equivalents
rg -n -e 'setup\.py' -e 'cmdclass' -e 'PostInstallCommand' "$TARGET_DIR/repo"

Binary payloads / native code:

find "$TARGET_DIR/repo" -type f \
  \( -name '*.exe' -o -name '*.dll' -o -name '*.so' -o -name '*.dylib' \
     -o -name '*.node' -o -name '*.wasm' -o -name '*.bin' \) \
  -not -path '*/node_modules/*' -not -path '*/.git/*'

For each binary, check whether it's referenced from code, whether a source build exists, and whether the README explains its provenance.
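
The "is it referenced from code" check can be roughed out with a filename search. A sketch, assuming a literal match on the file's basename is a good enough first signal; binary_refs is a hypothetical name, and a count of 0 only means no text file mentions the name, not that the binary is unused.

```shell
# binary_refs DIR -> prints "<count><TAB><path>" for each binary-looking file,
# where count is the number of text files that mention its basename.
binary_refs() {
  dir=$1
  find "$dir" -type f \
    \( -name '*.exe' -o -name '*.dll' -o -name '*.so' -o -name '*.dylib' \
       -o -name '*.node' -o -name '*.wasm' -o -name '*.bin' \) \
    -not -path '*/node_modules/*' -not -path '*/.git/*' |
  while IFS= read -r bin; do
    name=$(basename "$bin")
    # -I skips binary files, -l lists matching files, -F treats the name literally.
    refs=$(grep -rIlF --exclude-dir=.git -- "$name" "$dir" | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$refs" "$bin"
  done
}
```

Binaries with zero references and no build instructions in the README deserve the closest look.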

6. Dependency review

# Node
[ -f "$TARGET_DIR/repo/package.json" ] && head -100 "$TARGET_DIR/repo/package.json"
# Python
[ -f "$TARGET_DIR/repo/requirements.txt" ] && cat "$TARGET_DIR/repo/requirements.txt"
[ -f "$TARGET_DIR/repo/pyproject.toml" ] && cat "$TARGET_DIR/repo/pyproject.toml"

For each declared dependency, flag:

  • Git URLs or tarball URLs (bypasses registry review)
  • Typosquats of popular packages (e.g. reqests, lodahs)
  • Pinned to a single commit hash with no version
  • Dependencies you've never heard of doing low-level system work
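
The first of these flags lends itself to a quick scan. A sketch for package.json, assuming the file is pretty-printed with one dependency per line (minified JSON would need a real parser); flag_nonregistry_deps is a hypothetical name. Typosquats and unfamiliar packages still need a manual read.

```shell
# flag_nonregistry_deps FILE -> prints "file:line: text" for dependency entries
# whose version is a git, URL, or file: spec instead of a registry version.
flag_nonregistry_deps() {
  awk '
    # Entering a dependency block; stay out if it also closes on the same line.
    /"(dependencies|devDependencies|optionalDependencies|peerDependencies)"[ \t]*:[ \t]*[{]/ {
      in_deps = ($0 !~ /[}]/); next
    }
    in_deps && /[}]/ { in_deps = 0; next }
    in_deps && /:[ \t]*"(git[+]|git:\/\/|github:|https?:\/\/|file:)/ {
      print FILENAME ":" FNR ": " $0
    }
  ' "$1"
}
```

Usage: `flag_nonregistry_deps "$TARGET_DIR/repo/package.json"`.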

Do not run npm audit / pip-audit — that requires install. Note the limitation in the report.

7. Git history sanity check

# Recent commit cadence and authors
git -C "$TARGET_DIR/repo" log --pretty='%h %ad %ae %s' --date=short -n 30

# Commit messages that hint at history rewrites (heuristic only: a force-push itself leaves no marker in the log)
git -C "$TARGET_DIR/repo" log --all --pretty='%h %s' -n 50 | grep -iE 'force|rewrite|squash' | head

# Single-author repo? (concentration risk)
git -C "$TARGET_DIR/repo" shortlog -sne HEAD | head
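
To turn the shortlog output into a single number, the top author's share of commits can be computed with awk. A sketch, assuming the "count, then name" layout that shortlog -sn prints; top_author_share is a hypothetical name, and with a --depth 50 clone the figure only covers the commits that were fetched.

```shell
# Reads `git shortlog -sne` output on stdin and prints the top author's
# share of commits as a whole-number percentage.
top_author_share() {
  awk '
    { total += $1; if ($1 > max) max = $1 }
    END { if (total > 0) printf "%d\n", (100 * max) / total; else print 0 }
  '
}
```

Usage: `git -C "$TARGET_DIR/repo" shortlog -sne HEAD | top_author_share`.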

8. Synthesize the report

Output a markdown report in the chat with this exact structure:

# investigate-repo: <owner>/<repo>

**Verdict:** SAFE / SUSPICIOUS / DANGEROUS / INCONCLUSIVE — one line of justification.

## Repo at a glance
- Stars / forks / age / last push / license / primary language
- Stated purpose (1 sentence from README)
- What the code a

How to add

/plugin marketplace add zcaceres/claude-investigate-repo

The exact command may vary by repository. Check the README on GitHub.
