
codebase-onboarding

Data and Analysis

Systematic orientation in an unfamiliar codebase. Use when joining a new team's repo, returning to your own old code after months away, or evaluating an OSS project before contributing. Builds a verified mental model — what the system does, where data flows, what the implicit conventions are, and which files are dangerous to touch first — producing a living CODEBASE.md with active modes for PR pre

Author: googlarz · License: MIT

Codebase Onboarding

Systematic orientation. Stop guessing. Build the right mental model before touching anything — then keep it live as you work.

How this works: Claude runs the investigation — executes commands, reads files, traces paths — and writes CODEBASE.md as a living orientation document. The human provides the repository and answers questions that can't be found in the code. Think of it as pair programming where Claude does the archaeology and you provide context that only humans have.


When to Use

| Situation | Mode |
| --- | --- |
| Joining a new team or repo for the first time | join |
| Returning to your own code after 3+ months away | return |
| Evaluating an OSS project before contributing | audit |
| Need "what do I avoid" in 15 minutes, with no time for full investigation | quick |
| Leaving this codebase: write the handoff document you wish existed | sunset |
| About to modify a specific file mid-ramp | touch |
| About to push a PR: catch issues before review | preflight |
| Assigned a ticket or feature: map it to the codebase | task |

Default to join if unclear. quick is a triage tool — not a substitute for full orientation. touch, preflight, and task are ongoing modes — they require an existing CODEBASE.md from a prior session.
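
The rule above (ongoing modes need a CODEBASE.md from a prior session, and join is the default) can be sketched as a small shell guard. This is only an illustration: the function name `require_codebase_md` and its messages are hypothetical and not part of the skill.

```shell
# Hypothetical guard: touch, preflight, and task need an existing CODEBASE.md.
# Usage: require_codebase_md MODE [REPO_DIR]
require_codebase_md() {
  local mode="${1:-join}"   # default to join when no mode is given
  local repo="${2:-.}"
  case "$mode" in
    touch|preflight|task)
      if [ ! -f "$repo/CODEBASE.md" ]; then
        echo "mode '$mode' needs an existing CODEBASE.md; run join first" >&2
        return 1
      fi
      ;;
  esac
  # Print the mode that will actually run.
  echo "$mode"
}
```

Calling `require_codebase_md preflight` in a repo that has never been through an orientation session fails with a hint, while `require_codebase_md join` always passes.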


Intake: Ask First

Before running any orientation phase (join / return / audit), ask two questions. The answers reshape every phase that follows.

Question 1: Technical profile

Ask:

"Are you a developer who can read code and run terminal commands, or are you non-technical — a PM, designer, analyst, or executive who needs to understand the system without diving into the code itself?"

Then explain the difference:

If you're technical: I'll run shell commands, read source files, trace execution paths, and map git history. Output includes code snippets, file paths, and conventions — things you can act on directly. You'll also get a local dev guide and PR pre-flight support.

If you're non-technical: I'll run all the same investigation but translate everything into plain language. No code in the output. You'll get a visual architecture diagram, priority-ranked questions for your next engineering meeting, and an executive brief you can share with stakeholders.


Question 2: Goal

Wait for the answer to Question 1, then tailor the examples:

If technical:

  • Make a contribution or fix a specific bug
  • Take ownership — become the go-to maintainer
  • Review for quality, security, or architecture concerns
  • Evaluate an OSS project before contributing
  • Get up to speed after being away for months

If non-technical:

  • Understand what the system does and how it fits together
  • Assess risk before a launch, acquisition, or vendor decision
  • Identify what's slowing the team down
  • Have a more informed conversation with engineers
  • Prepare for a roadmap, sprint planning, or board conversation

Profile + Goal → what changes:

| Profile + Goal | What changes |
| --- | --- |
| Technical + contribute | Full workflow: Phases 0–7, local dev guide, Phase 8 |
| Technical + own/maintain | Full depth; extra attention to Danger Zones and authorship |
| Technical + review | Phases 0–6; security/quality lens; skip Phase 8 |
| Technical + evaluate OSS | audit mode: contributor signal, merge rate, PR velocity |
| Non-technical + understand | Phases 0–6; plain language; diagram; executive brief |
| Non-technical + decide | Phases 0–6 + recommendation section in executive brief |
| Non-technical + evaluate | audit mode; go/no-go framing in executive brief |

Large codebases (>100k LOC): After Phase 0, ask: "Which subsystem or area is most relevant to your goal?" Scope Phases 1–4 to that area. Investigating a 500k-line Rails monolith end-to-end produces noise, not orientation.
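
One rough way to check the 100k LOC threshold is to count lines across tracked files. The sketch below assumes a git checkout; the helper names `count_tracked_lines` and `needs_scoping` are hypothetical. The count includes every tracked file (docs, fixtures, lockfiles), so treat it as an upper bound rather than a precise LOC figure.

```shell
# Hypothetical helper: rough line count over all files tracked by git.
# Non-code files are included, so this overestimates real LOC.
count_tracked_lines() {
  (cd "${1:-.}" && git ls-files -z | xargs -0 cat 2>/dev/null | wc -l | tr -d ' ')
}

# Succeeds when the count exceeds the 100k threshold from the text.
needs_scoping() {
  local loc="$1"
  [ "$loc" -gt 100000 ]
}
```

For example, `needs_scoping "$(count_tracked_lines .)" && echo "ask which subsystem matters"` prompts for a narrower scope only on large repositories.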


Phase Order by Mode

| Phase | join | return | audit | quick | sunset |
| --- | --- | --- | --- | --- | --- |
| 0: Bootstrap | ✓ first | ✓ first | ✓ first | ✓ first | ✓ first |
| 1: Critical Paths | | | | skip | skip |
| 2: Conventions | | ✓ after Phase 9 | | skip | skip |
| 3: Danger Zones | | ✓ after Phase 9 | | | |
| 4: Gotcha Detector | | | | | |
| 5: Local Dev Guide | technical only | technical only | skip | skip | skip |
| 6: Team Questions | technical: 1:1 format; non-technical: meeting format | technical: 1:1 format; non-technical: meeting format | technical: 1:1 format; non-technical: meeting format | skip | skip |
| 7: Executive Brief | non-technical only | non-technical only | non-technical only | skip | skip |
| 8: First Contribution | technical only | technical only | skip | skip | skip |
| 8b: Ramp-up Timeline | technical only | technical only | skip | skip | skip |
| 9: Archaeology | skip | ✓ before Phase 2 | skip | skip | skip |
| 10: Contributor Signal | skip | skip | | skip | skip |
| 11: Sunset | skip | skip | skip | skip | |

In return mode: run Phase 9 (Archaeology) immediately after Phase 1. In quick mode: no CODEBASE.md written — output is a single briefing. In sunset mode: produces a Handoff Document, not a CODEBASE.md update.


Output: CODEBASE.md

CODEBASE.md
├── What This Is          # one-paragraph system description
├── Architecture Map      # Mermaid diagram + component description
├── Critical Paths        # entry points → processing → exit
├── External Integrations # third-party APIs, queues, webhooks — what needs mocking locally
├── Local Dev Guide       # technical only: step-by-step to get it running
├── Conventions           # implicit rules the README doesn't mention
├── Danger Zones          # what not to touch first, and why
├── Gotchas               # what silently burns new contributors
├── Team Questions        # technical: 1:1 format | non-technical: meeting format
├── Executive Brief       # non-technical only: one-page health summary
├── Ramp-up Timeline      # technical only: week-by-week gates derived from findings
├── Open Questions        # still unclear — actively maintained
└── Contribution Log      # join/return: changes + learnings
                          # audit: merge rate, PR velocity, go/no-go

Confidence calibration

Every section carries a confidence tag:

| Tag | Meaning |
| --- | --- |
| ✅ Verified | Based on CI config, git history, or explicit documentation |
| ⚠️ Inferred | Based on patterns; likely but not confirmed |
| ❓ Gap | Couldn't assess from code; needs human confirmation |

Gap sections automatically feed into Team Questions. If you wrote ❓, there must be a corresponding question.
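
A quick way to audit this rule is to list every line in CODEBASE.md that carries the ❓ tag and compare the list against Team Questions by hand. The sketch below assumes the tag appears literally in the file, which the skill does not specify; `list_gaps` and `count_gaps` are hypothetical names.

```shell
# Hypothetical check: print every line tagged as a gap, with line numbers.
# Assumes the ❓ marker is written literally in the file.
list_gaps() {
  grep -n '❓' "$1" || true   # no matches is not an error here
}

# Number of gap-tagged lines; each should map to a Team Question.
count_gaps() {
  list_gaps "$1" | wc -l | tr -d ' '
}
```

Running `list_gaps CODEBASE.md` at the end of a phase gives a checklist of gaps that still need a matching question.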

Update CODEBASE.md at the end of each phase. Do not defer.


Phase 0: Bootstrap

1. README.md / README.rst    → what does it claim to do?
2. CLAUDE.md / AGENTS.md     → what has an AI already learned here?
3. CONTRIBUTING.md           → what does the team care about?
4. package.json / go.mod /
   pyproject.toml / Cargo.toml → language, deps, run scripts
5. Makefile / justfile        → available commands
6. .github/workflows/         → what CI runs — the ground truth

CI is the most honest documentation. If it conflicts with the README, CI wins.
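
As a starting point for the reading list above, a small loop can report which of those files a repository actually has. This is a minimal sketch; the function name `bootstrap_files` is hypothetical, and the file list is the one given in Phase 0.

```shell
# Hypothetical helper: print the Phase 0 files that exist in a repo,
# in the same order as the reading list.
bootstrap_files() {
  local repo="${1:-.}" f
  for f in README.md README.rst CLAUDE.md AGENTS.md CONTRIBUTING.md \
           package.json go.mod pyproject.toml Cargo.toml \
           Makefile justfile .github/workflows; do
    [ -e "$repo/$f" ] && echo "$f"
  done
  return 0
}
```

Anything missing from the output is itself a finding: for instance, no `.github/workflows` means the CI ground truth lives somewhere else or does not exist.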

AI-Generated Codebase Detection

Check these signals before the rest of Phase 0. AI-generated codebases have different failure modes: surface quality looks fine, but error handling is thin, edge cases aren't covered, and tests pass because they only exercise the happy path.

# Thin or compressed history
git log --format="%ad" --date=short | wc -l          # total commits
git log --format="%ae" | sort -u | wc -l             # distinct authors
git log --format="%ad" --date=short | tail -1        # first commit date

# Generic commit message signatures
git log --format="%s" | grep -ciE \
  "^(add|fix|update|initial commit|feat|implement|create|ref

How to add

/plugin marketplace add googlarz/codebase-onboarding

The exact command may vary by repository. Check the README on GitHub.
