Explore Repo

Explore a codebase the way experienced contributors do: by understanding the architecture, the patterns, and the domain language before touching anything.

Purpose

Unlike oss-prep-to-contribute, which is issue-specific and focused on one code path, this skill builds a broad understanding of a repo. It is useful when a contributor wants to become a regular contributor rather than make a single drive-by PR, and for GSoC candidates who need to demonstrate deep project understanding in their proposals.

Prerequisites

A repo cloned locally
gh CLI authenticated
A reason to explore (casual learning, planning to contribute, GSoC proposal, evaluating the project)

Process

1. Understand the contributor's goal

Before exploring anything, ask:

"Why are you exploring this repo? (Casual learning / planning to contribute regularly / GSoC proposal / evaluating whether to use it)"
"How much time do you want to spend? (Quick overview / deep dive)"

This shapes the depth. A GSoC candidate needs deep understanding. Someone evaluating a library needs a quick architecture scan.

2. Map the project from the outside in

Start with what the project DOES, not what the code looks like. Read:

# Project identity
cat README.md
cat docs/index.md 2>/dev/null || cat docs/README.md 2>/dev/null

# What problem does it solve?
gh api repos/{owner}/{repo} --jq '{description, homepage, topics, language, stargazers_count, open_issues_count}'

The user should be able to explain what the project does to a non-technical person before reading a single source file.

Thinking gate:

"Explain what this project does in one sentence. Who uses it? What problem does it solve? Don't use the README's words: rephrase it as if you're explaining to a friend who doesn't code."

If the user can't do this clearly, they need to read more docs before touching code.

3. Understand the architecture

Use Explore agents to map:

# Directory layout
ls -la
ls src/ lib/ app/ 2>/dev/null
ls -la */

# Entry points
cat package.json 2>/dev/null | jq '.main, .bin, .scripts'
cat setup.py 2>/dev/null || cat pyproject.toml 2>/dev/null
cat Makefile 2>/dev/null | head -30
cat Cargo.toml 2>/dev/null | head -30

# Key abstractions
grep -rn "class \|interface \|trait \|type \|struct " src/ lib/ \
  --include="*.ts" --include="*.py" --include="*.go" --include="*.rs" --include="*.java" 2>/dev/null | head -40

Present a structured architecture summary:

Entry points and their flow
Module boundaries (what talks to what)
Key abstractions (interfaces, base classes, core types)
Data flow: how information moves through the system
Build system and dependency structure
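As a first approximation of module boundaries, it can help to see where the source files actually live. The sketch below is one possible way to do that; `dir_sizes` is a hypothetical helper name, and the extension list and excluded directories are assumptions to adjust per project.

```shell
# Hypothetical helper: count source files per top-level directory.
# Usage: dir_sizes [root]   (root defaults to the current directory)
dir_sizes() {
  root="${1:-.}"
  (
    cd "$root" || exit 1
    # List source files, skipping vendored and VCS directories (assumed names).
    find . -type f \
      \( -name '*.ts' -o -name '*.py' -o -name '*.go' -o -name '*.rs' -o -name '*.java' \) \
      ! -path '*/node_modules/*' ! -path '*/.git/*' \
    | awk -F/ 'NF > 2 { count[$2]++ } END { for (d in count) print count[d], d }' \
    | sort -rn
  )
}
```

Paths look like `./src/a.ts`, so the second `/`-separated field is the top-level directory; files sitting directly in the root are skipped. Large directories are usually a reasonable place to start reading, though file count alone says nothing about what talks to what.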

4. Learn the domain language

Every codebase has its own vocabulary. Find the terms that appear everywhere:

# Domain-specific terms in variable/function/class names
grep -rn "class \|def \|function \|fn \|func " src/ lib/ --include="*.ts" --include="*.py" --include="*.go" --include="*.rs" 2>/dev/null | \
  grep -oE '(class|def|function|fn|func)[[:space:]]+[[:alnum:]_]+' | sort | uniq -c | sort -rn | head -20

# Comments that define domain concepts
grep -rn "// \|# \|/// \|/\*\*" src/ lib/ --include="*.ts" --include="*.py" --include="*.go" --include="*.rs" 2>/dev/null | grep -i "represents\|defines\|a .* is\|means" | head -15

# Glossary in docs (if it exists)
find docs/ -name "*glossary*" -o -name "*terminology*" -o -name "*concepts*" 2>/dev/null

Present terms the contributor must understand to read the code fluently. Group by importance: which terms appear in nearly every file vs which are module-specific.
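One rough way to separate repo-wide terms from module-specific ones is to count how many files mention each term. The sketch below assumes a hypothetical helper `term_spread`; the terms passed to it are whatever candidates came out of the searches above.

```shell
# Hypothetical helper: for each term, print "<number of files containing it> <term>",
# sorted so the most widespread terms come first.
# Usage: term_spread <dir> <term>...
term_spread() {
  dir="$1"
  shift
  for term in "$@"; do
    # -r recurse, -l list matching files only, -w match whole words
    n=$(grep -rlw -- "$term" "$dir" 2>/dev/null | wc -l | tr -d ' ')
    echo "$n $term"
  done | sort -rn
}
```

A call such as `term_spread src/ Session Ledger` (the two terms are placeholders) prints one line per term. Terms with a high file count are probably core vocabulary; terms confined to a handful of files are more likely module-specific.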

Thinking gate:

"Pick 3 domain terms from the list above. Define each in your own words. Then find one place in the codebase where each is used. (This checks whether you can read the code, not just the summary I gave you.)"

5. Identify patterns and conventions

What patterns does this codebase follow? Investigate:

# Error handling approach
grep -rn "try\|catch\|except\|Error\|Result\|unwrap\|panic" src/ lib/ --include="*.ts" --include="*.py" --include="*.go" --include="*.rs" 2>/dev/null | head -20

# Testing patterns
ls test/ tests/ __tests__/ spec/ 2>/dev/null
cat test/*.{ts,py,go,rs} 2>/dev/null | head -40

# How new features get added - look at recent PRs
gh pr list -R {owner}/{repo} --state merged --limit 5 \
  --json title,changedFiles,additions,deletions \
  --jq '.[] | {title, files: .changedFiles, adds: .additions, dels: .deletions}'

Identify:

Error handling approach (exceptions? Result types? error codes?)
Testing patterns (unit vs integration vs e2e, mocking strategy, test file naming)
Dependency injection or service registration
Configuration management
Logging conventions
How new features get added (is there a pattern to follow?)
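Test file naming is one convention that can be checked mechanically. The sketch below tallies a few common naming schemes; `test_naming` is a hypothetical helper, and the three patterns it recognizes are assumptions that will not cover every project.

```shell
# Hypothetical helper: tally test files by naming convention.
# Usage: test_naming [root]   (root defaults to the current directory)
test_naming() {
  find "${1:-.}" -type f ! -path '*/node_modules/*' ! -path '*/.git/*' \
  | awk -F/ '
      { name = $NF }                                   # file name without its directory
      name ~ /^test_.*\.py$/           { c["test_*.py"]++ }
      name ~ /_test\.(go|py|rs)$/      { c["*_test.*"]++ }
      name ~ /\.(test|spec)\.[jt]sx?$/ { c["*.test.* or *.spec.*"]++ }
      END { for (k in c) print c[k], k }' \
  | sort -rn
}
```

Whichever scheme dominates the output is most likely the one a new test file should follow. A file can match more than one pattern (for example `test_foo_test.py`), in which case it is counted under each.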

Thinking gate:

"If you were adding a new feature to this repo, describe the steps: which files would you create, what patterns would you follow, where would you add tests? Don't worry about getting it right. I'll tell you what you missed."

Review the user's answer. Point out conventions they missed without giving the full answer.

6. Read recent history

What's actively being worked on?

# Recent merged PRs - what areas are changing?
gh pr list -R {owner}/{repo} --state merged --limit 10 \
  --json title,mergedAt,changedFiles --jq '.[] | {title, merged: .mergedAt, files: .changedFiles}'

# Open issues with most activity
gh issue list -R {owner}/{repo} --state open --search "sort:comments-desc" --limit 10 \
  --json number,title,comments --jq '.[] | {number, title, comments: (.comments | length)}'

# Recent releases
gh release list -R {owner}/{repo} --limit 5

# Changelog
cat CHANGELOG.md 2>/dev/null | head -50

Present:

Areas actively being developed vs areas in maintenance mode
Topics generating the most discussion
Release cadence (weekly? monthly? sporadic?)
Where a new contributor's effort would be most valued
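Local git history offers another angle on which areas are active versus in maintenance mode. The sketch below counts changed-file paths per top-level directory; `hot_dirs` is a hypothetical helper that reads paths on stdin, and the three-month window in the usage comment is an arbitrary assumption.

```shell
# Hypothetical helper: read file paths on stdin (one per line) and count them
# per top-level directory. Blank lines and root-level files are ignored.
hot_dirs() {
  awk -F/ 'NF > 1 { count[$1]++ } END { for (d in count) print count[d], d }' | sort -rn
}

# Usage, from inside a clone:
#   git log --since="3 months ago" --name-only --pretty=format: | hot_dirs | head -10
```

Directories that lead the list are where recent commits concentrate; directories missing from it have seen no changes in the chosen window. The count is per file touched per commit, so a few large refactors can skew it.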

7. Identify knowledge gaps

Based on the exploration, what does the contributor still not understand?

List areas that were unclear during steps 3-6
Point to specific files or modules that need deeper reading
Suggest which area to explore next based on their goal (step 1)

If the repo uses technologies the user isn't familiar with, suggest → oss-learn-stack.

8. Create a personal map

The user writes their own architecture summary: not a copy of the LLM's summary from step 3, but their own version in their own words.

Thinking gate:

"Write a 5-10 line summary of this repo's architecture. Include:

What it does (one sentence)
How it's structured (main modules and their roles)
The main patterns (error handling, testing, config)
One thing that surprised you

This is YOUR mental model. It doesn't need to be perfect; it needs to be yours."

Review their summary. Flag anything incorrect but don't rewrite it.

Related Skills

Next step (found an issue): → oss-find-issue: find an issue that matches your new understanding
Next step (find your own): → oss-find-real-issues: use your understanding to find real code problems
If tech gaps surfaced: → oss-learn-stack: learn unfamiliar technologies from the repo itself
Issue-specific prep: → oss-prep-to-contribute: once you have an issue, prepare specifically for it

Anti-patterns

DO NOT dump the entire codebase structure: guide the user through it layer by layer
DO NOT skip the domain language step: code fluency requires vocabulary
DO NOT treat this as a replacement for reading code: the user must read actual files, not just summaries
DO NOT confuse this with oss-prep-to-contribute: this is general exploration, not issue-specific preparation
DO NOT rush through thinking gates: the user's ability to explain the architecture in their own words IS the outcome
