
repo-estimator

Development

Analyze a code repository and produce a detailed human time and cost estimate for building it from scratch. Use this skill whenever the user wants to know how long it would take a human developer or team to build a codebase, how much it would cost to recreate a project, what the engineering effort of a repo is, or wants a "what's this worth?" / "how much work went into this?" type of analysis. Als

Author: teterouge · License: NOASSERTION

Repo Estimator

This skill produces a thorough, defensible estimate of the human time and cost required to build a codebase from scratch. It's designed for founders evaluating acquisitions, engineers scoping migrations, freelancers pricing projects, or anyone who wants to understand the real effort baked into a repository.

The output should feel like something a senior engineering consultant would hand over — not a naive line-count calculation, but a judgment-call-rich analysis that accounts for complexity, architecture, rework cycles, and team dynamics.

What You're Estimating

You're answering: "If a competent team started from zero today, how long would it take and what would it cost to build what's in this repo?"

This is subtly different from:

  • "How long did it take the original team?" (You don't know their pace, mistakes, or false starts)
  • "How many lines of code are there?" (LOC is a famously poor proxy for effort)
  • "What's the maintenance cost?" (That's a different question — don't conflate it unless asked)

Assume the hypothetical team is competent but unfamiliar with the domain. They need to design, build, test, and document — not just type.


Step 1: Understand the Request

Before diving in, understand what the user actually needs:

  • Target: Is this a local path, a GitHub URL, or a zip? Handle accordingly.
  • Purpose: Are they buying/selling, scoping a rewrite, hiring, or just curious? This shapes how you frame the output.
  • Team assumption: Should you estimate for a solo developer, a small startup team (2–4), or a mid-size engineering org? Ask if unclear — it affects hours significantly.
  • Rate card: Do they want costs in USD? A specific region or seniority mix? Default to US market rates if unspecified (see references/rate-cards.md).

If the user provides a GitHub URL, use shell tools to clone it to a temp directory. If they provide a local path, work from there directly.
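The target handling above can be sketched in Python as follows. This is a minimal illustration, not the skill's own code: the function names (`is_remote_url`, `clone_command`, `resolve_repo`) and the temp-directory prefix are hypothetical, and it assumes `git` is available on the PATH. It does a full clone rather than a shallow one, on the assumption that commit history is useful evidence later.

```python
import subprocess
import tempfile
from pathlib import Path


def is_remote_url(target: str) -> bool:
    """Treat anything that looks like a Git remote as something to clone."""
    return target.startswith(("https://", "http://", "ssh://", "git@"))


def clone_command(url: str, dest: Path) -> list[str]:
    """Build the git invocation; full clone so history stays available."""
    return ["git", "clone", url, str(dest)]


def resolve_repo(target: str) -> Path:
    """Return a local directory holding the repo, cloning it first if needed."""
    if is_remote_url(target):
        # Hypothetical prefix; any fresh temp directory works.
        dest = Path(tempfile.mkdtemp(prefix="repo-estimator-")) / "repo"
        subprocess.run(clone_command(target, dest), check=True)
        return dest
    path = Path(target).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"not a directory: {path}")
    return path
```

Zip archives are not handled here; unpacking one into a temp directory with the standard `zipfile` module would follow the same shape.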


Step 2: Repository Reconnaissance

Run both analysis scripts to gather hard data before making any judgment calls:

# Source code analysis
python scripts/analyze_repo.py <repo_path>

# Log, validation, and artifact analysis
python scripts/scan_logs_and_validation.py <repo_path>

The first script covers source code composition and complexity signals. The second surfaces evidence of effort that lives outside the code itself: test output, compliance documents, migration histories, CI/CD configs, changelogs, ADRs, and more. Both are needed for a complete picture.

Also do a manual walkthrough of the repo structure. The scripts catch what they can measure; you catch what requires judgment:

  • Read the README, any architecture docs, and top-level config files
  • Note the overall architecture pattern (monolith, microservices, monorepo, etc.)
  • Identify the primary language(s) and frameworks
  • Look for evidence of complexity: auth systems, payment integrations, real-time features, ML pipelines, complex state management, multi-tenancy, etc.
  • Check test coverage and quality — well-tested code represents more total work than the implementation alone
  • Note infrastructure-as-code, CI/CD pipelines, and DevOps configuration
  • Identify any non-obvious work: data migrations, seed scripts, custom tooling, generated code
  • Note whether this appears to be a product run by a human team (multiple contributors, release history, PM artifacts) vs. a solo side project
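Part of the manual walkthrough can be primed with a quick marker scan. The sketch below is an assumption-laden helper, not part of the skill's scripts: the `MARKERS` table and `scan_markers` function are hypothetical, and the file names are only common conventions. It checks top-level paths only, so treat a miss as "look by hand", not as "absent".

```python
from pathlib import Path

# Hypothetical marker list: common conventions, not the skill's own rules.
MARKERS = {
    "ci_cd": [".github/workflows", ".gitlab-ci.yml", "Jenkinsfile"],
    "containers": ["Dockerfile", "docker-compose.yml"],
    "tests": ["tests", "test", "spec"],
    "migrations": ["migrations", "alembic"],
    "docs": ["README.md", "docs"],
}


def scan_markers(repo: Path) -> dict[str, list[str]]:
    """Return, per category, the top-level markers that exist in the repo."""
    found: dict[str, list[str]] = {}
    for category, names in MARKERS.items():
        hits = [name for name in names if (repo / name).exists()]
        if hits:
            found[category] = hits
    return found
```

Calling `scan_markers(Path(repo_path))` gives a short checklist of where to read first; it does not replace reading the README and architecture docs.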

Step 3: Complexity Classification + Rebuild Difficulty Rating

This step produces two distinct outputs that serve different purposes. Do both before moving on.

3a. Component Tier Classification

Use the complexity taxonomy in references/complexity-guide.md to classify each major component of the codebase.

Every component falls into one of four tiers:

| Tier | Label | Description |
| --- | --- | --- |
| 1 | Boilerplate | Standard scaffolding, CRUD, config files, generated code |
| 2 | Moderate | Custom business logic, non-trivial integrations, standard auth |
| 3 | Complex | Custom algorithms, real-time systems, complex state, multi-service orchestration |
| 4 | Specialized | ML/AI pipelines, custom protocols, novel architecture, research-grade work |

Be honest about tier assignment. The biggest estimation errors come from miscategorizing Tier 3 work as Tier 2. When in doubt, round up.

3b. Rebuild Difficulty Rating

This is not the same as component complexity. Rebuild difficulty answers: How hard would it be for a competent team to reconstruct the knowledge encoded in this repo from scratch?

LOC and tier classifications measure volume and technical sophistication. Rebuild difficulty measures knowledge density — the specialized understanding, institutional context, and hard-won production experience that can't be acquired by reading the code alone.

Read references/rebuild-difficulty.md for the full scoring model. The analyze_repo.py script outputs a rebuild_difficulty block in its JSON — use that as a starting point, but apply your own judgment based on what you observed in the manual walkthrough.

Score the repository across five dimensions:

| Dimension | What It Measures | Source |
| --- | --- | --- |
| Domain Knowledge | Specialized domains (fintech, healthcare, crypto, compilers, etc.) | Script + manual review |
| Infrastructure Coupling | Depth of infra-as-code, k8s, Terraform, GitOps | Script + key files |
| Data Model Complexity | Tables, migrations, schema evolution depth | Script counts + migrations dir |
| Integration Surface Area | External API count, enterprise API weight | Script + package files |
| Operational Maturity | SLOs, runbooks, load tests, chaos engineering | Script + scan_logs output |

Compute the composite score and assign a rating:

| Score | Rating | Effort Multiplier |
| --- | --- | --- |
| 0–2 | LOW | 1.0× |
| 3–4 | MODERATE | 1.1–1.2× |
| 5–7 | HIGH | 1.25–1.45× |
| 8–10 | VERY HIGH | 1.5–1.7× |
| 11–14 | EXTREME | 1.9–2.5× |
| 15+ | EXCEPTIONAL | 2.5–4.0× |

This multiplier is applied at the end of Step 9, after all other multipliers, as a final adjustment to total hours. It represents the knowledge ramp-up cost that component-level estimation systematically misses.

Always be specific about what drives the rating. "VERY HIGH (score: 9) — fintech payment domain (+2), Kubernetes+Terraform (+2), 67-table data model with 182 migrations (+3), 14 external integrations (+2)" is a defensible finding. "VERY HIGH" alone is not.
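The score-to-rating lookup and the "name your drivers" rule can be sketched together. The bands below are copied from the rating table; everything else is illustrative. In particular, `rate` and `describe` are hypothetical names, and treating the composite score as a plain sum of driver points is an assumption that matches the worked example (2 + 2 + 3 + 2 = 9) but is not confirmed by references/rebuild-difficulty.md.

```python
# Bands from the rating table: (low, high, rating, multiplier range).
RATING_BANDS = [
    (0, 2, "LOW", (1.0, 1.0)),
    (3, 4, "MODERATE", (1.1, 1.2)),
    (5, 7, "HIGH", (1.25, 1.45)),
    (8, 10, "VERY HIGH", (1.5, 1.7)),
    (11, 14, "EXTREME", (1.9, 2.5)),
    (15, None, "EXCEPTIONAL", (2.5, 4.0)),
]


def rate(score: int) -> tuple[str, tuple[float, float]]:
    """Map an integer composite score to its rating and multiplier range."""
    if score < 0:
        raise ValueError("score must be non-negative")
    for low, high, rating, multiplier in RATING_BANDS:
        if score >= low and (high is None or score <= high):
            return rating, multiplier
    raise ValueError(f"no band for score {score}")


def describe(drivers: dict[str, int]) -> str:
    """Format a finding that names each driver and the points it adds."""
    score = sum(drivers.values())  # assumption: composite = plain sum
    rating, _ = rate(score)
    parts = ", ".join(f"{name} (+{pts})" for name, pts in drivers.items() if pts)
    return f"{rating} (score: {score}): {parts}"
```

Passing the four drivers from the example above to `describe` yields a finding in the same shape as the quoted one. Picking a point inside the multiplier range remains a judgment call.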


Step 4: Component Breakdown

Decompose the codebase into logical components. Good components are things a project manager would actually track — not individual files, not vague categories like "backend."

Examples of good component granularity:

  • User authentication & authorization system
  • Payment processing integration (Stripe, etc.)
  • Admin dashboard UI
  • REST API layer
  • Real-time notification system
  • Data pipeline / ETL jobs
  • Infrastructure & deployment configuration
  • Test suite
  • Documentation

For each component, estimate:

  • Tier: 1–4 (from above)
  • Raw hours: Core implementation time for a competent solo developer
  • Complexity multiplier: From references/complexity-guide.md
  • Adjusted hours: Raw × multiplier

Don't round aggressively. "80 hours" feels more credible than "80–120 hours" for a component you can actually analyze.
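The per-component fields map naturally onto a small record. The sketch below is illustrative only: the `Component` class is hypothetical, and the hours and multipliers are made-up placeholders, since the real multipliers live in references/complexity-guide.md.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    tier: int          # 1-4, from the tier table
    raw_hours: float   # core implementation time for a competent solo developer
    multiplier: float  # placeholder; real values come from the complexity guide

    def __post_init__(self) -> None:
        if self.tier not in (1, 2, 3, 4):
            raise ValueError(f"tier must be 1-4, got {self.tier}")

    @property
    def adjusted_hours(self) -> float:
        """Adjusted hours = raw hours x complexity multiplier."""
        return self.raw_hours * self.multiplier


# Illustrative numbers only, not guidance.
components = [
    Component("User authentication & authorization", tier=2, raw_hours=60, multiplier=1.25),
    Component("REST API layer", tier=1, raw_hours=40, multiplier=1.0),
    Component("Real-time notification system", tier=3, raw_hours=80, multiplier=1.5),
]

total_adjusted = sum(c.adjusted_hours for c in components)
```

Keeping raw hours and the multiplier as separate fields makes each adjustment auditable when the estimate is challenged.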


Step 5: Apply Estimation Multipliers

Raw component hours are never the full story. Apply these multipliers to the total adjusted hours:

Rework & iteration factor: 1.3–1.6×. Real development isn't linear: designs change, bugs surface, PR reviews send work back, and architecture decisions get revisited. Use 1.3× for simple projects, 1.6× for complex or novel ones.

Testing & QA factor: depends on test coverage observed

  • No tests: add 0% (but note it in caveats — the hours are "artificially low")
  • Light tests: add 15%
  • Thorough unit tests: add 25%
  • Full test suite with integration/e2e:
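The rework factor and the testing uplifts listed so far can be sketched as one function. Two assumptions are baked in and should be read as such: the factors are compounded multiplicatively (the text does not say whether they compound or add), and the full-suite uplift is left for the caller to supply because its value is cut off above. `TESTING_UPLIFT` and `apply_factors` are hypothetical names.

```python
# Uplifts stated in the list above. The full-suite value is truncated in the
# source, so it is deliberately not included here.
TESTING_UPLIFT = {"none": 0.00, "light": 0.15, "thorough_unit": 0.25}


def apply_factors(adjusted_hours: float, rework: float, testing_uplift: float) -> float:
    """Apply the rework factor and a testing uplift to total adjusted hours.

    Assumption: the two factors compound multiplicatively.
    """
    if not 1.3 <= rework <= 1.6:
        raise ValueError("rework factor is expected to be between 1.3 and 1.6")
    if testing_uplift < 0:
        raise ValueError("testing uplift cannot be negative")
    return adjusted_hours * rework * (1 + testing_uplift)


# Example: 200 adjusted hours, a simple project, light tests.
hours = apply_factors(200, rework=1.3, testing_uplift=TESTING_UPLIFT["light"])
```

When the repo has no tests, the uplift is zero, and the caveats should say the hours are artificially low.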

How to add

/plugin marketplace add teterouge/codeworth

The exact command may vary by repository. Check the README on GitHub.
