Context Budget Analysis

Calibration: Tier 3, Opus-primary. See repository README for model compatibility.

Analyze Claude Project context budget health. Determine operating mode (full-context vs. retrieval). Produce tiered file placement recommendations with a strategy matched to the project's growth trajectory, work-phase timing, and content routing needs.

Important

Token estimates are inherently approximate. Always label token counts as "estimated" and note a ±15% margin. Never state a precise token count as fact. Use ls -la /mnt/project/ for byte counts and divide by 4 for rough token estimates.

Evaluate files objectively. Users may resist demoting files they authored or consider important. Score every file on the six dimensions regardless of stated preferences. If a file scores as Tier 3, say so clearly.

Do not assume full-context is always the goal. With a ~66,500 token ceiling, many legitimate projects operate in retrieval mode. Assess the user's workload pattern before prescribing optimization strategy.

Skills and MCPs cannot trigger RAG. Do not recommend reducing Skills or disconnecting MCPs to recover knowledge file headroom. The two budget pools are independent for threshold purposes.

Context quality degrades with total context size, not just at limits. Even with tokens remaining, every token of overhead reduces the model's attention budget. Minimizing overhead improves output quality on every turn.

Compression has a quality floor. Not all content is equally compressible. Classify content type before recommending compression depth. See references/compression-execution.md for the Content-Type Classification (Type A prose vs. Type B precision reference) and structured off-ramps when compression stalls.

Optimize for where the project is going, not just where it is. A plan that restores full-context mode for one session before the next file addition pushes it back into RAG is a cleanup, not a strategy. Always assess growth trajectory before producing an optimization plan.

Model requirements

This Skill performs per-file evaluation against the six File Evaluation Dimensions, growth trajectory analysis, content routing decisions, and phased optimization planning with compression safeguards. Opus is recommended, with effort set to high or xhigh when the deployment context allows it. On Opus at default Adaptive effort, per-file evaluation and compression-quality judgment may be abbreviated; set effort higher for intelligence-sensitive audits.

On non-Opus models (Sonnet 4.6, Haiku 4.5 with extended thinking enabled), expect compressed per-file evaluation, surface-level tier recommendations, and reduced synthesis across the growth trajectory. Quick Diagnostic mode degrades less than Full Budget Audit mode. The Skill will execute and produce correctly-shaped output; users should weight findings accordingly. Haiku without extended thinking is not a supported deployment target for this Skill.

Core Concepts

Operating Modes

Claude Projects operate in one of two modes:

Full-context mode: All knowledge files loaded into context every turn. Default when total knowledge file tokens are under the threshold.
Retrieval mode (RAG): Knowledge files searched via project_knowledge_search; only relevant chunks loaded per turn. Activates when knowledge files exceed the token threshold.

Detection: Check whether project_knowledge_search is present in available tools. If present, retrieval mode is active. The "Indexing" label in the project UI files panel is also a visible indicator.
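
The detection rule above can be sketched as a simple membership check. This is a minimal illustration, assuming the available tool names have already been collected into a list; the function name `detect_operating_mode` is hypothetical and not part of any platform API.

```python
def detect_operating_mode(available_tools):
    """Return 'retrieval' if project_knowledge_search is available, else 'full-context'.

    `available_tools` is assumed to be an iterable of tool-name strings;
    how that list is obtained is outside the scope of this sketch.
    """
    if "project_knowledge_search" in set(available_tools):
        return "retrieval"
    return "full-context"


if __name__ == "__main__":
    print(detect_operating_mode(["web_search", "project_knowledge_search"]))
    print(detect_operating_mode(["web_search"]))
```

The "Indexing" label in the UI is a visual cue only and is not modeled here.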

The Two-Pool Budget Architecture

Budget Pool	Allocation (200K window)	Contents	RAG Impact
Knowledge file budget	~66,500 tokens (~33%)	Knowledge files only (all types: .md, .pdf, .docx, .xlsx, .csv, .txt, images, GitHub repos)	Exceeding triggers RAG
Conversation budget	~133,500 tokens (~67%)	Platform overhead, Skills, MCPs, CI, Memory, preferences, conversation, responses	Cannot trigger RAG

Key implication: Adding Skills, connecting MCPs, or expanding Custom Instructions cannot trigger RAG mode. Conversely, reducing them cannot recover knowledge file headroom. The two pools are independent.
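
The independence of the two pools can be illustrated with a short sketch. It assumes the approximate figures from the table above; the function and parameter names are hypothetical. Conversation-pool overhead is accepted as input only to show that it plays no part in the threshold comparison.

```python
KNOWLEDGE_CEILING = 66_500      # approximate knowledge file budget (200K window)
CONVERSATION_BUDGET = 133_500   # approximate conversation budget (200K window)


def rag_triggered(knowledge_tokens, conversation_overhead_tokens=0):
    """Return True when estimated knowledge file tokens exceed the ceiling.

    `conversation_overhead_tokens` (Skills, MCPs, Custom Instructions, etc.)
    is deliberately ignored: it belongs to the other pool.
    """
    return knowledge_tokens > KNOWLEDGE_CEILING


if __name__ == "__main__":
    # Heavy Skills/MCP overhead does not change the outcome.
    print(rag_triggered(60_000, conversation_overhead_tokens=50_000))
    print(rag_triggered(70_000, conversation_overhead_tokens=0))
```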

Token Estimation

Estimate knowledge file tokens: run ls -la /mnt/project/ for byte sizes, divide each by 4, sum. Compare against the ~66,500 token threshold.

Conversion rates: English prose ≈ 4 characters/token. Structured content (XML, code, tables) ≈ 3.25 characters/token. Mixed markdown ≈ 3.75 characters/token. Estimates are ±15% per file, ±10% for project totals.
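
A minimal sketch of the estimation procedure, assuming byte size is a reasonable proxy for character count (true for mostly-ASCII text, less so otherwise). The helper names and the `kind` labels are hypothetical; the conversion rates, threshold, and ±10% project margin are the ones stated above.

```python
from pathlib import Path

# Characters per token by content type, as listed above.
CHARS_PER_TOKEN = {"prose": 4.0, "structured": 3.25, "mixed": 3.75}
THRESHOLD = 66_500  # approximate knowledge file ceiling


def estimate_tokens(size_bytes, kind="prose"):
    """Rough token estimate for one file (treat as +/-15%)."""
    return round(size_bytes / CHARS_PER_TOKEN[kind])


def estimate_project(sizes, margin=0.10):
    """Sum per-file estimates; `sizes` is an iterable of (size_bytes, kind)."""
    total = sum(estimate_tokens(b, k) for b, k in sizes)
    return {
        "estimated_tokens": total,
        "low": round(total * (1 - margin)),
        "high": round(total * (1 + margin)),
        "exceeds_threshold": total > THRESHOLD,
    }


def sizes_from_dir(directory, kind="prose"):
    """Collect (size_bytes, kind) for each file, assuming one kind for all."""
    return [(p.stat().st_size, kind) for p in Path(directory).iterdir() if p.is_file()]
```

For example, `estimate_project(sizes_from_dir("/mnt/project/"))` would apply the divide-by-4 rule to every uploaded file. Results should still be reported as estimated, with the margin noted.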

Important: Include ALL knowledge file sources: uploaded files (all types), GitHub-connected repositories, and any other connected content sources.

Threshold-Exempt Overhead

Skills, MCPs, and other non-knowledge-file components are threshold-exempt. They affect conversation runway and attention quality but cannot trigger RAG. Key overhead sources: platform system prompt (~20–25K tokens), Custom Instructions (bytes ÷ 4), Memory (typically 500–3K tokens).

MCP loading modes: MCP overhead varies dramatically by loading mode. Deferred/load-as-needed (the claude.ai default) adds ~40–60 tokens per connector via a lightweight catalog plus ~5–7K flat for the deferred infrastructure — ~85% less than always-loaded mode (~3K–15K per connector). When the loading mode is unknown, note the estimate as a range and flag the uncertainty. See references/compression-execution.md for detailed overhead estimation.
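
The per-mode figures above can be turned into a range estimate. This is a sketch under the stated approximations only; the function name and the `mode` labels are hypothetical, and the handling of zero connectors is an assumption.

```python
def mcp_overhead_range(connectors, mode=None):
    """Return an estimated (low, high) token overhead for MCP connectors.

    mode: "deferred", "always", or None when the loading mode is unknown.
    With an unknown mode, the widest plausible range is returned so the
    uncertainty can be flagged.
    """
    if connectors <= 0:
        return (0, 0)  # assumption: no connectors means no MCP overhead
    # Deferred: ~40-60 tokens per connector plus ~5-7K flat infrastructure.
    deferred = (connectors * 40 + 5_000, connectors * 60 + 7_000)
    # Always-loaded: ~3K-15K tokens per connector.
    always = (connectors * 3_000, connectors * 15_000)
    if mode == "deferred":
        return deferred
    if mode == "always":
        return always
    return (min(deferred[0], always[0]), max(deferred[1], always[1]))


if __name__ == "__main__":
    print(mcp_overhead_range(8, "deferred"))
    print(mcp_overhead_range(8, "always"))
    print(mcp_overhead_range(8))
```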

Estimated conversation runway (assumes deferred MCP loading): lean project ~105K–110K tokens, moderate ~95K–105K, typical (6–8 MCPs, 15+ skills) ~85K–100K, heavy (always-loaded MCPs) ~50K–80K.
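
As a rough cross-check on the runway figures, the conversation budget can be reduced by the overhead sources listed in this section. The sketch below assumes the ~133,500 token conversation budget and the stated ranges for the platform prompt and Memory. The function name and the `other` parameter (for Skills and MCP overhead supplied by the caller) are hypothetical.

```python
CONVERSATION_BUDGET = 133_500  # approximate, 200K window


def runway_range(custom_instructions_bytes=0,
                 platform=(20_000, 25_000),
                 memory=(500, 3_000),
                 other=(0, 0)):
    """Return an estimated (low, high) conversation runway in tokens.

    Each overhead source is a (low, high) token range; Custom Instructions
    are converted from bytes with the divide-by-4 rule.
    """
    ci_tokens = custom_instructions_bytes // 4
    overhead_low = platform[0] + memory[0] + other[0] + ci_tokens
    overhead_high = platform[1] + memory[1] + other[1] + ci_tokens
    # Highest overhead gives the lowest runway, and vice versa.
    return (CONVERSATION_BUDGET - overhead_high, CONVERSATION_BUDGET - overhead_low)


if __name__ == "__main__":
    print(runway_range())
    print(runway_range(custom_instructions_bytes=8_000, other=(5_320, 7_480)))
```

The output is an estimate built from estimates, so it is best reported as a range alongside the assumptions used.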

Context Window Sizes by Plan

Plan	Context Window	Knowledge File Ceiling
Pro / Max / Team	200K tokens	~66,500 tokens (empirically measured)
Enterprise	500K tokens	~165,000 tokens (estimated at 33%, untested)
API	1M tokens (GA for Opus 4.6, Sonnet 4.6)	User-controlled (no platform RAG)

Operating Tiers

Tier	Knowledge File Tokens	Status
1	Under ~30K	Comfortable headroom. Focus on structural quality.
2	~30K–50K	Moderate. Optimization beneficial but not urgent. Monitor growth.
3	~50K–66K	Approaching threshold. Proactive optimization recommended.
4	~66K–500K	Retrieval mode. Optimize for recovery (if near) or retrieval quality.
5	Over ~500K	Heavy retrieval. Optimize for retrieval quality. Surface API deployment for cross-document synthesis workloads.

File count is not a factor in RAG activation. Optimize file count for content organization and retrieval quality.
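
The tier table above can be expressed as a small lookup. This is a sketch only: the table's boundaries are approximate, so the exact cut-offs used here (including treating ~66,500 as the upper edge of Tier 3) are assumptions, and the function name is hypothetical.

```python
def operating_tier(knowledge_tokens):
    """Map an estimated knowledge file token count to an operating tier (1-5)."""
    if knowledge_tokens < 30_000:
        return 1  # comfortable headroom
    if knowledge_tokens < 50_000:
        return 2  # moderate; monitor growth
    if knowledge_tokens <= 66_500:
        return 3  # approaching threshold
    if knowledge_tokens <= 500_000:
        return 4  # retrieval mode
    return 5      # heavy retrieval


if __name__ == "__main__":
    for tokens in (10_000, 40_000, 60_000, 100_000, 600_000):
        print(tokens, operating_tier(tokens))
```

Because the input is itself an estimate with a margin, a count near a boundary is better reported as straddling two tiers than assigned firmly to one.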

Context Pressure vs. RAG Switching

These are two independent mechanisms. RAG switching is project-level and static (knowledge files exceed the threshold). Context pressure is conversation-level and dynamic (the conversation approaches the 200K limit). A project can experience both, either, or neither.

Diagnostic shortcuts:

"Claude forgot my instructions" → context pressure OR RAG (behavioral rules not retrieved).
"Claude can't find content in my files" → RAG (content not retrieved) OR context pressure (compacted).
"Responses getting shorter/weaker" → context pressure.

In full-context mode, compaction summarizes conversation history but leaves knowledge files intact. In retrieval mode, knowledge files load fresh each turn via search (not subject to compaction) but conversation history still compacts.

Placement Tiers

Files are classified into three tiers based on six evaluation dimensions (see references/evaluation-rubric.md for detailed scoring criteria):

Tier 1 — Must Keep: High cross-reference density, behavioral content, high query frequency, severe degradation if absent.
Tier 2 — Optimize or Relocate: Moderate scores. Candidates for compression, relocation, or restructuring.
**Tier 3 —
