Purpose

Guide product managers through diagnosing whether they're doing context stuffing (jamming volume without intent) or context engineering (shaping structure for attention). Use this to identify context boundaries, fix "Context Hoarding Disorder," and implement tactical practices like bounded domains, episodic retrieval, and the Research→Plan→Reset→Implement cycle.

Key Distinction: Context stuffing assumes volume = quality ("paste the entire PRD"). Context engineering treats AI attention as a scarce resource and allocates it deliberately.

This is not about prompt writing—it's about designing the information architecture that grounds AI in reality without overwhelming it with noise.

Key Concepts

The Paradigm Shift: Parametric → Contextual Intelligence

The Fundamental Problem:

LLMs rely on parametric knowledge (encoded during training), which is static, often outdated, and non-attributable
When asked about proprietary data, real-time information, or user preferences, they are forced to hallucinate or admit ignorance
Context engineering bridges the gap between static training and dynamic reality

PM's Role Shift: From feature builder → architect of informational ecosystems that ground AI in reality

Context Stuffing vs. Context Engineering

Dimension	Context Stuffing	Context Engineering
Mindset	Volume = quality	Structure = quality
Approach	"Add everything just in case"	"What decision am I making?"
Persistence	Persist all context	Retrieve with intent
Agent Chains	Share everything between agents	Bounded context per agent
Failure Response	Retry until it works	Fix the structure
Economic Model	Context as storage	Context as attention (scarce resource)

Critical Metaphor: Context stuffing is like bringing your entire file cabinet to a meeting. Context engineering is bringing only the 3 documents relevant to today's decision.
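
The "Agent Chains" row of the table can be sketched in code. This is a minimal illustration, assuming shared state is a plain dictionary and each agent declares up front which keys it may read; `AgentSpec`, `bounded_handoff`, and the sample keys are hypothetical names, not part of any framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical agent definition: a name plus the only context keys it may read."""
    name: str
    allowed_keys: frozenset[str]


def bounded_handoff(shared_state: dict[str, str], agent: AgentSpec) -> dict[str, str]:
    """Return only the slice of shared state inside the agent's context boundary."""
    return {k: v for k, v in shared_state.items() if k in agent.allowed_keys}


# Illustrative shared state; the contents are made up.
shared_state = {
    "interview_notes": "12 customer calls, raw transcripts",
    "research_summary": "Top pain point: onboarding takes too long",
    "pricing_history": "2019-2024 price changes",
    "prd_draft": "Goal: cut onboarding time in half",
}

prd_writer = AgentSpec("prd_writer", frozenset({"research_summary", "prd_draft"}))
print(bounded_handoff(shared_state, prd_writer))
```

The PRD writer receives two of the four entries. The raw transcripts and pricing history stay out of its window unless someone deliberately widens the boundary.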

The Anti-Pattern: Context Stuffing

Five Markers of Context Stuffing:

Reflexively expanding context windows — "Just add more tokens!"
Persisting everything "just in case" — No clear retention criteria
Chaining agents without boundaries — Agent A passes everything to Agent B to Agent C
Adding evaluations to mask inconsistency — "We'll just retry until it's right"
Normalized retries — "It works if you run it 3 times" becomes acceptable

Why It Fails:

Reasoning Noise: Thousands of irrelevant files compete for attention, degrading multi-hop logic
Context Rot: Dead ends, past errors, irrelevant data accumulate → goal drift
Lost in the Middle: Models attend most to the beginning (primacy) and end (recency) of the context and tend to overlook material in the middle
Economic Waste: Every query becomes expensive without accuracy gains
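Quantitative Degradation: Accuracy can fall sharply on long inputs; the figure cited here is below 20% once context exceeds ~32k tokens, though the exact threshold depends on the model and the task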

The Hidden Costs:

Escalating token consumption
Diluted attention across irrelevant material
Reduced output confidence
Cascading retries that waste time and money

Real Context Engineering: Core Principles

Five Foundational Principles:

Context without shape becomes noise
Structure > Volume
Retrieve with intent, not completeness
Small working contexts (like short-term memory)
Context Compaction: Maximize density of relevant information per token
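
"Retrieve with intent" and "small working contexts" can be sketched as a budgeted selection step. This is a toy example under stated assumptions: token counts are approximated by whitespace-separated words, and relevance is plain keyword overlap standing in for a real retriever. All function names are hypothetical.

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate: whitespace-separated words (an assumption, not a real tokenizer)."""
    return len(text.split())


def relevance(query: str, chunk: str) -> float:
    """Toy relevance: share of query words that appear in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & chunk_words) / len(query_words)


def build_working_context(query: str, chunks: list[str], budget: int) -> list[str]:
    """Pick the most relevant chunks that fit the token budget; skip irrelevant ones."""
    ranked = sorted(chunks, key=lambda c: relevance(query, c), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        cost = approx_tokens(chunk)
        if relevance(query, chunk) == 0.0 or used + cost > budget:
            continue
        selected.append(chunk)
        used += cost
    return selected


# Illustrative chunks; the contents are made up.
chunks = [
    "Pricing page churn rose after the annual plan change",
    "Office seating chart for the third floor",
    "Churn interviews point to confusion about annual pricing",
]
context = build_working_context("why did churn rise after pricing change", chunks, budget=20)
print(context)
```

The seating chart never enters the working context because it shares no words with the question, and the budget caps how much of the relevant material is admitted.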

Quantitative Framework:

Efficiency = (Accuracy × Coherence) / (Tokens × Latency)

Key Finding: Using RAG with 25% of available tokens preserves 95% accuracy while significantly reducing latency and cost.
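
The efficiency ratio can be computed directly to compare two context strategies. A minimal sketch, assuming accuracy and coherence are scores between 0 and 1, tokens are counted per query, and latency is in seconds. The sample numbers are invented for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextRun:
    """One measured run of a context strategy (all fields are hypothetical inputs)."""
    accuracy: float    # fraction of correct answers, 0.0 to 1.0
    coherence: float   # rubric score, 0.0 to 1.0
    tokens: int        # tokens sent per query
    latency_s: float   # seconds per query


def efficiency(run: ContextRun) -> float:
    """Efficiency = (Accuracy x Coherence) / (Tokens x Latency)."""
    if run.tokens <= 0 or run.latency_s <= 0:
        raise ValueError("tokens and latency must be positive")
    return (run.accuracy * run.coherence) / (run.tokens * run.latency_s)


# Illustrative numbers only, not benchmark results.
stuffed = ContextRun(accuracy=0.80, coherence=0.70, tokens=100_000, latency_s=20.0)
engineered = ContextRun(accuracy=0.85, coherence=0.90, tokens=8_000, latency_s=4.0)

ratio = efficiency(engineered) / efficiency(stuffed)
print(f"engineered context is {ratio:.0f}x more efficient")
```

Because tokens and latency sit in the denominator, modest accuracy gains combined with a much smaller context dominate the ratio. The absolute value means little on its own; the metric is useful for comparing strategies on the same task.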

The 5 Diagnostic Questions (Detect Context Hoarding Disorder)

Ask these to identify context stuffing:

What specific decision does this support? — If you can't answer, you don't need it
Can retrieval replace persistence? — Just-in-time beats always-available
Who owns the context boundary? — If no one, it'll grow forever
What fails if we exclude this? — If nothing breaks, delete it
Are we fixing structure or avoiding it? — Stuffing context often masks bad information architecture
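
The first four questions can be turned into a simple audit over persisted context items. This is a sketch, assuming each item is annotated by hand; `ContextItem`, `recommend`, and the rule order are illustrative choices. The fifth question is a judgment call and is not encoded here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContextItem:
    """One piece of persisted context plus answers to the diagnostic questions."""
    name: str
    decision_supported: Optional[str]   # Q1: what decision does this support?
    retrievable_on_demand: bool         # Q2: can retrieval replace persistence?
    owner: Optional[str]                # Q3: who owns the boundary?
    breaks_if_excluded: bool            # Q4: what fails if we exclude this?


def recommend(item: ContextItem) -> str:
    """Map the answers to an action; the rule order is an illustrative choice."""
    if item.decision_supported is None or not item.breaks_if_excluded:
        return "delete"
    if item.retrievable_on_demand:
        return "retrieve on demand"
    if item.owner is None:
        return "assign an owner"
    return "keep"


# Hypothetical inventory of context items.
items = [
    ContextItem("2021 roadmap deck", None, True, None, False),
    ContextItem("full PRD archive", "scope next release", True, "PM lead", True),
    ContextItem("brand voice rules", "draft release notes", False, None, True),
    ContextItem("current sprint goals", "prioritize backlog", False, "PM lead", True),
]
for item in items:
    print(f"{item.name}: {recommend(item)}")
```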

Memory Architecture: Two-Layer System

Short-Term (Conversational) Memory:

Immediate interaction history for follow-up questions
Challenge: Space management → older parts summarized or truncated
Lifespan: Single session

Long-Term (Persistent) Memory:

User preferences, key facts across sessions → deep personalization
Implemented via vector database (semantic retrieval)
Two types:
- Declarative Memory: Facts ("I'm vegan")
- Procedural Memory: Behavioral patterns ("I debug by checking logs first")
Lifespan: Persistent across sessions

LLM-Powered ETL: Models generate their own memories by identifying signals, consolidating them with existing data, and updating the database automatically.
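
The two layers can be sketched as two small data structures. This is a minimal illustration with several assumptions: the short-term window drops the oldest turns instead of summarizing them, and keyword matching stands in for semantic retrieval from a vector database. Class and method names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ShortTermMemory:
    """Session-scoped history; the oldest turns fall off when the window is full."""
    max_turns: int = 4
    turns: deque = field(default_factory=deque)

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        while len(self.turns) > self.max_turns:
            self.turns.popleft()  # a real system might summarize instead of dropping


@dataclass
class LongTermMemory:
    """Persistent facts, split into declarative and procedural entries."""
    declarative: list[str] = field(default_factory=list)
    procedural: list[str] = field(default_factory=list)

    def recall(self, query: str) -> list[str]:
        """Toy keyword recall standing in for semantic search over a vector database."""
        words = set(query.lower().split())
        return [m for m in self.declarative + self.procedural
                if words & set(m.lower().split())]


short = ShortTermMemory(max_turns=2)
for turn in ["hi", "plan dinner", "something quick"]:
    short.add(turn)

long_term = LongTermMemory()
long_term.declarative.append("user is vegan")
long_term.procedural.append("user debugs by checking logs first")

print(list(short.turns))
print(long_term.recall("suggest a vegan dinner"))
```

Only the memory that matches the current question is pulled into the working context; the debugging habit stays in storage until a debugging question arrives.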

The Research → Plan → Reset → Implement Cycle

The Context Rot Solution:

Research: Agent gathers data → large, chaotic context window (noise + dead ends)
Plan: Agent synthesizes into high-density SPEC.md or PLAN.md (Source of Truth)
Reset: Clear entire context window (prevents context rot)
Implement: Fresh session using only the high-density plan as context

Why This Works: Context rot is eliminated; agent starts clean with compressed, high-signal context.
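
The four steps can be sketched end to end. This is a toy example with no model calls: `research`, `plan`, and `implement` are hypothetical stand-ins, and the transcript lines are invented. The point it illustrates is the reset, where only the plan crosses into the implementation session.

```python
def research(question: str) -> list[str]:
    """Stand-in for an exploratory session; returns a noisy transcript with dead ends."""
    return [
        f"QUESTION: {question}",
        "DEAD END: tried the legacy export API, it is deprecated",
        "FINDING: the events table already stores signup timestamps",
        "DEAD END: misread the schema, ignore previous query",
        "FINDING: a weekly cohort view needs one new SQL view",
    ]


def plan(transcript: list[str]) -> str:
    """Compress the transcript into a dense plan: keep findings, drop dead ends."""
    findings = [line.removeprefix("FINDING: ") for line in transcript
                if line.startswith("FINDING: ")]
    return "PLAN\n" + "\n".join(f"- {item}" for item in findings)


def implement(context: list[str]) -> str:
    """Stand-in for the implementation session; it sees only what it is handed."""
    return f"implementing with {len(context)} context item(s)"


transcript = research("How do we build a weekly cohort report?")
plan_text = plan(transcript)      # Plan: this is what would be saved as PLAN.md
fresh_context = [plan_text]       # Reset: the research transcript is not carried over
print(implement(fresh_context))   # Implement: fresh session, plan only
```

In practice the plan would be written by the model and reviewed by a person before the reset, since anything missing from it is lost to the implementation session.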

Anti-Patterns (What This Is NOT)

Not about choosing AI tools — Claude vs. ChatGPT doesn't matter; architecture matters
Not about writing better prompts — This is systems design, not copywriting
Not about adding more tokens — "Infinite context" narratives are marketing, not engineering reality
Not about replacing human judgment — Context engineering amplifies judgment, doesn't eliminate it

When to Use This Skill

✅ Use this when:

You're pasting entire PRDs/codebases into AI and getting vague responses
AI outputs are inconsistent ("works sometimes, not others")
You're burning tokens without seeing accuracy improvements
You suspect you're "context stuffing" but don't know how to fix it
You need to design context architecture for an AI product feature

❌ Don't use this when:

You're just getting started with AI (start with basic prompts first)
You're looking for tool recommendations (this is about architecture, not tooling)
Your AI usage is working well (if it ain't broke, don't fix it)

Facilitation Source of Truth

Use workshop-facilitation as the default interaction protocol for this skill.

It defines:

session heads-up + entry mode (Guided, Context dump, Best guess)
one-question turns with plain-language prompts
progress labels (for example, Context Qx/8 and Scoring Qx/5)
interruption handling and pause/resume behavior
numbered recommendations at decision points
quick-select numbered response options for regular questions (include Other (specify) when useful)

This file defines the domain-specific assessment content. If there is a conflict, follow this file's domain logic.

Application

This interactive skill uses adaptive questioning to diagnose context stuffing, identify boundaries, and provide tactical implementation guidance.

Step 0: Gather Context

Agent asks:

Before we diagnose your context practices, let's gather information:

Current AI Usage:

What AI tools/systems do you use? (ChatGPT, Claude, custom agents, etc.)
What PM tasks do you use AI for? (PRD writing, user research synthesis, discovery, etc.)
How do you provide context? (paste docs, reference files, use projects/memory)

Symptoms:

Are AI outputs inconsistent? (works sometimes, not others)
Are you retrying prompts multiple times to get good results?
Are responses vague or hedged despite providing "all the context"?
Are token costs escal
