AgentSkill Expertise - Agent Skill Knowledge System
Overview
The complete knowledge system for Agent Skills. Not a how-to guide (that's what official docs are for) — this focuses on three things: why Skills are designed the way they are, what good design looks like, and where the common pitfalls are.
Any work involving Skills — creating, adjusting, reviewing, splitting, merging — should use this as the decision framework.
Core Understanding: What a Skill Actually Is
Three Levels of Understanding
| Level | Understanding | What You Can Do | Common Issue |
|---|---|---|---|
| Surface | "Skills are a way to make AI remember instructions" | Write syntactically correct Skills | Treats Skills as fancy System Prompts |
| Mechanical | "Skills implement Progressive Disclosure" | Understands trigger mechanism and load flow | Technically correct but poorly designed |
| Essential | "Skills externalize knowledge into a format that AI can actively discover and load on demand" | Design Skills that actually work | — |
Key Insights
- Not a rule set: A single value covers a wide range of rules. Give principles, not rules.
- Not a prompt wrapper: Skills have a metadata pre-loading mechanism — they're in context before conversation starts. Fundamentally different from prompts.
- Not a tool feature: It's a standardized format for knowledge management. Write once, works across platforms, and the accumulated knowledge only grows in value.
- Official positioning: "Domain Expertise" — not instructions, not workflows, but knowledge.
Underlying Mechanisms
Progressive Disclosure
Level 1: metadata (name + description, ~100 tokens)
→ Present from session start, always in context
→ The only thing AI uses to decide whether to load a Skill
Level 2: SKILL.md full content (<5k tokens)
→ Loaded only when AI judges it relevant
Level 3+: references/, scripts/, assets/
→ Loaded only when more detail is needed
State Difference at Conversation Start
| Traditional Approach | Skill Architecture | |
|---|---|---|
| When conversation starts | AI starts with an empty slate | AI has already loaded all Skill metadata |
| What the user needs to do | Copy-paste every time / write "remember to read X" / stuff everything into CLAUDE.md | Nothing |
| Principle | Not mentioned = not known | Before the conversation starts, AI is already prepared |
This is the fundamental difference — not what happens during the conversation, but that the state is already different before it begins.
The Description Mechanism
- AI uses semantic matching to decide whether to load, not keyword comparison
- Description is the only content read at Level 1
- The system enforces a 15,000 character budget for the
<available_skills>block (shared across all Skills)
Known bias: AI will err on the side of not triggering rather than over-triggering. A conservative description = a Skill that effectively doesn't exist.
Anthropic skill-creator: "Currently Claude has a tendency to 'undertrigger' skills — to not use them when they'd be useful. To combat this, please make the skill descriptions a little bit 'pushy'."
Writing principles:
- ❌ Feature-oriented: "This is a writing skill" → Too vague, AI will skip it
- ✅ Situation-oriented: "Trigger when conclusion-first, scannable, professional-tone technical writing is needed" → Clear trigger condition
- ✅ Explicitly enumerate trigger scenarios: "even if they don't explicitly ask for X" → Reduces missed triggers
- ✅ Include update triggers: "AI should proactively update when: X happens → update Y" → AI knows when to iterate
Less Is More
"Find the smallest set of high-signal tokens that maximize likelihood of desired outcome." — Anthropic official
"The biggest performance gains didn't come from adding complex RAG pipelines. The gains came from removing things." — Manus team (100+ tools causes Context Confusion — AI hallucinates parameters or calls the wrong tool)
"Keep the prompt lean. Remove things that aren't pulling their weight." — Anthropic skill-creator
Conclusion: The design principle for Skills is lean, chunked, and loaded on demand. Every instruction should pull its weight — if it can't justify its existence, remove it.
Design Principles
Principle 1: Give Principles, Not Rules
| Approach | Example | Result |
|---|---|---|
| Rule | "Don't use words like 'explosive' or 'shocking'" | AI finds another way to write it; the anxiety-inducing tone remains |
| Principle | "I don't want the writing to feel anxious" | AI understands the why, automatically avoids all anxiety-inducing patterns |
| Deeper | "Be straightforward and factual" (underlying value) | One value covers a wide range of rules |
Anthropic skill-creator: "Try hard to explain the why behind everything you're asking the model to do. Today's LLMs are smart. They have good theory of mind and when given a good harness can go beyond rote instructions and really make things happen... If you find yourself writing ALWAYS or NEVER in all caps, that's a yellow flag — reframe and explain the reasoning."
Instant diagnostic: Writing ALWAYS / NEVER in all caps? Stop. Step back and ask "what outcome do I want? Why?" Write the why instead of the imperative.
Diagnostic test: If your SKILL.md has 50+ specific rules, they can probably be distilled into a few principles.
Principle 2: Passive Loading > Active Guidance
Writing a long list of "remember to read X" and "remember to read Y" in CLAUDE.md means if you forget to write it, it won't be read — and CLAUDE.md keeps growing. Better approach: turn it into a Skill, let metadata pre-load naturally — AI automatically judges when it's needed and loads on demand.
Application: Any "remember to read X" need should be considered for conversion into Skill references.
Principle 3: Design Method > Copying Templates
- Other people's Skills solve other people's problems — copying them wholesale may not fit you
- Correct approach: start from your own collaboration pain points, use the five-stage design process
Principle 4: Split First, Merge Later
- When uncertain about granularity, create a separate Skill for each scenario first
- As you work, if you discover overlapping core principles → merge into one Skill with situation-based branching
- Don't try to figure out the "optimal granularity" upfront
Principle 5: Protocol and Persona Separation
When building an AI assistant, "how to behave" and "who it is" should be two independent Skills:
CLAUDE.md (lean, only the entry point)
└── "Before responding, invoke /agent-protocols"
.claude/skills/
├── agent-protocols/ ← how to behave (protocols)
│ ├── SKILL.md (description: "Required reading at session start...")
│ └── references/ (startup.md, memory.md, safety.md)
└── persona/ ← who it is (persona) ← can be copied wholesale to other projects
├── SKILL.md
└── references/ (soul.md, identity.md)
Benefits: lean CLAUDE.md, portable persona, protocols and persona can be flexibly combined.
Design Process: Five Stages
| Stage | Core Question | Right Approach | Common Mistake |
|---|---|---|---|
| 1. Empathize | When does AI lose its way? | Start from "collaboration pain points" | Start from "what I want" |
| 2. Define | What problem does this Skill solve? | Clear, measurable design goal | Vague wish |
| 3. Ideate | How to write metadata and principles? | Generate 3+ options, then pick the best | Accept the first idea |
| 4. Prototype | Does the MV Skill work? | Minimum viable version, test first then iterate | Perfectionism |
| 5. Test | Found? Understood? Solved? | Validate all three design assumptions | Only ask "are there bugs?" |