Claude Code Mastery

Build better agents, skills, and workflows with Claude Code. Battle-tested patterns from building Claude Code itself, used daily at Anthropic with hundreds of skills in production.

The Three Laws of Agent Design

Before anything else, internalize these:

The filesystem is how agents think. Write to disk, grep, process. Don't stuff 100 items into context.
Prompt caching is architecture, not optimization. Design your entire system around prefix stability.
Give Claude a way to verify its work. This alone 2-3x the quality of output.

When Building an Agent

Action Space Design

Consult tool-design reference for the complete tool design framework.

Most agent failures are tool design problems, not model problems. Design tools by imagining yourself solving the problem:

Paper = minimal (just text output). Limited but safe.
Calculator = specific tools (custom tool_use). More capable but rigid.
Computer = bash + filesystem. Most powerful, most flexible.

Start with bash. Claude Code started with 4 tools: read, write, edit, bash. That covers 80% of tasks. Add custom tools only when bash genuinely can't do the job.

Anti-pattern: 50 custom tools, one for each operation. This creates a "needle in haystack" problem — the model's reasoning degrades as tool count increases.

The right question: "What permissions and environment should I provide?" — not "What should I ask?"

The Agent Loop

Every agent follows three phases:

Gather Context — Use agentic search (grep, find, bash) before semantic search. Let Claude build its own context through progressive disclosure.
Take Action — Bash for flexible operations, custom tools for frequently-used primaries, MCP for external services.
Verify Work — Explicit rules with feedback (linting), visual feedback (screenshots), LLM-as-judge for fuzzy evaluation.

Consult agent-loop reference for detailed patterns per phase.

Progressive Disclosure — Don't Load Everything Upfront

Agents get dumber when you give them too much information upfront.

At startup, load only skill names and descriptions. Let Claude discover details when needed. This applies to:

Skills (name + trigger only, full content on demand)
MCP tools (stubs with defer_loading: true, full schema via ToolSearch)
Reference files (agent reads them when it decides to)
Data (write to files, grep when needed)

When Writing Skills

Skill Categories

Consult skill-categories reference for all 9 categories with examples.

The key categories:

Library & API Reference — How to use a library/CLI. Focus on gotchas, not obvious docs.
Product Verification — How to test and verify output. Worth a week of engineering.
Data Fetching & Analysis — Connect to data stacks with credentials and common queries.
Business Process Automation — Repetitive workflows as one command.
Code Scaffolding — Generate framework boilerplate with natural language requirements.
Code Quality & Review — Enforce org standards. Run as hooks or in CI.
CI/CD & Deployment — Fetch, push, deploy with testing and rollback.
Runbooks — Symptom -> investigation -> structured report.
Infrastructure Operations — Routine maintenance with safety guardrails.

Skill Structure

A skill is a folder, not a file:

skills/skill-name/
  SKILL.md              # Instructions + trigger conditions
  references/           # Detailed docs Claude reads on demand
    api.md              # Function signatures, usage examples
    gotchas.md          # Known failure points (THE highest-signal content)
    examples.md         # Good and bad output samples
  scripts/              # Helper bash/Node scripts
    verify.sh           # Verify output has required sections
    fetch.sh            # Pre-built data fetching
  assets/               # Templates, configs
    template.md         # Output template to copy
  config.json           # Skill metadata, setup data

The Description Field Is a Trigger, Not a Summary

Bad: "Generates meeting summaries" Good: "Use when a new meeting is detected or user asks to summarize, recap, or analyze a specific meeting. Relevant when user says 'what happened in', 'summarize the', 'meeting notes for', 'action items from'."

The description is what Claude scans to decide if this skill applies. Write it as triggering conditions.

Build a Gotchas Section

The highest-signal content in any skill is the Gotchas section.

Start small. Every time Claude fails at a task, add the failure mode:

## Gotchas
- API returns empty results for queries under 3 characters
- Date field is UTC, not local timezone
- Rate limit is 100/min, batch requests in groups of 50
- Transcript can be null even when notes exist

This is how skills get better over time. The gotchas section is a living document.

Don't Over-Specify Steps

Anti-pattern:

Step 1: Call API with query
Step 2: Take first 3 results
Step 3: Format as bullets
Step 4: Post to Slack

Correct pattern:

Search for relevant data, get details for the most relevant items,
and synthesize a clear answer. Reference sources by name and date.

Gotchas:
- Results capped at 50, paginate for exhaustive search
- Null fields are common, check before including

Tell Claude WHAT, not HOW. It's smart enough to figure out the steps.

Include Helper Scripts

Don't make Claude reconstruct boilerplate each time. Provide scripts it can compose:

# scripts/fetch-data.sh — Claude calls this via bash
#!/bin/bash
curl -s "$API_URL/search?q=$1" | jq '.results[:10]'

Claude spends tokens on composition and reasoning, not on rebuilding the fetch logic.

Store Execution History

data/standups.log        # Append-only log of every standup generated
data/last-digest.json    # State from last run

Claude reads its own history and can tell what's changed since the last run. Use ${CLAUDE_PLUGIN_DATA} for durable storage across skill upgrades.

When Structuring CLAUDE.md

A team should share a single CLAUDE.md, checked into git, updated multiple times a week. Anytime Claude does something incorrectly, add it.

Consult claude-md reference for the complete structure guide.

The Rules

< 200 lines. One engineer had 847 lines and got worse results than a 100-line version. Claude ignores bloated context.
Update on every mistake. End your prompt: "Update CLAUDE.md so this doesn't happen again."
Treat it like code. Review changes, test behavior, track in git.
Focus on what challenges Claude's defaults. Don't document what Claude already knows.

Recommended Sections

# Project Name

## What This Is
One paragraph. What, why, for whom.

## Tech Stack
Languages, frameworks, key dependencies.

## Build & Test
Commands to build, test, lint, deploy.

## Architecture
Key directories, entry points, data flow.

## Conventions
Naming, patterns, anti-patterns specific to this project.

## Gotchas
Things Claude gets wrong. Updated continuously.

When Optimizing Prompt Caching

At Anthropic, prompt caching is treated as critical infrastructure — alerts on cache hit rate, SEVs when they drop too low.

Consult prompt-caching reference for the complete technical guide.

The Ordering Rule

Static content first, dynamic content last:

1. System prompt + tool definitions    (globally cached)
2. CLAUDE.md / project context         (cached per project)
3. Session context / MEMORY.md         (cached per session)
4. Conversation messages               (new each turn)

The Five Commandments

Never change tools mid-session. Adding/removing a tool invalidates cache for the ENTIRE conversation.
**Ne

claude-mastery-expert

Cómo agregar

Pega en el README de tu repo

Skills relacionadas

webapp-testing

brand-guidelines

frontend-design

web-artifacts-builder

Recibe nuevas skills de Design e Frontend todos los lunes