Claude API (Anthropic Messages API)
Status: Production Ready | SDK: @anthropic-ai/sdk@0.70.1
Quick Start (5 Minutes)
Node.js
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude!' },
  ],
});
console.log(message.content[0].text);
Cloudflare Workers
const response = await fetch('https://api.anthropic.com/v1/messages', {
  method: 'POST',
  headers: {
    'x-api-key': env.ANTHROPIC_API_KEY,
    'anthropic-version': '2023-06-01',
    'content-type': 'application/json',
  },
  body: JSON.stringify({
    model: 'claude-sonnet-4-5-20250929',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
const data = await response.json();
console.log(data.content[0].text);
Load references/setup-guide.md for complete setup with streaming, caching, and tools.
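The fetch call above reads data.content[0].text without checking the HTTP status. A minimal sketch of a status check follows, assuming the API returns its usual error envelope ({ type: 'error', error: { type, message } }) on failures. parseAnthropicResponse is a hypothetical helper name, not part of any SDK.

```javascript
// Hypothetical helper: parse a fetch Response from /v1/messages.
// Returns the parsed message on success; throws an Error carrying the
// HTTP status and the API's error type on failure.
async function parseAnthropicResponse(response) {
  // A gateway error may return a non-JSON body, so tolerate parse failures.
  const data = await response.json().catch(() => null);
  if (!response.ok || data === null) {
    const err = new Error(data?.error?.message ?? `HTTP ${response.status}`);
    err.status = response.status;
    err.type = data?.error?.type; // for example 'rate_limit_error'
    throw err;
  }
  return data;
}
```

In a Worker this would wrap the existing call: const data = await parseAnthropicResponse(response).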
Critical Rules
Always Do ✅
- Use environment variables for API keys (NEVER hardcode)
- Set max_tokens explicitly (required parameter)
- Pin model version (claude-sonnet-4-5-20250929, not claude-3-5-sonnet-latest)
- Enable prompt caching for repeated content (90% cost savings)
- Stream long responses (stream: true) for better UX
- Handle errors - Implement retry logic for 429, 529 errors
- Validate inputs - Sanitize user messages before sending
- Monitor costs - Track token usage
- Set timeouts - Prevent hanging requests
- Use tool use properly - Return tool_result in follow-up message
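As one way to act on "Monitor costs - Track token usage", the sketch below accumulates the usage object that each response carries (input_tokens and output_tokens). createUsageTracker is a hypothetical name, and no prices are included because they vary by model.

```javascript
// Hypothetical accumulator for the usage object returned with each message.
function createUsageTracker() {
  const totals = { requests: 0, input_tokens: 0, output_tokens: 0 };
  return {
    // Call with each response; returns the message unchanged so it can be chained.
    record(message) {
      totals.requests += 1;
      totals.input_tokens += message.usage?.input_tokens ?? 0;
      totals.output_tokens += message.usage?.output_tokens ?? 0;
      return message;
    },
    // Returns a copy so callers cannot mutate the running totals.
    totals: () => ({ ...totals }),
  };
}
```

Usage would look like tracker.record(await client.messages.create(...)), with tracker.totals() logged periodically.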
Never Do ❌
- Never expose API key in client-side code
- Never skip max_tokens - API will error without it
- Never ignore stop_reason - Check for tool_use, end_turn, max_tokens
- Never assume single content block - content is an array
- Never use outdated models - Pin to specific version
- Never skip error handling - API calls can fail
- Never mix message roles - Alternate user/assistant correctly
- Never ignore rate limits - Implement exponential backoff
- Never store API keys in logs or databases
- Never skip input validation - Prevent injection attacks
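Two of the rules above (check stop_reason, treat content as an array) can be sketched in one small reader. readMessage is a hypothetical helper; it assumes the response shape used elsewhere in this document.

```javascript
// Hypothetical helper: collect text from every block and surface stop_reason.
function readMessage(message) {
  const text = message.content
    .filter(block => block.type === 'text')
    .map(block => block.text)
    .join('');
  const toolUses = message.content.filter(block => block.type === 'tool_use');
  return {
    text,
    toolUses,
    truncated: message.stop_reason === 'max_tokens',    // output hit the max_tokens cap
    needsToolResult: message.stop_reason === 'tool_use', // caller must send tool_result next
  };
}
```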
Top 3 Errors (Prevent 80% of Issues)
Error #1: Rate Limit 429
Symptom: 429 Too Many Requests: Number of request tokens has exceeded your per-minute rate limit
Solution: Implement exponential backoff with retry-after header
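// Retries requestFn on HTTP 429, honoring retry-after when the server sends it.
async function handleRateLimit(requestFn, maxRetries = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await requestFn();
    } catch (error) {
      if (error.status !== 429) throw error;
      lastError = error;
      // SDK errors carry headers on error.headers; depending on SDK version this
      // is a Headers object or a plain object, so handle both.
      const headers = error.headers ?? error.response?.headers;
      const retryAfter = typeof headers?.get === 'function'
        ? headers.get('retry-after')
        : headers?.['retry-after'];
      const seconds = Number.parseFloat(retryAfter);
      const delay = Number.isFinite(seconds) ? seconds * 1000 : 1000 * 2 ** attempt;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  // Retries exhausted: surface the last 429 instead of silently returning undefined.
  throw lastError;
}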
Prevention: Monitor rate limit headers, upgrade tier, implement backoff
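To illustrate "Monitor rate limit headers", here is a minimal sketch that reads the remaining-quota headers from a response. The header names are the ones I understand the API to send (anthropic-ratelimit-*-remaining and retry-after); treat them as an assumption and confirm against the current rate-limit documentation. readRateLimit is a hypothetical helper.

```javascript
// Hypothetical helper: pull rate-limit information out of a Headers object.
// Missing headers come back as null rather than NaN or 0.
function readRateLimit(headers) {
  const num = name => {
    const value = headers.get(name);
    return value === null ? null : Number(value);
  };
  return {
    requestsRemaining: num('anthropic-ratelimit-requests-remaining'),
    tokensRemaining: num('anthropic-ratelimit-tokens-remaining'),
    retryAfterSeconds: num('retry-after'),
  };
}
```

With raw fetch, pass response.headers. With the SDK, the raw response should be reachable through the promise's withResponse() method, though that is worth verifying for the pinned SDK version.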
Error #2: Prompt Caching Not Activating
Symptom: High costs despite cache_control blocks, cache_read_input_tokens: 0
Solution: Place cache_control on the last block of the content you want cached, and make sure that cached prefix is at least 1024 tokens
// ❌ Wrong - cache_control is not on the last block, so 'Additional text' is left out of the cached prefix
{
  type: 'text',
  text: DOCUMENT,
  cache_control: { type: 'ephemeral' }, // Wrong position
},
{
  type: 'text',
  text: 'Additional text',
}
// ✅ Correct - cache_control on the last block
{
  type: 'text',
  text: DOCUMENT + '\n\nAdditional text',
  cache_control: { type: 'ephemeral' }, // Correct position
}
Prevention: Ensure content >= 1024 tokens, keep cached content identical, monitor usage
Load references/prompt-caching-guide.md for complete caching strategy.
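The symptom above (cache_read_input_tokens: 0) can be checked programmatically. The sketch below classifies a response from its usage object, using the cache_creation_input_tokens and cache_read_input_tokens fields. cacheStatus is a hypothetical helper name.

```javascript
// Hypothetical helper: classify cache behavior for one response.
//   'hit'   - part of the prompt was read from cache
//   'write' - a cache entry was created on this request
//   'none'  - nothing was cached (prefix too short, or no cache_control applied)
function cacheStatus(usage) {
  const read = usage.cache_read_input_tokens ?? 0;
  const written = usage.cache_creation_input_tokens ?? 0;
  if (read > 0) return 'hit';
  if (written > 0) return 'write';
  return 'none';
}
```

A first request with a valid cache_control block would normally report 'write' and identical follow-ups 'hit'; repeated 'none' suggests the cached content is below the minimum size or changes between requests.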
Error #3: Tool Use Response Format Errors
Symptom: invalid_request_error: tools[0].input_schema is invalid
Solution: Valid tool schema with proper JSON Schema
// ✅ Valid tool schema
{
  name: 'get_weather',
  description: 'Get current weather',
  input_schema: {
    type: 'object', // Must be 'object'
    properties: {
      location: {
        type: 'string', // Valid JSON Schema types
        description: 'City' // Optional but recommended
      }
    },
    required: ['location'] // List required fields
  }
}
// ✅ Valid tool result
{
  type: 'tool_result',
  tool_use_id: block.id, // Must match tool_use id
  content: JSON.stringify(result) // Convert to string
}
Prevention: Validate schemas, match tool_use_id exactly, stringify results
Load references/tool-use-patterns.md + references/top-errors.md for all 12 errors.
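"Validate schemas" can be approximated with a small pre-flight lint before the request is sent. This sketch only checks the points named above (top-level type 'object', required fields that actually appear in properties); it is not a full JSON Schema validator, and the required-field check is a lint for likely mistakes rather than a rule the API is known to enforce. checkToolDefinition is a hypothetical helper.

```javascript
// Hypothetical pre-flight lint for a tool definition; returns a list of problems.
function checkToolDefinition(tool) {
  const problems = [];
  if (typeof tool.name !== 'string' || tool.name.length === 0) {
    problems.push('name must be a non-empty string');
  }
  const schema = tool.input_schema;
  if (!schema || schema.type !== 'object') {
    problems.push("input_schema.type must be 'object'");
    return problems; // nothing else is worth checking without an object schema
  }
  const properties = schema.properties ?? {};
  for (const field of schema.required ?? []) {
    if (!(field in properties)) {
      problems.push(`required field '${field}' is missing from properties`);
    }
  }
  return problems;
}
```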
Common Use Cases (Quick Patterns)
Streaming Responses
const stream = await client.messages.stream({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story.' }],
});
for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
Load: templates/streaming-chat.ts
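When the full text is needed after streaming (for logging or storage), the event loop above can be wrapped in a helper that both forwards and accumulates deltas. This is a sketch that accepts any async iterable of events, so it can be exercised without a network call. collectText and onText are hypothetical names.

```javascript
// Hypothetical helper: consume streaming events, forward each text delta to
// onText, and return the concatenated text once the stream ends.
async function collectText(events, onText = () => {}) {
  let text = '';
  for await (const event of events) {
    if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
      text += event.delta.text;
      onText(event.delta.text);
    }
  }
  return text;
}
```

With the stream from the example above, this would be called as collectText(stream, chunk => process.stdout.write(chunk)).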
Prompt Caching (90% Cost Savings)
const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: 'Long system prompt...',
      cache_control: { type: 'ephemeral' },
    },
  ],
  messages: [{ role: 'user', content: 'Question?' }],
});
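Cache entries expire after 5 minutes by default, and the timer resets each time the cache is read. Cache reads are billed at roughly 10% of the base input price, which is where the 90% savings on cached tokens comes from.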
Load: references/prompt-caching-guide.md + templates/prompt-caching.ts
Tool Use (Function Calling)
const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  tools: [{
    name: 'get_weather',
    description: 'Get weather for a location',
    input_schema: {
      type: 'object',
      properties: { location: { type: 'string' } },
      required: ['location'],
    },
  }],
  messages: [{ role: 'user', content: 'Weather in SF?' }],
});
if (message.stop_reason === 'tool_use') {
  const toolUse = message.content.find(b => b.type === 'tool_use');
  // Execute tool and send result back...
}
Load: references/tool-use-patterns.md + templates/tool-use-basic.ts
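The "send result back" step might look like the sketch below: it runs a local handler for each tool_use block and builds the messages array for the follow-up request, with one tool_result per tool_use_id. buildToolFollowUp and the handlers map are hypothetical; the is_error flag is used to report handler failures back to the model.

```javascript
// Hypothetical helper: execute handlers for every tool_use block and return
// the messages array to send in the next client.messages.create() call.
async function buildToolFollowUp(messages, assistantMessage, handlers) {
  const results = [];
  for (const block of assistantMessage.content) {
    if (block.type !== 'tool_use') continue;
    try {
      const handler = handlers[block.name];
      if (!handler) throw new Error(`No handler for tool '${block.name}'`);
      const output = await handler(block.input);
      results.push({
        type: 'tool_result',
        tool_use_id: block.id, // must match the tool_use block's id
        content: JSON.stringify(output),
      });
    } catch (error) {
      // Report the failure to the model instead of dropping the tool call.
      results.push({
        type: 'tool_result',
        tool_use_id: block.id,
        content: String(error.message),
        is_error: true,
      });
    }
  }
  return [
    ...messages,
    { role: 'assistant', content: assistantMessage.content },
    { role: 'user', content: results },
  ];
}
```

The returned array keeps roles alternating (user, assistant, user), and the same tools list should be passed again on the follow-up request.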
Vision (Image Understanding)
const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/jpeg',
          data: base64Image,
        },
      },
      { type: 'text', text: 'What is in this image?' },
    ],
  }],
});
Supports: JPEG, PNG, WebP, GIF (max 5MB)
Load: references/vision-capabilities.md + templates/vision-image.ts
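The format and size limits above can be enforced before the request is built. The sketch below produces the image block from raw bytes; it assumes a Node.js runtime (for Buffer) and that the 5MB limit applies to the raw bytes rather than the base64 string. toImageBlock is a hypothetical helper.

```javascript
const SUPPORTED_MEDIA_TYPES = ['image/jpeg', 'image/png', 'image/webp', 'image/gif'];
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // 5MB limit stated above

// Hypothetical helper: validate an image and wrap it as a base64 content block.
function toImageBlock(bytes, mediaType) {
  if (!SUPPORTED_MEDIA_TYPES.includes(mediaType)) {
    throw new Error(`Unsupported media type: ${mediaType}`);
  }
  if (bytes.length > MAX_IMAGE_BYTES) {
    throw new Error(`Image is ${bytes.length} bytes; limit is ${MAX_IMAGE_BYTES}`);
  }
  return {
    type: 'image',
    source: {
      type: 'base64',
      media_type: mediaType,
      data: Buffer.from(bytes).toString('base64'),
    },
  };
}
```

For a file on disk, the bytes would typically come from fs.readFileSync(path).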
Extended Thinking Mode
const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 4096,
  thinking: {
    type: 'enabled',
    budget_tokens: 2000,
  },
  messages: [{ role: 'user', content: 'Solve complex problem...' }],
});
const thinking = message.content.find(b => b.type === 'thinking')?.thinking;
const answer = message.content.find(b => b.type === 'text')?.text;
Load: templates/extended-thinking.ts
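In the example above, budget_tokens (2000) sits below max_tokens (4096). As far as I know the API expects the thinking budget to be at least 1024 tokens and smaller than max_tokens; the sketch below encodes both as assumptions so a bad combination fails locally. thinkingParams is a hypothetical helper.

```javascript
const MIN_THINKING_BUDGET = 1024; // assumption: minimum accepted budget_tokens

// Hypothetical helper: build the max_tokens / thinking pair, rejecting
// combinations that leave no room for the visible answer.
function thinkingParams(maxTokens, budgetTokens) {
  if (budgetTokens < MIN_THINKING_BUDGET) {
    throw new RangeError(`budget_tokens must be at least ${MIN_THINKING_BUDGET}`);
  }
  if (budgetTokens >= maxTokens) {
    throw new RangeError('budget_tokens must be smaller than max_tokens');
  }
  return {
    max_tokens: maxTokens,
    thinking: { type: 'enabled', budget_tokens: budgetTokens },
  };
}
```

The result can be spread into the request: client.messages.create({ model, messages, ...thinkingParams(4096, 2000) }).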
Model Versions (Current)
Latest models:
- claude-sonnet-4-5-20250929 - Recommended (best performance)
- claude-sonnet-4-20250514 - Stable version
- claude-3-7-sonnet-20250219 - Previous generation
- claude-3-5-sonnet-20241022 - Legacy
Always pin to specific version (not -latest suffix)
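A cheap guard for the pinning rule is to check that a configured model ID ends in a date stamp, as all four IDs listed here do. This is a sketch built on that naming pattern only; isPinnedModel is a hypothetical helper and does not confirm that the model exists.

```javascript
// Hypothetical check: a pinned model ID ends with an 8-digit date (YYYYMMDD).
function isPinnedModel(modelId) {
  return /-\d{8}$/.test(modelId);
}
```

For example, this could run at startup against the model name read from configuration, failing fast if someone sets a -latest alias.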
When to Load References
Load references/setup-guide.md when:
- First-time Claude API user