Mistral SDK Patterns
Quick Guide: Use @mistralai/mistralai (ESM-only) to interact with Mistral's API. Use client.chat.complete() for chat, client.chat.stream() for streaming (async iterable via for await), client.chat.parse() with a Zod schema for structured outputs, and client.fim.complete() for Codestral fill-in-middle code completion. The SDK uses responseFormat (camelCase), not response_format. Streaming events expose content via event.data.choices[0]?.delta?.content. Retries default to strategy: "none" -- you must configure them explicitly for production.
<critical_requirements>
CRITICAL: Before Using This Skill
All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, import type, named constants)
(You MUST use responseFormat (camelCase) in SDK calls -- NOT response_format (snake_case). The SDK uses camelCase property names throughout.)
(You MUST configure retries explicitly -- the SDK defaults to strategy: "none" (no retries), unlike OpenAI's SDK which retries automatically)
(You MUST consume streaming results with for await (const event of result) and access content via event.data.choices[0]?.delta?.content -- the event shape differs from OpenAI)
(You MUST never hardcode API keys -- use process.env["MISTRAL_API_KEY"] with the bracket notation the SDK documents)
(You MUST use client.chat.parse() with a Zod schema for structured outputs -- NOT manual JSON.parse() on completion content)
</critical_requirements>
Auto-detection: Mistral, mistral, @mistralai/mistralai, client.chat.complete, client.chat.stream, client.chat.parse, client.fim.complete, client.embeddings.create, mistral-large, mistral-small, codestral, pixtral, ministral, magistral, devstral, MISTRAL_API_KEY, responseFormat, mistral-embed
When to use:
- Building applications that call Mistral models directly (Mistral Large, Small, Codestral, etc.)
- Implementing chat completions with SSE streaming
- Using Codestral for code generation and fill-in-middle (FIM) completion
- Extracting structured data with client.chat.parse() and Zod schemas
- Implementing function calling / tool use
- Creating embeddings for RAG pipelines or semantic search
- Processing images with vision-capable models (Mistral Small, Medium, Large, Ministral)
- Using Mistral Agents API for pre-configured agent completions
Key patterns covered:
- Client initialization and configuration (retries, timeouts, custom HTTP client)
- Chat completions (chat.complete) and streaming (chat.stream)
- Structured outputs with chat.parse() and Zod schemas
- Function calling / tool use with tool call loop
- Embeddings (embeddings.create) with mistral-embed
- Vision (image URL / base64 with vision-capable models)
- Codestral FIM (fim.complete) for code completion
- Error handling, retry configuration, and production patterns
When NOT to use:
- Multi-provider applications where you need to switch between Mistral, OpenAI, Anthropic, etc. -- use a unified provider SDK
- React-specific chat UI hooks (useChat) -- use a framework-integrated AI SDK
- When you need OpenAI-compatible endpoints -- use OpenAI SDK with Mistral's compatible endpoint instead
Examples Index
- Core: Setup & Configuration -- Client init, production config, error handling, retries, custom HTTP client
- Chat & Streaming -- Chat completions, streaming with async iteration, multi-turn
- Structured Output -- chat.parse() with Zod, JSON mode, typed responses
- Function Calling -- Tool definitions, tool call loop, streaming tools
- Embeddings & Vision -- Semantic search, image analysis with vision-capable models
- Codestral FIM -- Fill-in-middle code completion, code generation
- Quick API Reference -- Model IDs, method signatures, error types, configuration options
<philosophy>
Philosophy
The @mistralai/mistralai SDK is auto-generated from Mistral's OpenAPI spec using Speakeasy, giving you a thin, type-safe wrapper over the REST API. It is ESM-only and uses camelCase property names (not snake_case like the REST API).
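Because the SDK's request objects use camelCase while the REST API uses snake_case, a small guard can catch REST-style keys before they reach an SDK call. The sketch below is illustrative only: findSnakeCaseKeys and toCamelCase are hypothetical helpers, not part of @mistralai/mistralai, and the check inspects top-level keys only.

```typescript
// Hypothetical helpers (not part of @mistralai/mistralai): flag REST-style
// snake_case keys in a request object and suggest the camelCase equivalent.

const SNAKE_CASE_PATTERN = /^[a-z0-9]+(?:_[a-z0-9]+)+$/;

// Converts e.g. "response_format" to "responseFormat".
function toCamelCase(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match, char: string) => char.toUpperCase());
}

// Returns every top-level key that looks like snake_case, paired with its
// camelCase suggestion. Nested objects are not inspected.
function findSnakeCaseKeys(
  request: Record<string, unknown>,
): Array<{ key: string; suggestion: string }> {
  return Object.keys(request)
    .filter((key) => SNAKE_CASE_PATTERN.test(key))
    .map((key) => ({ key, suggestion: toCamelCase(key) }));
}

const suspicious = findSnakeCaseKeys({
  model: "mistral-large-latest",
  response_format: { type: "json_object" },
  max_tokens: 100,
});

for (const { key, suggestion } of suspicious) {
  console.log(`"${key}" looks like a REST API name; the SDK expects "${suggestion}"`);
}
```

A check like this is most useful in tests or at a migration boundary where request objects arrive untyped; in typed call sites the TypeScript compiler is the better first line of defence.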
Core principles:
- ESM-only -- The package is published as ESM only. CommonJS projects must use await import(). This is a hard constraint, not optional.
- camelCase API surface -- SDK properties use camelCase (responseFormat, maxTokens, toolChoice) even though the REST API uses snake_case. This catches OpenAI SDK migrants who write response_format.
- No automatic retries -- Unlike OpenAI's SDK (2 retries by default), Mistral defaults to strategy: "none". You must configure retries explicitly for production.
- Streaming via async iterables -- chat.stream() returns an EventStream consumed with for await...of. Events have a data wrapper: event.data.choices[0]?.delta?.content.
- Structured outputs via chat.parse() -- Pass a Zod schema directly to responseFormat and access message.parsed for typed results. No manual JSON schema construction needed.
- Codestral FIM -- Dedicated fim.complete() endpoint for fill-in-middle code completion, separate from chat.
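The streaming principle above can be sketched without a network call by modelling the event shape structurally. StreamEventLike is a simplified assumption derived from the access path event.data.choices[0]?.delta?.content, not the SDK's exported type, and fakeStream is a hypothetical stand-in for the result of client.chat.stream().

```typescript
// Simplified, assumed event shape mirroring event.data.choices[0]?.delta?.content.
// This is NOT the SDK's exported type; content is typed as unknown because
// real deltas may carry values other than plain strings.
interface StreamEventLike {
  data: {
    choices: Array<{ delta?: { content?: unknown } }>;
  };
}

// Consumes any async iterable of events and concatenates the string deltas.
async function collectStreamText(
  stream: AsyncIterable<StreamEventLike>,
): Promise<string> {
  let text = "";
  for await (const event of stream) {
    const delta = event.data.choices[0]?.delta?.content;
    if (typeof delta === "string") {
      text += delta;
    }
  }
  return text;
}

// Hypothetical stand-in for the result of client.chat.stream().
async function* fakeStream(): AsyncGenerator<StreamEventLike> {
  yield { data: { choices: [{ delta: { content: "Hello" } }] } };
  yield { data: { choices: [{ delta: { content: ", world" } }] } };
  yield { data: { choices: [{ delta: {} }] } }; // event without content
  yield { data: { choices: [] } }; // event with no choices at all
}

collectStreamText(fakeStream()).then((text) => {
  console.log(text); // prints "Hello, world"
});
```

The optional chaining matters here: events without choices or without delta content are skipped instead of throwing.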
When to use the Mistral SDK directly:
- You only use Mistral models and want the simplest, most direct integration
- You need Mistral-specific features (Codestral FIM, Mistral Agents, Voxtral audio)
- You want minimal dependencies and zero abstraction overhead
- You need the latest Mistral API features on day one
When NOT to use:
- You need to switch between providers (OpenAI, Anthropic, Mistral) -- use a unified provider SDK
- You want React-specific chat UI hooks -- use a framework-integrated AI SDK
- You want an OpenAI-compatible wrapper -- Mistral exposes an OpenAI-compatible endpoint, use the OpenAI SDK for that
<patterns>
Core Patterns
Pattern 1: Client Setup
Initialize the Mistral client with the API key read from the MISTRAL_API_KEY environment variable.
// lib/mistral.ts -- basic setup
import { Mistral } from "@mistralai/mistralai";

const client = new Mistral({
  apiKey: process.env["MISTRAL_API_KEY"] ?? "",
});

export { client };
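// lib/mistral.ts -- production configuration
import { Mistral } from "@mistralai/mistralai";

// Named constants instead of magic numbers (all durations in milliseconds)
const TIMEOUT_MS = 30_000;
const RETRY_INITIAL_INTERVAL_MS = 1_000;
const RETRY_MAX_INTERVAL_MS = 30_000;
const RETRY_EXPONENT = 1.5;
const RETRY_MAX_ELAPSED_MS = 120_000;

const client = new Mistral({
  apiKey: process.env["MISTRAL_API_KEY"] ?? "",
  timeoutMs: TIMEOUT_MS,
  retryConfig: {
    strategy: "backoff",
    backoff: {
      initialInterval: RETRY_INITIAL_INTERVAL_MS,
      maxInterval: RETRY_MAX_INTERVAL_MS,
      exponent: RETRY_EXPONENT,
      maxElapsedTime: RETRY_MAX_ELAPSED_MS,
    },
    retryConnectionErrors: true,
  },
});

export { client };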
Why good: Explicit retry config (SDK defaults to no retries), named constants, env var with bracket notation
See: examples/core.md for custom HTTP client, async API key provider, error handling
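To get a feel for what the backoff values in the production configuration mean, the following sketch lists the waits a plain exponential policy would produce. It assumes delay = min(initialInterval * exponent^attempt, maxInterval) with no jitter and no Retry-After handling, and it ignores the time the requests themselves take; the SDK's actual formula may differ, so treat the numbers as an approximation. BackoffSettings and previewBackoffDelays are hypothetical names, not SDK exports.

```typescript
// Hypothetical preview of an exponential backoff schedule (not SDK code).
interface BackoffSettings {
  initialInterval: number; // ms
  maxInterval: number; // ms
  exponent: number;
  maxElapsedTime: number; // ms
}

// Safety cap so a zero or tiny interval cannot loop forever.
const MAX_PREVIEW_ATTEMPTS = 100;

// Lists the waits between attempts until the cumulative wait would exceed
// maxElapsedTime. Assumes no jitter; request duration is ignored.
function previewBackoffDelays(settings: BackoffSettings): number[] {
  const delays: number[] = [];
  let elapsed = 0;
  for (let attempt = 0; attempt < MAX_PREVIEW_ATTEMPTS; attempt += 1) {
    const delay = Math.min(
      settings.initialInterval * Math.pow(settings.exponent, attempt),
      settings.maxInterval,
    );
    if (elapsed + delay > settings.maxElapsedTime) {
      break;
    }
    delays.push(delay);
    elapsed += delay;
  }
  return delays;
}

const previewDelays = previewBackoffDelays({
  initialInterval: 1_000,
  maxInterval: 30_000,
  exponent: 1.5,
  maxElapsedTime: 120_000,
});

// First four waits under these assumptions: 1000, 1500, 2250, 3375 (ms)
console.log(previewDelays.slice(0, 4));
```

Under these assumptions the waits grow by 50% per attempt until they hit the 30-second cap, and the schedule stops once the total wait would pass two minutes.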
Pattern 2: Chat Completions
Basic chat using chat.complete().
const result = await client.chat.complete({
  model: "mistral-large-latest",
  messages: [
    { role: "system", content: "You are a helpful coding assistant." },
    { role: "user", content: "Explain TypeScript generics." },
  ],
});

const content = result?.choices?.[0]?.message?.content;
console.log(content);
Why good: Uses system role for instructions, safe optional chaining on nullable response
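Because the response fields are optional, it can help to centralise the optional chaining in one place. The sketch below is a hypothetical helper, firstMessageText, written against a simplified structural type rather than the SDK's exported response type. It assumes message content is either a string or a list of chunks in which text chunks look like { type: "text", text: string }; verify that shape against the SDK version in use.

```typescript
// Simplified, assumed response shape; not the SDK's exported type.
interface ChatResultLike {
  choices?: Array<{
    message?: { content?: unknown };
  }>;
}

// Narrowing helper for the assumed text-chunk shape.
function isTextChunk(value: unknown): value is { type: "text"; text: string } {
  if (typeof value !== "object" || value === null) return false;
  const chunk = value as { type?: unknown; text?: unknown };
  return chunk.type === "text" && typeof chunk.text === "string";
}

// Returns the first choice's text, or null when nothing usable is present.
function firstMessageText(result: ChatResultLike | undefined): string | null {
  const content = result?.choices?.[0]?.message?.content;
  if (typeof content === "string") return content;
  if (Array.isArray(content)) {
    const text = content
      .filter(isTextChunk)
      .map((chunk) => chunk.text)
      .join("");
    return text.length > 0 ? text : null;
  }
  return null;
}

// Usage with hand-written objects standing in for a chat.complete() result.
console.log(firstMessageText({ choices: [{ message: { content: "Generics are..." } }] }));
console.log(firstMessageText({ choices: [] })); // prints null
```

Returning null rather than an empty string lets callers distinguish "no content" from "the model answered with nothing".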
// BAD: Using snake_case properties (REST API style, not SDK style)
const result = await client.chat.complete({
  model: "mistral-large-latest",
  messages: [{ role: "user", content: "hello" }],
  response_format: { type: "json_object" }, // WRONG: use responseFormat
  max_tokens: 100, // WRONG: use maxTokens
});
Why bad: SDK uses camelCase properties -- response_format and max_tokens