Google Gemini SDK Patterns

Quick Guide: Use the @google/genai package (the unified SDK, NOT the deprecated @google/generative-ai) for all Gemini API interactions. All operations flow through a central GoogleGenAI client with service accessors: ai.models for generation, ai.chats for multi-turn, ai.files for uploads, ai.caches for context caching. Use responseMimeType: "application/json" with responseJsonSchema for structured output. Access response text via response.text (property, not method). Streaming uses generateContentStream returning an async iterable -- iterate with for await.
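
As a rough illustration of the structured-output settings named in the quick guide, the sketch below builds a config fragment with responseMimeType and responseJsonSchema, then parses the JSON string taken from response.text. The Recipe shape, recipeSchema, structuredConfig, and parseRecipe are hypothetical names introduced for this example. The hand-written check stands in for a schema library and is only an assumption about how validation might be done.

```typescript
// Hypothetical shape we want the model to return (not part of the SDK).
interface Recipe {
  name: string;
  minutes: number;
}

// Plain JSON Schema describing Recipe; intended for config.responseJsonSchema.
const recipeSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    minutes: { type: "integer" },
  },
  required: ["name", "minutes"],
};

// Config fragment for ai.models.generateContent({ model, contents, config }).
const structuredConfig = {
  responseMimeType: "application/json",
  responseJsonSchema: recipeSchema,
};

// Parse and minimally validate the JSON text read from response.text.
function parseRecipe(text: string | undefined): Recipe {
  if (!text) {
    throw new Error("Model returned no text to parse");
  }
  const data: unknown = JSON.parse(text);
  if (
    typeof data !== "object" ||
    data === null ||
    typeof (data as Recipe).name !== "string" ||
    typeof (data as Recipe).minutes !== "number"
  ) {
    throw new Error("Response JSON does not match the Recipe shape");
  }
  return data as Recipe;
}
```

With a real client, structuredConfig would be passed as the config of a generateContent call and parseRecipe(response.text) applied to the result.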

<critical_requirements>

CRITICAL: Before Using This Skill

All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, import type, named constants)

(You MUST use @google/genai (the new unified SDK) -- NOT the deprecated @google/generative-ai package)

(You MUST access response text via response.text (a property) -- NOT response.text() (the old SDK used a method call))

(You MUST pass model as a string parameter in every API call -- there is no getGenerativeModel() step)

(You MUST use config for all generation parameters (temperature, safetySettings, tools, systemInstruction) -- NOT top-level properties)

(You MUST never hardcode API keys -- use environment variables via process.env.GEMINI_API_KEY or process.env.GOOGLE_API_KEY)

</critical_requirements>
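
The model-per-call and config requirements above can be illustrated with a small request builder. This is a sketch under assumptions: GenerationOptions, GenerateRequest, DEFAULT_MODEL, and buildRequest are hypothetical names, and the local types only mirror the request shape this guide describes rather than importing the SDK's own types.

```typescript
// Hypothetical local types that mirror the request shape described above.
interface GenerationOptions {
  systemInstruction?: string;
  temperature?: number;
}

interface GenerateRequest {
  model: string;
  contents: string;
  config: GenerationOptions;
}

// Named constant instead of a magic string, per the project conventions.
const DEFAULT_MODEL = "gemini-2.5-flash";

// Build a request with the model passed per call and every generation
// parameter nested under config (never at the top level).
function buildRequest(
  contents: string,
  options: GenerationOptions = {},
  model: string = DEFAULT_MODEL,
): GenerateRequest {
  return { model, contents, config: { ...options } };
}

const request = buildRequest("Explain TypeScript generics briefly.", {
  temperature: 0.3,
});
console.log(JSON.stringify(request));
```

The resulting object is meant to be handed to ai.models.generateContent(request), which keeps temperature and systemInstruction out of the top level by construction.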

Auto-detection: Gemini, gemini, GoogleGenAI, @google/genai, ai.models.generateContent, generateContentStream, ai.chats, ai.files, ai.caches, gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-3-flash, gemini-embedding, GEMINI_API_KEY, GOOGLE_API_KEY, FunctionCallingConfigMode, createUserContent, createPartFromUri, responseMimeType, responseJsonSchema

When to use:

Building applications that call Google Gemini models directly (Gemini 2.x, 2.5, 3.x)
Processing multimodal input: images, video, audio, PDFs
Implementing function calling / tool use with custom functions or built-in tools (Google Search, code execution)
Extracting structured JSON data from LLM responses using response schemas
Streaming text generation for user-facing output
Creating embeddings for RAG pipelines or semantic search (text and multimodal)
Caching large context (documents, code) to reduce cost and latency across multiple requests
Multi-turn chat sessions with automatic history management

Key patterns covered:

Client initialization and environment-based configuration
Text generation with ai.models.generateContent()
Streaming with ai.models.generateContentStream() and for await
Multimodal input (inline base64, file upload, URIs)
Function calling with FunctionDeclaration and manual tool loops
Structured output with responseMimeType + responseJsonSchema + Zod
Chat sessions with ai.chats.create() and sendMessage()
Embeddings with ai.models.embedContent() (text and multimodal)
Context caching with ai.caches.create()
Safety settings per-request via config.safetySettings

When NOT to use:

Multi-provider applications requiring provider switching -- use a unified provider SDK
React-specific chat UI hooks (useChat) -- use a framework-integrated AI SDK
When you need features unique to another provider's API -- use that provider's SDK directly

Examples Index

Core: Setup & Configuration -- Client init, text generation, system instructions, error handling
Multimodal Input -- Inline images, file upload, video, audio, PDF, createPartFromUri
Streaming -- generateContentStream, sendMessageStream, abort patterns
Function Calling / Tools -- FunctionDeclaration, FunctionCallingConfigMode, manual tool loop, built-in tools
Structured Output -- JSON mode, Zod schemas, responseJsonSchema, enum extraction
Chat Sessions -- ai.chats.create(), multi-turn, streaming chat, history
Advanced: Embeddings, Caching & Safety -- Embeddings, context caching, safety settings, token counting
Quick API Reference -- Model IDs, method signatures, config parameters, safety enums

<philosophy>

Philosophy

The @google/genai SDK is Google's unified client for the Gemini API and Vertex AI. It replaces the deprecated @google/generative-ai package with a cleaner, centralized architecture.

Core principles:

Centralized client -- A single GoogleGenAI instance provides all API services via ai.models, ai.chats, ai.files, ai.caches. No scattered manager classes.
Model-per-call -- Pass the model ID string in every API call rather than binding to a model instance. This simplifies multi-model usage.
Config object pattern -- All generation parameters (temperature, systemInstruction, tools, safetySettings) go inside a config object, keeping the top-level call clean.
Native multimodal -- Images, video, audio, and PDFs are first-class inputs via inline data or file upload. Gemini models handle all modalities natively.
Response as property -- Access response.text as a property (not a method). Access response.functionCalls for tool calls.
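
To make the last principle more concrete, here is a minimal sketch of routing the entries of response.functionCalls to local handlers. FunctionCallLike, ToolHandler, dispatchFunctionCalls, and the getWeather tool are hypothetical names for this example. The local type assumes each call exposes an optional name and args, and the sketch stops short of sending results back to the model.

```typescript
// Hypothetical local type mirroring one entry of response.functionCalls.
interface FunctionCallLike {
  name?: string;
  args?: Record<string, unknown>;
}

// Hypothetical handler signature: receives the call arguments, returns a result.
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Run every requested call against the registry and collect the results.
function dispatchFunctionCalls(
  calls: FunctionCallLike[] | undefined,
  handlers: Map<string, ToolHandler>,
): Array<{ name: string; result: unknown }> {
  const results: Array<{ name: string; result: unknown }> = [];
  for (const call of calls ?? []) {
    const name = call.name ?? "";
    const handler = handlers.get(name);
    if (!handler) {
      throw new Error(`No handler registered for function "${name}"`);
    }
    results.push({ name, result: handler(call.args ?? {}) });
  }
  return results;
}

// Example registry with a single made-up tool.
const toolHandlers = new Map<string, ToolHandler>([
  ["getWeather", (args) => `Sunny in ${String(args.city)}`],
]);
```

With a real response, dispatchFunctionCalls(response.functionCalls, toolHandlers) would produce the results that a manual tool loop then returns to the model.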

When to use the Gemini SDK directly:

You primarily use Google Gemini models
You need multimodal input (images, video, audio, PDF) as a core feature
You want built-in tools like Google Search and code execution
You need context caching for large documents
You want the simplest path to Gemini API features

When NOT to use:

You need to switch between multiple providers -- use a unified SDK
You want React-specific chat hooks -- use a framework-integrated AI SDK
You need features unique to another provider's API -- use that provider's SDK directly

</philosophy>

Core Patterns

Pattern 1: Client Setup

Initialize the GoogleGenAI client once and export it. When apiKey is omitted, the SDK can read the key from the environment (GOOGLE_API_KEY; GEMINI_API_KEY is also recognized).

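// lib/gemini.ts
import { GoogleGenAI } from "@google/genai";

// Pass the key explicitly from the environment (never hardcode it)
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

export { ai };

// Alternative: omit apiKey and let the SDK auto-read GOOGLE_API_KEY from the environment
// const ai = new GoogleGenAI({});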

Why good: Minimal setup, env var auto-detected, named export
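
One way to fail fast when no key is configured is a small lookup helper, sketched below. API_KEY_ENV_VARS and resolveApiKey are hypothetical names, and the order in which the two variables are checked is a choice made for this example, not a statement about the SDK's own precedence.

```typescript
// Environment variable names mentioned in this guide.
const API_KEY_ENV_VARS = ["GEMINI_API_KEY", "GOOGLE_API_KEY"] as const;

// Return the first non-empty key found, or throw a descriptive error.
// The lookup order is a choice made for this sketch, not SDK behavior.
function resolveApiKey(env: Record<string, string | undefined>): string {
  for (const name of API_KEY_ENV_VARS) {
    const value = env[name]?.trim();
    if (value) {
      return value;
    }
  }
  throw new Error(`Missing API key: set ${API_KEY_ENV_VARS.join(" or ")}`);
}
```

In lib/gemini.ts this could be used as new GoogleGenAI({ apiKey: resolveApiKey(process.env) }), so a missing key surfaces at startup rather than on the first request.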

// BAD: Using the old deprecated SDK
import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI("hardcoded-key"); // WRONG
const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

Why bad: Old deprecated package, hardcoded API key, model binding step no longer needed

See: examples/core.md for Vertex AI setup, environment variables, error handling

Pattern 2: Text Generation

Pass model and contents directly -- no getGenerativeModel() step.

const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: "Explain TypeScript generics briefly.",
  config: {
    systemInstruction: "You are a concise coding tutor.",
    temperature: 0.3,
  },
});
console.log(response.text);

Why good: Model specified per-call, system instruction in config, response.text as property
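
Because response.text is a property, it can simply be absent when the model returns no text. A minimal guard, assuming the property is typed as an optional string, might look like the sketch below. TextResponseLike and requireText are hypothetical names introduced for this example.

```typescript
// Hypothetical local type: only the field this helper reads.
interface TextResponseLike {
  text?: string;
}

// Return the generated text, or throw when the response carries none.
function requireText(response: TextResponseLike): string {
  const text = response.text; // property access, not response.text()
  if (text === undefined || text.trim() === "") {
    throw new Error("Gemini response contained no text");
  }
  return text;
}
```

Calling requireText(response) right after generateContent gives downstream code a plain string and turns an empty response into an explicit error.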

// BAD: Old SDK patterns that don't work
const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });
const result = await model.generateContent("Hello");
console.log(result.response.text()); // text() was a method in old SDK

Why bad: getGenerativeModel() does not exist in the new SDK, and response.text is a property, not a method

See: examples/core.md for system instructions, temperature, thinking config

Pattern 3: Streaming

Use generateContentStream and iterate with for await.

const response = await ai.models.gener
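
A fuller sketch of the pattern this section describes: the stream is an async iterable, and each chunk is read with for await. TextChunkLike, collectStream, and fakeStream are hypothetical names. The local chunk type assumes each chunk exposes an optional text property, and fakeStream stands in for the SDK stream so the example runs without network access.

```typescript
// Hypothetical local type: each streamed chunk exposes an optional text property.
interface TextChunkLike {
  text?: string;
}

// Consume an async iterable of chunks, forwarding each piece of text to
// onText as it arrives and returning the concatenated result.
async function collectStream(
  stream: AsyncIterable<TextChunkLike>,
  onText: (piece: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const piece = chunk.text ?? "";
    if (piece) {
      onText(piece);
      full += piece;
    }
  }
  return full;
}

// Stand-in for the SDK stream so the sketch runs without network access.
async function* fakeStream(): AsyncGenerator<TextChunkLike> {
  yield { text: "Hello" };
  yield {}; // a chunk with no text is skipped
  yield { text: ", world" };
}
```

With the real client, the stream would come from await ai.models.generateContentStream({ model, contents }) and be passed to collectStream, with onText writing each piece to the user-facing output.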
