Weaviate Patterns
Quick Guide: Use Weaviate for semantic search and RAG applications. Use weaviate-client (v3.x) as the TypeScript client -- it uses gRPC for performance and provides full type safety with generics. Connect via connectToWeaviateCloud() for managed instances or connectToLocal() for Docker. Collections are the central abstraction -- configure vectorizers at collection level, not per-query. Use collection.query.* for search, collection.generate.* for RAG, and collection.data.* for CRUD. Always call client.close() when done. Increase query timeout to 60s+ when using generative search. The v3 client does NOT support browsers or Embedded Weaviate.
<critical_requirements>
CRITICAL: Before Using This Skill
All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, import type, named constants)
(You MUST call client.close() when done with the Weaviate client -- it maintains persistent gRPC connections that will leak if not closed)
(You MUST configure vectorizers at the COLLECTION level during client.collections.create() -- you cannot add a vectorizer after creation, only add new named vectors)
(You MUST use a SEPARATE client.collections.use() call with .withTenant() for multi-tenant queries -- queries without tenant context on multi-tenant collections will fail)
(You MUST increase query timeout to 60+ seconds when using generate.* (RAG) submodule -- generative model calls are slow and the default timeout causes failures)
</critical_requirements>
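The tenant requirement above can be enforced in one place. The following is a minimal sketch, not part of the weaviate-client API: tenantCollection is a hypothetical helper, and the small interfaces are assumed stand-ins for the client.collections.use() and .withTenant() calls named above, so the snippet compiles without the library installed.

```typescript
// Minimal structural types standing in for the parts of the client this
// sketch touches. They are assumptions for illustration only.
interface TenantScopable<C> {
  withTenant(tenant: string): C;
}

interface ClientLike<C extends TenantScopable<C>> {
  collections: { use(name: string): C };
}

// Hypothetical helper: always returns a tenant-scoped collection handle and
// rejects blank tenant names early instead of letting the query fail later.
function tenantCollection<C extends TenantScopable<C>>(
  client: ClientLike<C>,
  collectionName: string,
  tenant: string,
): C {
  if (tenant.trim() === "") {
    throw new Error(
      `Tenant name is required for collection "${collectionName}"`,
    );
  }
  return client.collections.use(collectionName).withTenant(tenant);
}
```

Routing every multi-tenant query through a helper like this keeps the tenant context explicit at each call site, which is the point of the rule.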
Examples
- Core Patterns -- Connection, collection setup, object CRUD, basic search
- Search & Filtering -- nearText, nearVector, hybrid, bm25, filters, generative search (RAG)
- Multi-Tenancy & Batch -- Tenant management, batch imports, cross-references
Additional resources:
- reference.md -- API cheat sheet, vectorizer comparison, data types, decision frameworks
Auto-detection: Weaviate, weaviate-client, connectToWeaviateCloud, connectToLocal, nearText, nearVector, hybrid search, bm25, vector database, semantic search, RAG, generative search, generate.nearText, insertMany, vectorizer, text2vec, multi-tenancy, withTenant, collection.query, collection.generate, collection.data
When to use:
- Semantic search over text, images, or multimodal data
- Retrieval Augmented Generation (RAG) with built-in generative search
- Hybrid search combining vector similarity and keyword (BM25) ranking
- Multi-tenant applications needing isolated vector stores per customer
- Applications requiring built-in vectorization (no external embedding pipeline)
- Real-time similarity search with filtering on structured properties
Key patterns covered:
- weaviate-client v3 connection setup and configuration
- Collection management with vectorizer modules (text2vec-openai, text2vec-cohere, etc.)
- Object CRUD (insert, insertMany, update, replace, deleteById, deleteMany)
- Search types (nearText, nearVector, hybrid, bm25, fetchObjects)
- Filtering with operators (equal, greaterThan, like, containsAny, and/or/not)
- Generative search (RAG) with singlePrompt and groupedTask
- Multi-tenancy with tenant lifecycle management
- Batch imports with insertMany and error handling
- Cross-references between collections
- Named vectors for multi-vector collections
When NOT to use:
- Relational data with complex joins (use a relational database)
- Full-text search without vector component (use a dedicated search engine)
- Key-value caching (use a key-value store)
- Time-series data (use a time-series database)
- Graph traversal queries (use a graph database)
- Browser-side applications (v3 client is Node.js only)
<philosophy>
Philosophy
Weaviate is a vector database that stores data objects alongside their vector embeddings. The core principle: configure once at the collection level, then query with simple method calls.
Core principles:
- Collection-centric design -- All configuration (vectorizer, generative model, reranker, properties) is set at collection creation. Queries operate on collection objects obtained via client.collections.use().
- Built-in vectorization -- Weaviate can vectorize data automatically using configured modules (text2vec-openai, text2vec-cohere, etc.). You don't need an external embedding pipeline unless you want one.
- Search is a spectrum -- Use nearText for semantic similarity, bm25 for keyword matching, hybrid for a weighted combination. The alpha parameter controls the vector-vs-keyword balance in hybrid search.
- RAG is a search mode, not a separate system -- Switch from collection.query.nearText() to collection.generate.nearText() to add LLM generation on top of search results.
- Filters are additive -- Filters narrow the set of objects a vector or keyword search can return. Combine with Filters.and() and Filters.or() for complex conditions.
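To build intuition for the alpha parameter mentioned above, here is a conceptual sketch of a weighted blend. It is an illustration only: blendScores and the constants are hypothetical names, and Weaviate's actual fusion algorithms normalize scores or ranks before combining them, so real hybrid scores will not match this arithmetic.

```typescript
// Conceptual illustration only, not Weaviate's fusion implementation.
// alpha = 1 weights the vector score fully; alpha = 0 weights the keyword
// score fully; values in between mix the two.
function blendScores(
  vectorScore: number,
  keywordScore: number,
  alpha: number,
): number {
  if (alpha < 0 || alpha > 1) {
    throw new RangeError("alpha must be between 0 and 1");
  }
  return alpha * vectorScore + (1 - alpha) * keywordScore;
}

const PURE_KEYWORD_ALPHA = 0; // hybrid behaves like a keyword (BM25) search
const PURE_VECTOR_ALPHA = 1; // hybrid behaves like a vector search

const vectorOnly = blendScores(0.9, 0.2, PURE_VECTOR_ALPHA); // 0.9
const keywordOnly = blendScores(0.9, 0.2, PURE_KEYWORD_ALPHA); // 0.2
```

In practice, a higher alpha suits queries phrased in natural language, and a lower alpha suits queries that hinge on exact terms such as product codes.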
<patterns>
Core Patterns
Pattern 1: Connection Setup
Connect to Weaviate Cloud or a local Docker instance. Always close the client when done. See examples/core.md for full examples.
// Good Example -- Cloud connection with API key headers
import weaviate from "weaviate-client";
// Raise the query timeout to 60+ seconds if this client will run generate.* (RAG) queries
const QUERY_TIMEOUT_SECONDS = 30;
const INSERT_TIMEOUT_SECONDS = 120;
async function createWeaviateClient() {
  const client = await weaviate.connectToWeaviateCloud(
    process.env.WEAVIATE_URL!,
    {
      authCredentials: new weaviate.ApiKey(process.env.WEAVIATE_API_KEY!),
      headers: {
        "X-OpenAI-Api-Key": process.env.OPENAI_API_KEY!,
      },
      timeout: {
        query: QUERY_TIMEOUT_SECONDS,
        insert: INSERT_TIMEOUT_SECONDS,
      },
    },
  );
  return client;
}
export { createWeaviateClient };
Why good: Environment variables for credentials, explicit timeouts, API key headers for vectorizer modules
// Bad Example -- Missing cleanup, no timeout config
import weaviate from "weaviate-client";
const client = await weaviate.connectToLocal();
// No client.close() -- gRPC connections leak
// No timeout config -- generative queries will time out
Why bad: Missing client.close() leaks gRPC connections, default timeout too short for RAG queries
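One way to make the cleanup rule hard to forget is a try/finally wrapper. The sketch below is an assumption-labeled illustration: withClient and the Closable interface are hypothetical names, not part of weaviate-client, and the only thing assumed about the client is that it exposes the close() method described above.

```typescript
// Anything with a close() method; the real client is assumed to fit this shape.
interface Closable {
  close(): Promise<void> | void;
}

// Hypothetical helper: connects, runs the work, and always closes the client,
// whether the work resolves or throws.
async function withClient<C extends Closable, T>(
  connect: () => Promise<C>,
  work: (client: C) => Promise<T>,
): Promise<T> {
  const client = await connect();
  try {
    return await work(client);
  } finally {
    await client.close();
  }
}
```

With the factory from the good example, a call would look like withClient(createWeaviateClient, async (client) => { ... }). Long-running servers that share one client for the process lifetime would instead call close() in their shutdown hook.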
Pattern 2: Collection with Vectorizer
Configure vectorizer and properties at creation time. See examples/core.md for named vectors and advanced configuration.
// Good Example -- Collection with vectorizer and generative model
import { vectors, dataType, generative } from "weaviate-client";
await client.collections.create({
  name: "Article",
  vectorizers: vectors.text2VecOpenAI({
    model: "text-embedding-3-small",
  }),
  generative: generative.openAI({
    model: "gpt-4o",
  }),
  properties: [
    { name: "title", dataType: dataType.TEXT },
    { name: "body", dataType: dataType.TEXT },
    { name: "category", dataType: dataType.TEXT },
    { name: "publishedAt", dataType: dataType.DATE },
  ],
});
Why good: Vectorizer and generative model configured at collection level, typed properties with explicit data types
// Bad Example -- Trying to add vectorizer after creation
await client.collections.create({ name: "Article" });
// No way to add a vectorizer to an existing collection without named vectors
// Must delete and recreate, or use addVector() for named vectors only
Why bad: Vectorizer must be set at creation time; cannot be added to an existing default vector after the fact
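Because the vectorizer is fixed at creation time, startup code often guards creation so that a collection is created once with its full configuration. The following is a minimal sketch under stated assumptions: ensureCollection is a hypothetical helper, and CollectionsLike is an assumed stand-in for the exists() and create() methods on client.collections, so the snippet runs without the library.

```typescript
// Assumed shape of the collections API used here (illustration only).
interface CollectionsLike<Config extends { name: string }> {
  exists(name: string): Promise<boolean>;
  create(config: Config): Promise<unknown>;
}

// Hypothetical helper: creates the collection only when it is missing.
// Returns true if it created the collection, false if it already existed.
// It does NOT check that an existing collection matches the given config.
async function ensureCollection<Config extends { name: string }>(
  collections: CollectionsLike<Config>,
  config: Config,
): Promise<boolean> {
  if (await collections.exists(config.name)) {
    return false;
  }
  await collections.create(config);
  return true;
}
```

If an existing collection was created without the intended vectorizer, this guard will not repair it. As noted above, the remedy is to delete and recreate the collection or to add a named vector.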
Pattern 3: Hybrid Search with Filters
Combine vector and keyword search with property filters. See examples/search.md for all search types.
// Good Example -- Hybrid search with filter
import { Filters } from "weaviate-client";
const articles = clie