Weaviate Query Agent Skill
This skill helps you search and retrieve data from your local Weaviate collections using semantic vector search, keyword search, filters, and RAG capabilities.
Important Note
This skill is designed for LOCAL Weaviate instances only. Ensure you have Weaviate running locally in Docker before using this skill.
Purpose
Query your local Weaviate collections intelligently to find relevant information, perform Q&A, and analyze your data.
When to Use This Skill
- User wants to search for information in a collection
- User asks questions that need semantic search
- User needs to filter results by specific criteria
- User wants to use RAG (Retrieval Augmented Generation) for Q&A
- User asks about finding similar items
- User needs to combine vector search with filters
Prerequisites Check
Claude should verify these prerequisites before proceeding:
- ✅ weaviate-local-setup completed - Python environment and dependencies installed
- ✅ weaviate-connection completed - Successfully connected to Weaviate
- ✅ weaviate-data-ingestion used - Collection has data to query
- ✅ Docker container running - Weaviate is accessible at localhost:8080
If any prerequisites are missing, Claude should:
- Load the required prerequisite skill first
- Guide the user through the setup
- Then return to this skill
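The "Docker container running" check can be automated with a small readiness probe. The sketch below assumes Weaviate's REST readiness endpoint (`/v1/.well-known/ready`) on the default local port; the helper name `weaviate_is_ready` is illustrative and not part of the Weaviate client.

```python
import urllib.request


def weaviate_is_ready(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> bool:
    """Return True if the readiness endpoint answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/v1/.well-known/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL
        return False


if __name__ == "__main__":
    if weaviate_is_ready():
        print("Weaviate is reachable at localhost:8080")
    else:
        print("Weaviate is not reachable; check that the Docker container is running")
```

If the probe fails, the weaviate-local-setup skill is the place to go back to before attempting any query.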
Prerequisites
- Local Weaviate running in Docker (see weaviate-local-setup skill)
- Active Weaviate connection (use weaviate-connection skill first)
- Collection with data (use weaviate-data-ingestion skill to add data)
- Python weaviate-client library installed
Query Types
1. Semantic Search (Vector Search)
Find objects semantically similar to your query:
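import weaviate
from weaviate.classes.query import MetadataQuery
# Assuming client is already connected
collection = client.collections.get("Articles")
# Search by meaning; metadata is only returned when explicitly requested
response = collection.query.near_text(
    query="artificial intelligence and machine learning",
    limit=5,
    return_metadata=MetadataQuery(distance=True)
)
# Display results (vector search reports a distance: lower means more similar)
for obj in response.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Content: {obj.properties['content'][:200]}...")
    print(f"Distance: {obj.metadata.distance}\n")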
2. Search with Specific Properties
Return only the fields you need:
response = collection.query.near_text(
    query="vector databases",
    limit=5,
    return_properties=["title", "author", "publishDate"]
)
for obj in response.objects:
    print(f"{obj.properties['title']} by {obj.properties['author']}")
3. Keyword Search (BM25)
Traditional keyword-based search:
# No extra imports are needed for a basic BM25 query
response = collection.query.bm25(
    query="vector search",
    limit=5
)
for obj in response.objects:
    print(f"Title: {obj.properties['title']}")
4. Hybrid Search (Best of Both Worlds)
Combine semantic and keyword search:
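from weaviate.classes.query import MetadataQuery
response = collection.query.hybrid(
    query="machine learning applications",
    limit=5,
    alpha=0.5,  # 0 = pure BM25, 1 = pure vector, 0.5 = balanced
    return_metadata=MetadataQuery(score=True)  # the score is only returned when requested
)
for obj in response.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Score: {obj.metadata.score}\n")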
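To build intuition for what alpha does, the sketch below blends a vector score and a keyword score with a simple weighted sum. This is a simplified illustration only: Weaviate's actual hybrid fusion works on ranked or normalized result lists, and the function name `blend_scores` and the example scores are made up for the demonstration.

```python
def blend_scores(vector_score: float, keyword_score: float, alpha: float) -> float:
    """Weighted combination of two normalized scores (illustrative only)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score


if __name__ == "__main__":
    # A document that matches strongly on meaning but weakly on exact keywords
    for alpha in (0.0, 0.5, 1.0):
        combined = blend_scores(vector_score=0.9, keyword_score=0.2, alpha=alpha)
        print(f"alpha={alpha}: combined score {combined:.2f}")
```

Raising alpha favors documents that are close in meaning; lowering it favors documents that contain the exact query terms.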
5. Filter Results
Search with conditions:
from datetime import datetime, timezone
from weaviate.classes.query import Filter
# Search with author filter
response = collection.query.near_text(
    query="AI advancements",
    limit=5,
    filters=Filter.by_property("author").equal("Jane Smith")
)
# Multiple filters (date properties are compared against timezone-aware datetime objects)
response = collection.query.near_text(
    query="technology trends",
    limit=10,
    filters=(
        Filter.by_property("author").equal("Jane Smith") &
        Filter.by_property("publishDate").greater_than(datetime(2024, 1, 1, tzinfo=timezone.utc))
    )
)
# Filter by array contains
response = collection.query.near_text(
    query="programming",
    filters=Filter.by_property("tags").contains_any(["python", "javascript"])
)
6. Filter Operators
from weaviate.classes.query import Filter
# Equality
Filter.by_property("status").equal("published")
# Comparison
Filter.by_property("price").greater_than(100)
Filter.by_property("price").less_than(500)
Filter.by_property("price").greater_or_equal(100)
Filter.by_property("price").less_or_equal(500)
# String matching
Filter.by_property("title").like("*vector*") # Contains "vector"
# Array operations
Filter.by_property("tags").contains_any(["ai", "ml"])
Filter.by_property("tags").contains_all(["python", "tutorial"])
# Combine filters (AND)
(Filter.by_property("price").greater_than(100) &
 Filter.by_property("category").equal("Electronics"))
# OR conditions
(Filter.by_property("author").equal("John") |
 Filter.by_property("author").equal("Jane"))
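The wildcard syntax used by `like` can be illustrated in plain Python: `*` stands for zero or more characters and `?` for exactly one. The sketch below only demonstrates those wildcard semantics against a whole string; Weaviate evaluates `like` against its own index, so tokenization and case handling may differ. The helper names `like_to_regex` and `like_matches` are illustrative.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a like-style pattern: * = zero or more chars, ? = exactly one."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Escape everything else so it is matched literally
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def like_matches(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the wildcard pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


if __name__ == "__main__":
    print(like_matches("*vector*", "a vector database"))  # substring match
    print(like_matches("vector*", "a vector database"))   # prefix match only
    print(like_matches("car?", "cart"))                   # one extra character
```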
7. Search by Image (Multi-modal)
For collections with CLIP or multi2vec:
import base64
# Encode query image as a base64 string
with open("query_image.jpg", "rb") as f:
    query_image = base64.b64encode(f.read()).decode("utf-8")
collection = client.collections.get("ProductCatalog")
# Find similar images
response = collection.query.near_image(
    near_image=query_image,
    limit=5,
    return_properties=["name", "description", "price"]
)
for obj in response.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")
8. Search by Vector
If you have a pre-computed embedding:
# Your custom embedding
query_vector = [0.1, 0.2, 0.3, ...] # 1536 dimensions for OpenAI
response = collection.query.near_vector(
    near_vector=query_vector,
    limit=5
)
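Vector search ranks objects by the collection's configured distance metric, which is cosine distance unless the collection was set up otherwise, and the query vector must have the same number of dimensions as the stored vectors. The sketch below computes cosine distance in plain Python to show what the reported distance means; `cosine_distance` is an illustrative helper, not a client function.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity: 0 = same direction, 1 = orthogonal, 2 = opposite."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # identical direction
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # orthogonal
    print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

A dimension mismatch like the one this helper rejects is a common cause of failed `near_vector` queries, so checking `len(query_vector)` against the embedding model's output size is a cheap sanity check.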
9. Get Object by ID
Retrieve specific object:
# Get by UUID
obj = collection.query.fetch_object_by_id("uuid-here")
print(f"Title: {obj.properties['title']}")
print(f"Content: {obj.properties['content']}")
10. Fetch Multiple Objects
Get all objects or filter by property:
from weaviate.classes.query import Filter
# Get the first 100 objects (a single page, not the whole collection)
response = collection.query.fetch_objects(limit=100)
# Get objects matching filter
response = collection.query.fetch_objects(
    filters=Filter.by_property("section").equal("Introduction"),
    limit=50
)
for obj in response.objects:
    print(obj.properties['title'])
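A single `fetch_objects` call returns at most `limit` objects, so reading a whole collection means requesting page after page. The sketch below does this with the `limit` and `offset` parameters of `fetch_objects`; the helper name `iter_objects` is illustrative, and it assumes `collection` behaves like the collection objects used above.

```python
from typing import Any, Iterator


def iter_objects(collection: Any, page_size: int = 100) -> Iterator[Any]:
    """Yield every object by requesting successive pages with limit/offset."""
    offset = 0
    while True:
        response = collection.query.fetch_objects(limit=page_size, offset=offset)
        if not response.objects:
            break  # no more results
        yield from response.objects
        if len(response.objects) < page_size:
            break  # a short page means this was the last one
        offset += page_size
```

Usage would look like `for obj in iter_objects(collection, page_size=100): ...`. Offset-based paging is capped by the server's maximum result window, so for large collections the client's cursor-based `collection.iterator()` is likely the better fit.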
RAG (Retrieval Augmented Generation)
Use Weaviate's generative module for Q&A:
Single Prompt RAG
# Collection must have generative module configured
collection = client.collections.get("TechnicalDocuments")
question = "How do I configure HVAC systems?"
response = collection.generate.near_text(
    query=question,
    # {content} is filled from each object's "content" property; the question is inserted by the f-string
    single_prompt=f"Answer this question based on the context: {question}. Context: {{content}}",
    limit=3
)
# single_prompt produces one generated answer per retrieved object
for obj in response.objects:
    print(f"\nSource: {obj.properties['title']}")
    print(f"Content: {obj.properties['content'][:200]}...")
    print(f"Answer: {obj.generated}")
Grouped Task RAG
Generate one response using all results:
response = collection.generate.near_text(
    query="What are the best practices for fan selection?",
    grouped_task="Summarize the key recommendations from these documents about fan selection",
    limit=5
)
# grouped_task produces a single answer for the whole result set
print(response.generated)
Custom RAG Implementation
If collection doesn't have generative module:
from openai import OpenAI
# Search Weaviate
weaviate_collection = weaviate_client.collections.get("TechnicalDocuments")
search_results = weaviate_collection.query.near_text(
    query="What is the friction loss for round elbows?",
    limit=5,
    return_properties=["content", "section", "page"]
)
# Build context
context = "\n\n".join([
    f"[{obj.properties['section']} - Page {obj.properties['page']}]\n{obj.properties['content']}"
    for obj in search_results.objects
])
# Call LLM
openai_client = OpenAI()
response = openai_client.chat.completions.create(
model="gpt-4o",
messages=[
{
"role": "system",
"content": "You are a technical assistant. Answer questions based on the provided context."