Weaviate Query Agent Skill

This skill helps you search and retrieve data from your local Weaviate collections using semantic vector search, keyword search, filters, and RAG capabilities.

Important Note

This skill is designed for LOCAL Weaviate instances only. Ensure you have Weaviate running locally in Docker before using this skill.

Purpose

Query your local Weaviate collections intelligently to find relevant information, perform Q&A, and analyze your data.

When to Use This Skill

User wants to search for information in a collection
User asks questions that need semantic search
User needs to filter results by specific criteria
User wants to use RAG (Retrieval Augmented Generation) for Q&A
User asks about finding similar items
User needs to combine vector search with filters

Prerequisites Check

Claude should verify these prerequisites before proceeding:

✅ weaviate-local-setup completed - Python environment and dependencies installed
✅ weaviate-connection completed - Successfully connected to Weaviate
✅ weaviate-data-ingestion used - Collection has data to query
✅ Docker container running - Weaviate is accessible at localhost:8080

If any prerequisites are missing, Claude should:

Load the required prerequisite skill first
Guide the user through the setup
Then return to this skill
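
The Docker prerequisite above can also be checked without the Python client. This is a minimal sketch, assuming Weaviate's readiness endpoint `/v1/.well-known/ready` on the default port; the function name `weaviate_is_ready` is illustrative and not part of any library.

```python
import http.client
import urllib.request


def weaviate_is_ready(base_url="http://localhost:8080", timeout=2.0):
    """Return True if the readiness endpoint answers with HTTP 200."""
    url = f"{base_url}/v1/.well-known/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or an error status: treat as not ready
        return False
```

Calling `weaviate_is_ready()` before loading any collection gives a quick yes/no answer, which can decide whether to go back to the weaviate-local-setup skill.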

Prerequisites

Local Weaviate running in Docker (see weaviate-local-setup skill)
Active Weaviate connection (use weaviate-connection skill first)
Collection with data (use weaviate-data-ingestion skill to add data)
Python weaviate-client library installed

Query Types

1. Semantic Search (Vector Search)

Find objects semantically similar to your query:

import weaviate
from weaviate.classes.query import MetadataQuery

# Assuming client is already connected
collection = client.collections.get("Articles")

# Search by meaning; metadata is only returned when requested explicitly
response = collection.query.near_text(
    query="artificial intelligence and machine learning",
    limit=5,
    return_metadata=MetadataQuery(distance=True)
)

# Display results (a lower distance means a closer match)
for obj in response.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Content: {obj.properties['content'][:200]}...")
    print(f"Distance: {obj.metadata.distance}\n")

2. Search with Specific Properties

Return only the fields you need:

response = collection.query.near_text(
    query="vector databases",
    limit=5,
    return_properties=["title", "author", "publishDate"]
)

for obj in response.objects:
    print(f"{obj.properties['title']} by {obj.properties['author']}")

3. Keyword Search (BM25)

Traditional keyword-based search:

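from weaviate.classes.query import MetadataQuery

response = collection.query.bm25(
    query="vector search",
    limit=5,
    return_metadata=MetadataQuery(score=True)  # include the BM25 score
)

for obj in response.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Score: {obj.metadata.score}\n")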
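
To give a feel for how keyword ranking behaves, here is a simplified sketch of the BM25 formula in plain Python. It assumes whitespace tokenization and the commonly used parameters k1=1.2 and b=0.75; Weaviate's own tokenization and index internals differ, so treat this as an illustration rather than a reproduction of its scores. The function name `bm25_scores` is hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in documents]
    if not tokenized:
        return []
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1.0
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term
            df = sum(1 for other in tokenized if term in other)
            if df == 0:
                continue
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = counts[term]
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "vector search with weaviate",
    "cooking pasta at home",
    "vector databases store embeddings",
]
print(bm25_scores("vector search", docs))
```

Documents that contain more of the query terms, and rarer ones, score higher; a document with no matching term scores zero.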

4. Hybrid Search (Best of Both Worlds)

Combine semantic and keyword search:

from weaviate.classes.query import MetadataQuery

response = collection.query.hybrid(
    query="machine learning applications",
    limit=5,
    alpha=0.5,  # 0 = pure BM25, 1 = pure vector, 0.5 = balanced
    return_metadata=MetadataQuery(score=True)  # request the fused score
)

for obj in response.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Score: {obj.metadata.score}\n")
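
The effect of `alpha` can be illustrated with a small score-blending sketch. It is loosely modeled on relative score fusion (normalize each result set to the 0..1 range, then take a weighted sum) and is a simplification, not Weaviate's implementation. The names `normalize` and `blend_scores` are hypothetical.

```python
def normalize(scores):
    """Min-max scale a {id: score} mapping into the range 0..1."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def blend_scores(keyword_scores, vector_scores, alpha=0.5):
    """Combine both result sets; alpha is the weight given to the vector side."""
    kw = normalize(keyword_scores)
    vec = normalize(vector_scores)
    combined = {
        key: (1 - alpha) * kw.get(key, 0.0) + alpha * vec.get(key, 0.0)
        for key in set(kw) | set(vec)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


keyword_hits = {"a": 3.0, "b": 1.0}   # made-up BM25 scores
vector_hits = {"b": 0.9, "c": 0.5}    # made-up vector similarities
print(blend_scores(keyword_hits, vector_hits, alpha=0.0)[0])  # keyword side wins
print(blend_scores(keyword_hits, vector_hits, alpha=1.0)[0])  # vector side wins
```

Moving `alpha` toward 0 favors exact term matches, which tends to suit identifiers and product codes; moving it toward 1 favors meaning over wording.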

5. Filter Results

Search with conditions:

from weaviate.classes.query import Filter

# Search with author filter
response = collection.query.near_text(
    query="AI advancements",
    limit=5,
    filters=Filter.by_property("author").equal("Jane Smith")
)

# Multiple filters
response = collection.query.near_text(
    query="technology trends",
    limit=10,
    filters=(
        Filter.by_property("author").equal("Jane Smith") &
        Filter.by_property("publishDate").greater_than("2024-01-01T00:00:00Z")
    )
)

# Filter by array contains
response = collection.query.near_text(
    query="programming",
    filters=Filter.by_property("tags").contains_any(["python", "javascript"])
)
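
The date filter above compares against an RFC 3339 timestamp string. Below is a small sketch for producing that string from a Python `datetime`. The helper name `to_rfc3339` is hypothetical, and treating naive datetimes as UTC is an assumption you may want to change.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(dt):
    """Format a datetime as an RFC 3339 UTC timestamp, e.g. 2024-01-01T00:00:00Z."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already in UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


cutoff = to_rfc3339(datetime(2024, 1, 1))
print(cutoff)

# A timestamp with an offset is converted to UTC first
local = datetime(2024, 1, 1, 2, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_rfc3339(local))
```

The resulting string can be passed as the comparison value in a filter such as `Filter.by_property("publishDate").greater_than(cutoff)`.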

6. Filter Operators

from weaviate.classes.query import Filter

# Equality
Filter.by_property("status").equal("published")

# Comparison
Filter.by_property("price").greater_than(100)
Filter.by_property("price").less_than(500)
Filter.by_property("price").greater_or_equal(100)
Filter.by_property("price").less_or_equal(500)

# String matching
Filter.by_property("title").like("*vector*")  # Contains "vector"

# Array operations
Filter.by_property("tags").contains_any(["ai", "ml"])
Filter.by_property("tags").contains_all(["python", "tutorial"])

# Combine filters
(Filter.by_property("price").greater_than(100) &
 Filter.by_property("category").equal("Electronics"))

# OR conditions
(Filter.by_property("author").equal("John") |
 Filter.by_property("author").equal("Jane"))
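
The wildcard syntax used by `like` can be tried out locally. This sketch assumes `*` matches zero or more characters and `?` matches exactly one, and it matches case-insensitively against the whole string. How Weaviate applies the pattern also depends on the property's tokenization, so this only illustrates the pattern syntax. The name `like_matches` is hypothetical.

```python
import re


def like_matches(pattern, text):
    """Return True if text matches a wildcard pattern using * and ?."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")       # zero or more characters
        elif ch == "?":
            parts.append(".")        # exactly one character
        else:
            parts.append(re.escape(ch))
    regex = re.compile("".join(parts), re.IGNORECASE | re.DOTALL)
    return regex.fullmatch(text) is not None


print(like_matches("*vector*", "A vector database"))  # pattern may match anywhere
print(like_matches("vector*", "A vector database"))   # must start with "vector"
print(like_matches("v?ctor", "vector"))               # ? stands for one character
```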

7. Search by Image (Multi-modal)

For collections that use a multi-modal vectorizer such as multi2vec-clip:

import base64

# Encode query image
with open("query_image.jpg", "rb") as f:
    query_image = base64.b64encode(f.read()).decode("utf-8")

collection = client.collections.get("ProductCatalog")

# Find similar images
response = collection.query.near_image(
    near_image=query_image,
    limit=5,
    return_properties=["name", "description", "price"]
)

for obj in response.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")

8. Search by Vector

If you have a pre-computed embedding:

# Your custom embedding; its length must match the vectors stored in the collection
query_vector = [0.1, 0.2, 0.3, ...]  # e.g. 1536 dimensions for OpenAI's text-embedding-3-small

response = collection.query.near_vector(
    near_vector=query_vector,
    limit=5
)
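
To see what a vector search ranks by, here is a plain-Python sketch of cosine distance and a brute-force nearest-neighbor lookup. It assumes the collection uses the cosine metric; a real index uses approximate search rather than a full scan. The names `cosine_distance` and `nearest` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, vectors, limit=5):
    """Rank a {name: vector} mapping by distance to the query, closest first."""
    scored = [(name, cosine_distance(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:limit]


toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
for name, distance in nearest([1.0, 0.0], toy_vectors, limit=2):
    print(name, round(distance, 3))
```

The length check mirrors a common failure with `near_vector`: a query embedding produced by a different model than the stored vectors has the wrong number of dimensions.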

9. Get Object by ID

Retrieve a specific object:

# Get by UUID (returns None when no object has this ID)
obj = collection.query.fetch_object_by_id("uuid-here")

if obj is not None:
    print(f"Title: {obj.properties['title']}")
    print(f"Content: {obj.properties['content']}")

10. Fetch Multiple Objects

Get all objects or filter by property:

from weaviate.classes.query import Filter

# Get all objects (paginated)
response = collection.query.fetch_objects(limit=100)

# Get objects matching filter
response = collection.query.fetch_objects(
    filters=Filter.by_property("section").equal("Introduction"),
    limit=50
)

for obj in response.objects:
    print(obj.properties['title'])
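
A single `fetch_objects` call returns at most `limit` objects, so reading a whole collection means requesting page after page. Below is a sketch of an offset-based loop. The helper `iter_all` and the in-memory stand-in data are hypothetical, and the commented lambda shows one way it might be wired to a collection.

```python
def iter_all(fetch_page, page_size=100):
    """Yield every item by requesting consecutive pages until one comes back short."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left
        offset += page_size


# With a live collection, fetch_page could wrap the call shown above:
#   fetch_page = lambda limit, offset: collection.query.fetch_objects(
#       limit=limit, offset=offset
#   ).objects

# Stand-in data source so the sketch runs without Weaviate
fake_rows = [f"object-{i}" for i in range(250)]
fake_fetch = lambda limit, offset: fake_rows[offset:offset + limit]

all_rows = list(iter_all(fake_fetch, page_size=100))
print(len(all_rows))
```

Offset paging is bounded by the server's maximum-results setting, so for very large collections the client's `collection.iterator()` is likely the better fit.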

RAG (Retrieval Augmented Generation)

Use Weaviate's generative module for Q&A:

Single Prompt RAG

# The collection must have a generative module configured
collection = client.collections.get("TechnicalDocuments")

question = "How do I configure HVAC systems?"

# Placeholders in single_prompt refer to object properties; {content} is filled in per object
response = collection.generate.near_text(
    query=question,
    single_prompt=f"Answer this question based on the context: {question} Context: {{content}}",
    limit=3
)

# single_prompt produces one generated answer per retrieved object
for obj in response.objects:
    print(f"\nSource: {obj.properties['title']}")
    print(f"Answer: {obj.generated}")
    print(f"Content: {obj.properties['content'][:200]}...")

Grouped Task RAG

Generate one response using all results:

response = collection.generate.near_text(
    query="What are the best practices for fan selection?",
    grouped_task="Summarize the key recommendations from these documents about fan selection",
    limit=5
)

print(response.generated)
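
The difference between the two RAG modes can be pictured with a conceptual sketch. A single prompt is a template filled in once per retrieved object, using that object's properties. A grouped task is sent once together with all retrieved objects. The functions below are hypothetical and only model the idea; how Weaviate assembles the actual prompt is internal to its generative module.

```python
import re


def fill_prompt(template, properties):
    """Replace each {name} placeholder with the matching property value."""
    def replace(match):
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"no property named {name!r}")
        return str(properties[name])
    return re.sub(r"\{(\w+)\}", replace, template)


def single_prompts(template, objects):
    """One prompt per object, so one generated answer per object."""
    return [fill_prompt(template, props) for props in objects]


def grouped_prompt(task, objects):
    """One prompt for the whole result set, so one generated answer in total."""
    context = "\n\n".join(str(props) for props in objects)
    return f"{task}\n\n{context}"


docs = [{"content": "Fans are sized by airflow."}, {"content": "Check static pressure."}]
print(single_prompts("Summarize: {content}", docs))
print(grouped_prompt("Summarize these documents", docs))
```

In this sketch a placeholder that names no property raises `KeyError`, which is a reminder to use only property names that exist on the collection.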

Custom RAG Implementation

If the collection doesn't have a generative module configured:

from openai import OpenAI

# Search Weaviate
weaviate_collection = weaviate_client.collections.get("TechnicalDocuments")
search_results = weaviate_collection.query.near_text(
    query="What is the friction loss for round elbows?",
    limit=5,
    return_properties=["content", "section", "page"]
)

# Build context
context = "\n\n".join([
    f"[{obj.properties['section']} - Page {obj.properties['page']}]\n{obj.properties['content']}"
    for obj in search_results.objects
])

# Call LLM
openai_client = OpenAI()
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are a technical assistant. Answer questions based on the provided context."
