Semantic Search Skill

Orchestrator for Semantic Code Intelligence via Agent Delegation

This skill orchestrates two specialized agents for semantic search operations. It provides bash scripts that import Python modules from the claude-context-local library. This is NOT an MCP server: no server process runs, and the scripts only import Python modules via PYTHONPATH. Unlike traditional text-based search (Grep) or pattern matching (Glob), semantic search works on the meaning of content, so it finds functionally similar text even when the wording, variable names, or patterns differ.

The skill uses the library's venv Python interpreter to import merkle, chunking, and embedding modules, enabling semantic search, indexing, and similarity finding across any text content (code, docs, markdown, configs).
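To make the idea of "similar meaning" concrete, here is a minimal sketch of ranking by cosine similarity between embedding vectors. The three-dimensional vectors and the names `TOY_INDEX`, `QUERY_VECTOR`, and `rank` are invented for illustration. This is not the library's API, and real embeddings would come from the embedding module, with far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings" (assumption: real ones come from a model).
TOY_INDEX = {
    "def verify_password(user, pw): ...": [0.9, 0.1, 0.0],
    "def render_chart(data): ...": [0.0, 0.2, 0.9],
}

# Stands in for the embedding of the query "user authentication logic".
QUERY_VECTOR = [0.8, 0.2, 0.1]

def rank(query_vec, index):
    """Return indexed texts ordered from most to least similar to the query."""
    return sorted(
        index,
        key=lambda text: cosine_similarity(query_vec, index[text]),
        reverse=True,
    )
```

Note that the top-ranked chunk shares no keyword with the query; it ranks first only because its vector points in a similar direction.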

🎬 Orchestration Instructions

When this skill is active, you MUST spawn the appropriate agent via Task tool.

This skill uses a 2-agent architecture for token optimization:

semantic-search-reader: Handles READ operations (search, find-similar, list-projects)
semantic-search-indexer: Handles WRITE operations (index, incremental-reindex, status)

Decision Logic: Which Agent to Spawn?

User Request Contains	Operation Type	Agent to Spawn
"find X", "search for Y", "where is Z"	search	semantic-search-reader
"find similar to...", "similar chunks"	find-similar	semantic-search-reader
"what projects", "list indexed", "show projects"	list-projects	semantic-search-reader
"index this", "create index", "full reindex"	index	semantic-search-indexer
"incremental reindex", "auto reindex", "update index"	incremental-reindex	semantic-search-indexer
"check index", "index status", "is it indexed"	status	semantic-search-indexer
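The table above reduces to a lookup from operation type to agent. A minimal sketch, assuming the operation has already been identified from the user's request; `pick_agent` and the two set names are hypothetical helpers and are not part of the skill:

```python
# Operation names and agent names are taken from the table above.
READ_OPERATIONS = {"search", "find-similar", "list-projects"}
WRITE_OPERATIONS = {"index", "incremental-reindex", "status"}

def pick_agent(operation: str) -> str:
    """Return the name of the agent that handles the given operation."""
    if operation in READ_OPERATIONS:
        return "semantic-search-reader"
    if operation in WRITE_OPERATIONS:
        return "semantic-search-indexer"
    raise ValueError(f"Unknown operation: {operation!r}")
```

For example, `pick_agent("find-similar")` returns `"semantic-search-reader"`, and an unrecognized operation raises `ValueError` rather than silently picking an agent.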

Agent Spawn Examples

Example 1: Search Operation (semantic-search-reader)

Task(
    subagent_type="semantic-search-reader",
    description="Search project semantically",
    prompt="""You are the semantic-search-reader agent.

Operation: search
Query: "user authentication logic"
K: 10
Project: /path/to/project

Execute the search operation using scripts/search and return interpreted results with explanations."""
)

Example 2: Index Operation (semantic-search-indexer)

Task(
    subagent_type="semantic-search-indexer",
    description="Index project for semantic search",
    prompt="""You are the semantic-search-indexer agent.

Operation: index
Directory: /path/to/project
Full: true

Execute the indexing operation using scripts/incremental-reindex and return interpreted results with statistics."""
)

Example 3: Incremental Reindex Operation (semantic-search-indexer)

Task(
    subagent_type="semantic-search-indexer",
    description="Incremental reindex with change detection",
    prompt="""You are the semantic-search-indexer agent.

Operation: incremental-reindex
Directory: /path/to/project
Max Age: 360  # minutes (6 hours)

Execute smart auto-reindexing using scripts/incremental-reindex.
This detects changed files using a Merkle tree, then automatically falls back to a full reindex.
Return statistics showing total files indexed and total chunks."""
)
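The change-detection idea behind the incremental reindex can be sketched with content hashes. This is a deliberately flattened, two-level simplification of a Merkle tree (one root hash over per-file leaf hashes), not the library's merkle module. All function names are hypothetical, and files are passed as in-memory bytes to keep the example self-contained.

```python
import hashlib
from typing import Dict, List

def snapshot(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path to the SHA-256 hash of its content (the leaves)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def root_hash(leaves: Dict[str, str]) -> str:
    """Combine paths and leaf hashes, in sorted path order, into one root hash."""
    h = hashlib.sha256()
    for path in sorted(leaves):
        h.update(path.encode("utf-8"))
        h.update(leaves[path].encode("ascii"))
    return h.hexdigest()

def changed_paths(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List paths that were added, removed, or modified between two snapshots."""
    added = new.keys() - old.keys()
    removed = old.keys() - new.keys()
    modified = {p for p in old.keys() & new.keys() if old[p] != new[p]}
    return sorted(added | removed | modified)
```

Comparing root hashes first gives a cheap "did anything change?" check; only when they differ is the per-file comparison needed.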

Example 4: Find Similar (semantic-search-reader)

Task(
    subagent_type="semantic-search-reader",
    description="Find similar content chunks",
    prompt="""You are the semantic-search-reader agent.

Operation: find-similar
Chunk ID: "src/auth.py:45-67:function:authenticate"
K: 5
Project: /path/to/project

Execute the find-similar operation using scripts/find-similar and return interpreted results."""
)
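The chunk ID in Example 4 appears to follow a `path:start-end:kind:name` layout. A minimal parsing sketch under that assumption (the layout is inferred from this single example, and `ChunkRef` and `parse_chunk_id` are hypothetical names):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChunkRef:
    path: str
    start_line: int
    end_line: int
    kind: str
    name: str

def parse_chunk_id(chunk_id: str) -> ChunkRef:
    """Split an ID of the assumed form 'path:start-end:kind:name'."""
    # Split from the right so a path that itself contains ':' stays intact.
    path, line_range, kind, name = chunk_id.rsplit(":", 3)
    start, end = line_range.split("-")
    return ChunkRef(path, int(start), int(end), kind, name)
```

A malformed ID raises `ValueError` from the unpacking or the integer conversion, which is usually preferable to passing a bad ID on to the agent.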

Example 5: Status Check (semantic-search-indexer)

Task(
    subagent_type="semantic-search-indexer",
    description="Check semantic index status",
    prompt="""You are the semantic-search-indexer agent.

Operation: status
Project: /path/to/project

Execute the status operation using scripts/status and return interpreted results with statistics."""
)
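The five examples share one shape: an agent identity line, an `Operation:` line, key-value parameters, and a closing instruction. A sketch of assembling those arguments follows; `build_task_args` is a hypothetical helper, and the returned dictionary only mirrors the argument names used in the examples above rather than calling the Task tool itself.

```python
def build_task_args(agent: str, description: str, operation: str,
                    params: dict, instruction: str) -> dict:
    """Assemble Task arguments in the same layout as the examples above."""
    lines = [f"You are the {agent} agent.", "", f"Operation: {operation}"]
    # One "Key: value" line per parameter, in insertion order.
    lines += [f"{key}: {value}" for key, value in params.items()]
    lines += ["", instruction]
    return {
        "subagent_type": agent,
        "description": description,
        "prompt": "\n".join(lines),
    }
```

For instance, passing `operation="status"` with `params={"Project": "/path/to/project"}` yields a prompt with the same lines as Example 5.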

Important Notes

NEVER run bash scripts directly - always spawn the appropriate agent
Agents handle error interpretation - they convert JSON errors to natural language
Token optimization: Agent execution happens in separate context (saves YOUR tokens)
Wait for agent completion - agents return summarized results, not raw JSON

🎯 When to Use This Skill

✅ Use Semantic Search When:

1. Exploring Unfamiliar Projects

"How does this codebase handle user authentication?"
"Where is database connection pooling implemented?"
"Show me error handling patterns in this project"
"Find documentation about the architecture"

2. Finding Functionality Without Keywords

Looking for implementations but don't know the exact function names
Need to find code that "does X" without knowing how it's named
Searching across multiple languages/frameworks with different conventions

3. Discovering Similar Code

"Find code similar to this payment processing logic"
"Are there other implementations of rate limiting?"
"What other modules use this pattern?"

4. Cross-Reference Discovery

Finding all authentication methods in a polyglot codebase
Locating retry logic across different services
Identifying validation patterns in various modules

5. Searching Documentation & Configuration

"Find documentation explaining the deployment process"
"Locate configuration examples for database connections"
"Search for troubleshooting guides or setup instructions"
"Find ADRs (Architecture Decision Records) about API design"
"Locate markdown files about testing strategies"

6. Cross-Format Content Discovery

"Find all references to environment variables (across code, docs, configs)"
"Search for rate limiting mentions in any format"
"Locate authentication documentation and implementation together"
"Find deployment guides and deployment scripts"

❌ Do NOT Use Semantic Search When:

Use Grep instead for:

Exact string matching: "import React"
Known variable/function names: "getUserById"
Regex patterns: "function.*export"
File content search with known keywords

Use Glob instead for:

Finding files by name pattern: "**/*.test.js"
Locating configuration files: "**/config.yml"
File system navigation: "src/components/**/*.tsx"

Use Read instead for:

Reading specific known files
Examining file contents after Grep/Glob narrowed results
Sequential file analysis

📋 Prerequisites

Required: Python Library Dependency

IMPORTANT: This is NOT an MCP server - it's a Python library dependency. No server process runs. Our scripts import Python modules via PYTHONPATH.

This skill requires the claude-context-local Python library for semantic indexing:

# Clone Python library to standard location (5 minutes)
git clone https://github.com/FarhanAliRaza/claude-context-local.git ~/.local/share/claude-context-local

# Set up Python virtual environment and install dependencies
cd ~/.local/share/claude-context-local
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e .

What this installs:

Merkle tree change detection (80KB)
Multi-language code chunking (192KB) - supports 15+ languages
Embedding generation (76KB) - wraps sentence-transformers
Dependencies: faiss-cpu, sentence-transformers, tree-sitter

Installation location:

macOS/Linux: ~/.local/share/claude-context-local
Windows: %LOCALAPPDATA%\claude-context-local
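A script that needs the library can resolve these locations at runtime. The sketch below assumes the directory layout produced by the install commands above (a `.venv` inside the clone). `library_dir` and `venv_python` are hypothetical helper names, and the Windows interpreter path follows the standard venv layout.

```python
import os
import sys
from pathlib import Path

def library_dir(platform: str = sys.platform, env=os.environ) -> Path:
    """Return the expected install directory for the given platform."""
    if platform.startswith("win"):
        return Path(env["LOCALAPPDATA"]) / "claude-context-local"
    return Path.home() / ".local" / "share" / "claude-context-local"

def venv_python(lib_dir: Path, platform: str = sys.platform) -> Path:
    """Return the interpreter inside the library's virtual environment."""
    if platform.startswith("win"):
        return lib_dir / ".venv" / "Scripts" / "python.exe"
    return lib_dir / ".venv" / "bin" / "python"
```

Checking `venv_python(library_dir()).exists()` before running anything gives an early, readable failure when the prerequisite has not been installed.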

License: claude-context-local is GPL-3.0. We import via PYTHONPATH (dynamic linking), which preserves our Apache 2.0 license. See docs/architecture/MCP-DEPENDENCY-STRATEGY.md for details.

Index Creation

This skill provides an index script that creates and updates the semantic content index. The index is stored in `~/.cla
