Semantic Search Skill
Orchestrator for Semantic Code Intelligence via Agent Delegation
This skill orchestrates two specialized agents for semantic search operations. It provides bash scripts that import Python modules from the claude-context-local library (NOT an MCP server: no server process runs, only Python imports via PYTHONPATH). Unlike traditional text-based search (Grep) or pattern matching (Glob), semantic search works on the meaning of content, so it finds functionally similar text even when the wording, variable names, or patterns differ.
The skill uses the library's venv Python interpreter to import merkle, chunking, and embedding modules, enabling semantic search, indexing, and similarity finding across any text content (code, docs, markdown, configs).
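To make the difference from text matching concrete, here is a minimal sketch of how embedding-based ranking works in principle. The three-dimensional vectors and chunk names are invented for illustration; a real embedding model (the library wraps sentence-transformers) produces vectors with hundreds of dimensions, and the library's own ranking code may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k=2):
    """Return the k chunk names whose vectors are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in chunks.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Toy vectors (hypothetical): the first axis loosely stands for "authentication".
chunks = {
    "auth.py:login": [0.9, 0.1, 0.0],
    "db.py:connect": [0.0, 0.2, 0.9],
    "session.py:verify_token": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedding of "user authentication logic"

print(top_k(query, chunks, k=2))
```

Neither of the two top chunks needs to contain the word "authentication"; they rank highly because their vectors point in a similar direction to the query's.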
🎬 Orchestration Instructions
When this skill is active, you MUST spawn the appropriate agent via the Task tool.
This skill uses a 2-agent architecture for token optimization:
- semantic-search-reader: Handles READ operations (search, find-similar, list-projects)
- semantic-search-indexer: Handles WRITE operations (index, incremental-reindex, status)
Decision Logic: Which Agent to Spawn?
| User Request Contains | Operation Type | Agent to Spawn |
|---|---|---|
| "find X", "search for Y", "where is Z" | search | semantic-search-reader |
| "find similar to...", "similar chunks" | find-similar | semantic-search-reader |
| "what projects", "list indexed", "show projects" | list-projects | semantic-search-reader |
| "index this", "create index", "full reindex" | index | semantic-search-indexer |
| "incremental reindex", "auto reindex", "update index" | incremental-reindex | semantic-search-indexer |
| "check index", "index status", "is it indexed" | status | semantic-search-indexer |
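The decision table above can be read as a simple phrase-to-agent lookup. The sketch below is one hypothetical way to express it; the `route` function, the ordering of rules, and the default of falling back to `search` are assumptions made for illustration, not part of the skill itself.

```python
# Each rule: (trigger phrases, operation, agent). More specific phrases come
# first so that "find similar" is not swallowed by the generic "find" rule.
ROUTES = [
    (("find similar", "similar chunks"), "find-similar", "semantic-search-reader"),
    (("incremental reindex", "auto reindex", "update index"),
     "incremental-reindex", "semantic-search-indexer"),
    (("check index", "index status", "is it indexed"), "status", "semantic-search-indexer"),
    (("what projects", "list indexed", "show projects"),
     "list-projects", "semantic-search-reader"),
    (("index this", "create index", "full reindex"), "index", "semantic-search-indexer"),
    (("find", "search for", "where is"), "search", "semantic-search-reader"),
]

def route(request):
    """Map a user request to (operation, agent) using the phrases in the table."""
    text = request.lower()
    for phrases, operation, agent in ROUTES:
        if any(phrase in text for phrase in phrases):
            return operation, agent
    # Assumption: an unmatched request is treated as a plain search.
    return "search", "semantic-search-reader"

print(route("Where is the retry logic?"))
print(route("Please do a full reindex of this repo"))
```

In practice the orchestrating model makes this decision from context rather than by substring matching; the sketch only shows how the rows of the table relate to each other.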
Agent Spawn Examples
Example 1: Search Operation (semantic-search-reader)
Task(
subagent_type="semantic-search-reader",
description="Search project semantically",
prompt="""You are the semantic-search-reader agent.
Operation: search
Query: "user authentication logic"
K: 10
Project: /path/to/project
Execute the search operation using scripts/search and return interpreted results with explanations."""
)
Example 2: Index Operation (semantic-search-indexer)
Task(
subagent_type="semantic-search-indexer",
description="Index project for semantic search",
prompt="""You are the semantic-search-indexer agent.
Operation: index
Directory: /path/to/project
Full: true
Execute the indexing operation using scripts/incremental-reindex and return interpreted results with statistics."""
)
Example 3: Incremental Reindex Operation (semantic-search-indexer)
Task(
subagent_type="semantic-search-indexer",
description="Incremental reindex with change detection",
prompt="""You are the semantic-search-indexer agent.
Operation: incremental-reindex
Directory: /path/to/project
Max Age: 360 # minutes (6 hours)
Execute smart auto-reindexing using scripts/incremental-reindex.
This detects changed files using a Merkle tree and automatically falls back to a full reindex when needed.
Return statistics showing total files indexed and total chunks."""
)
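The change detection mentioned in Example 3 can be sketched as follows. This is a simplified, single-level illustration of the Merkle idea (hash every file, combine the hashes into one root, and compare roots before looking at individual files). The function names are hypothetical and claude-context-local's actual merkle module may be structured differently, for instance by hashing directories recursively.

```python
import hashlib

def file_hash(content: bytes) -> str:
    """Leaf hash: SHA-256 of one file's bytes."""
    return hashlib.sha256(content).hexdigest()

def snapshot(files: dict) -> dict:
    """Map each path to its leaf hash."""
    return {path: file_hash(data) for path, data in files.items()}

def root_hash(leaves: dict) -> str:
    """Combine all leaf hashes (in sorted path order) into a single root hash."""
    digest = hashlib.sha256()
    for path in sorted(leaves):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(leaves[path].encode("ascii"))
    return digest.hexdigest()

def changed_files(old: dict, new: dict) -> dict:
    """Compare two snapshots; equal roots mean nothing changed."""
    if root_hash(old) == root_hash(new):
        return {"added": [], "modified": [], "removed": []}
    return {
        "added": sorted(p for p in new if p not in old),
        "modified": sorted(p for p in new if p in old and old[p] != new[p]),
        "removed": sorted(p for p in old if p not in new),
    }

before = snapshot({"src/auth.py": b"def login(): ...", "README.md": b"# Docs"})
after = snapshot({"src/auth.py": b"def login(user): ...", "src/db.py": b"def connect(): ..."})
print(changed_files(before, after))
```

Comparing the root first means an unchanged project costs one hash comparison, which is why this kind of check is cheap enough to run before every reindex decision.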
Example 4: Find Similar (semantic-search-reader)
Task(
subagent_type="semantic-search-reader",
description="Find similar content chunks",
prompt="""You are the semantic-search-reader agent.
Operation: find-similar
Chunk ID: "src/auth.py:45-67:function:authenticate"
K: 5
Project: /path/to/project
Execute the find-similar operation using scripts/find-similar and return interpreted results."""
)
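The chunk ID in Example 4 appears to follow a `path:start-end:kind:name` layout. Assuming that layout holds (it is inferred from the single example above, not from the library's documentation), a hypothetical parser could look like this:

```python
from typing import NamedTuple

class ChunkId(NamedTuple):
    path: str
    start_line: int
    end_line: int
    kind: str
    name: str

def parse_chunk_id(chunk_id: str) -> ChunkId:
    """Split an ID of the assumed form 'path:start-end:kind:name'."""
    # Split from the right so a path that itself contains ':' stays intact.
    path, line_range, kind, name = chunk_id.rsplit(":", 3)
    start, end = line_range.split("-")
    return ChunkId(path, int(start), int(end), kind, name)

parsed = parse_chunk_id("src/auth.py:45-67:function:authenticate")
print(parsed.path, parsed.start_line, parsed.end_line, parsed.kind, parsed.name)
```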
Example 5: Status Check (semantic-search-indexer)
Task(
subagent_type="semantic-search-indexer",
description="Check semantic index status",
prompt="""You are the semantic-search-indexer agent.
Operation: status
Project: /path/to/project
Execute the status operation using scripts/status and return interpreted results with statistics."""
)
Important Notes
- NEVER run bash scripts directly - always spawn the appropriate agent
- Agents handle error interpretation - they convert JSON errors to natural language
- Token optimization: Agent execution happens in a separate context (saves YOUR tokens)
- Wait for agent completion - agents return summarized results, not raw JSON
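As an illustration of the error interpretation the agents perform, the sketch below turns a script's JSON output into a one-line summary. The JSON keys used here (`error`, `results`) are hypothetical; the real scripts may emit a different schema.

```python
import json

def summarize_result(raw: str) -> str:
    """Turn raw script output into a short natural-language summary."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return "The script returned output that is not valid JSON."
    if not isinstance(payload, dict):
        return "The script returned JSON in an unexpected shape."
    if "error" in payload:  # hypothetical key
        return f"The operation failed: {payload['error']}"
    results = payload.get("results", [])  # hypothetical key
    return f"The operation succeeded with {len(results)} result(s)."

print(summarize_result('{"error": "index not found for /path/to/project"}'))
print(summarize_result('{"results": [{"file": "src/auth.py"}]}'))
```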
🎯 When to Use This Skill
✅ Use Semantic Search When:
1. Exploring Unfamiliar Projects
- "How does this codebase handle user authentication?"
- "Where is database connection pooling implemented?"
- "Show me error handling patterns in this project"
- "Find documentation about the architecture"
2. Finding Functionality Without Keywords
- Looking for an implementation without knowing the exact function names
- Finding code that "does X" without knowing what it is called
- Searching across multiple languages/frameworks with different conventions
3. Discovering Similar Code
- "Find code similar to this payment processing logic"
- "Are there other implementations of rate limiting?"
- "What other modules use this pattern?"
4. Cross-Reference Discovery
- Finding all authentication methods in a polyglot codebase
- Locating retry logic across different services
- Identifying validation patterns in various modules
5. Searching Documentation & Configuration
- "Find documentation explaining the deployment process"
- "Locate configuration examples for database connections"
- "Search for troubleshooting guides or setup instructions"
- "Find ADRs (Architecture Decision Records) about API design"
- "Locate markdown files about testing strategies"
6. Cross-Format Content Discovery
- "Find all references to environment variables (across code, docs, configs)"
- "Search for rate limiting mentions in any format"
- "Locate authentication documentation and implementation together"
- "Find deployment guides and deployment scripts"
❌ Do NOT Use Semantic Search When:
Use Grep instead for:
- Exact string matching: "import React"
- Known variable/function names: "getUserById"
- Regex patterns: "function.*export"
- File content search with known keywords
Use Glob instead for:
- Finding files by name pattern: "**/*.test.js"
- Locating configuration files: "**/config.yml"
- File system navigation: "src/components/**/*.tsx"
Use Read instead for:
- Reading specific known files
- Examining file contents after Grep/Glob narrowed results
- Sequential file analysis
📋 Prerequisites
Required: Python Library Dependency
IMPORTANT: This is NOT an MCP server - it's a Python library dependency. No server process runs. Our scripts import Python modules via PYTHONPATH.
This skill requires the claude-context-local Python library for semantic indexing:
# Clone Python library to standard location (5 minutes)
git clone https://github.com/FarhanAliRaza/claude-context-local.git ~/.local/share/claude-context-local
# Set up Python virtual environment and install dependencies
cd ~/.local/share/claude-context-local
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e .
What this installs:
- Merkle tree change detection (80KB)
- Multi-language code chunking (192KB) - supports 15+ languages
- Embedding generation (76KB) - wraps sentence-transformers
- Dependencies: faiss-cpu, sentence-transformers, tree-sitter
Installation location:
- macOS/Linux: ~/.local/share/claude-context-local
- Windows: %LOCALAPPDATA%\claude-context-local
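The sketch below shows how a script might resolve the install location and the library's venv interpreter on each platform. The helper names are hypothetical. The `.venv/bin/python` and `.venv\Scripts\python.exe` paths follow the standard layout created by `python3 -m venv`, which is assumed here to match the setup steps above.

```python
from pathlib import PurePosixPath, PureWindowsPath

LIBRARY_DIR = "claude-context-local"

def library_root(platform: str, home: str = "", local_app_data: str = ""):
    """Return the install directory listed above for the given platform."""
    if platform.startswith("win"):
        return PureWindowsPath(local_app_data) / LIBRARY_DIR
    return PurePosixPath(home) / ".local" / "share" / LIBRARY_DIR

def venv_python(root, platform: str):
    """Return the interpreter inside the library's .venv (standard venv layout)."""
    if platform.startswith("win"):
        return root / ".venv" / "Scripts" / "python.exe"
    return root / ".venv" / "bin" / "python"

# Example values; a real script would read sys.platform and the environment.
root = library_root("linux", home="/home/alice")
print(venv_python(root, "linux"))
```

A caller would typically pass `sys.platform`, `os.path.expanduser("~")`, and `os.environ.get("LOCALAPPDATA", "")` instead of the literal values used in the example.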
License: claude-context-local is GPL-3.0. We import via PYTHONPATH (dynamic linking), which preserves our Apache 2.0 license. See docs/architecture/MCP-DEPENDENCY-STRATEGY.md for details.
Index Creation
This skill provides an index script that creates and updates the semantic content index. The index is stored in `~/.cla