Paths: File paths (
references/,../ln-*) are relative to this skill directory.
Documentation Fact-Checker (L3 Worker)
Type: L3 Worker
Specialized worker that extracts verifiable claims from documentation and validates each against the actual codebase.
Purpose & Scope
- Prioritize canonical and high-claim docs, then extract verifiable claims from markdown documentation
- Verify each claim against codebase (Grep/Glob/Read/Bash)
- Detect cross-document contradictions (same fact stated differently)
- Includes
docs/reference/,docs/tasks/,tests/in scope - Single invocation (not per-document) -> cross-doc checks require global view
- Does NOT check scope alignment or structural quality
Inputs
MANDATORY READ: Load references/audit_worker_core_contract.md, references/docs_quality_contract.md, and references/markdown_read_protocol.md.
Optional rule catalog: load references/docs_quality_rules.json only when exact rule IDs, path matrices, or allowlisted placeholder exceptions are needed.
Tool policy: follow host AGENTS.md MCP preferences; load references/mcp_tool_preferences.md and references/mcp_integration_patterns.md only when host policy is absent or MCP behavior is unclear.
Receives contextStore with: tech_stack, project_root, output_dir.
Workflow
Phase 1: Parse Context
Extract tech stack, project root, output_dir from contextStore.
Phase 2: Discover Documents
Glob markdown docs in project. Exclude:
node_modules/,.git/,dist/,build/docs/project/.audit/(audit output, not project docs)CHANGELOG.md(historical by design)
If docs/project/.context/doc_registry.json exists:
- load it first
- prioritize
doc_role=canonical - prioritize files with dense claim types (paths, endpoints, versions, commands)
- de-prioritize navigation hubs unless contradictions point back to them
Phase 3: Extract Claims (Layer 1)
Detection policy: use two-layer detection (candidate scan, then context verification); load references/two_layer_detection.md only when the verification method is ambiguous.
For each prioritized document, use section-first reads to extract verifiable claims using Grep/regex patterns.
For code files referenced by docs, use outline() and discovery-first read_file() before built-in reads. Only request edit_ready=true, verbosity="full" if verification turns into a follow-up edit. Use hex-graph only when entity identity or reference resolution remains ambiguous after direct manifest/file checks.
MANDATORY READ: Load references/claim_extraction_rules.md for detailed extraction patterns per claim type.
9 claim types:
| # | Claim Type | What to Extract | Extraction Pattern |
|---|---|---|---|
| 1 | File paths | Paths to source files, dirs, configs | Backtick paths, link targets matching src/, lib/, app/, docs/, config/, tests/ |
| 2 | Versions | Package/tool/image versions | Semver patterns near dependency/package/image names |
| 3 | Counts/Statistics | Numeric claims about codebase | `\d+ (modules |
| 4 | API endpoints | HTTP method + path | `(GET |
| 5 | Config keys/env vars | Environment variables, config keys | [A-Z][A-Z_]{2,} in config context, process.env., os.environ |
| 6 | CLI commands | Shell commands | npm run, python, docker, make in backtick blocks |
| 7 | Function/class names | Code entity references | CamelCase/snake_case in backticks or code context |
| 8 | Line number refs | file:line patterns | [\w/.]+:\d+ patterns |
| 9 | Docker/infra claims | Image tags, ports, service names | Image names with tags, port mappings in docker context |
Output per claim: {doc_path, line, claim_type, claim_value, raw_context}.
Phase 4: Verify Claims (Layer 2)
For each extracted claim, verify against codebase:
| Claim Type | Verification Method | Finding Type |
|---|---|---|
| File paths | Glob or ls for existence | PATH_NOT_FOUND |
| Versions | Grep package files (package.json, requirements.txt, docker-compose.yml), compare | VERSION_MISMATCH |
| Counts | Glob/Grep to count actual entities, compare with claimed number | COUNT_MISMATCH |
| API endpoints | Grep route/controller definitions | ENDPOINT_NOT_FOUND |
| Config keys | Grep in source for actual usage | CONFIG_NOT_FOUND |
| CLI commands | Check package.json scripts, Makefile targets, binary existence | COMMAND_NOT_FOUND |
| Function/class | Grep in source for definition | ENTITY_NOT_FOUND |
| Line numbers | Read file at line, check content matches claimed context | LINE_MISMATCH |
| Docker/infra | Grep docker-compose.yml for image tags, ports | INFRA_MISMATCH |
False positive filtering (Layer 2 reasoning):
- Template placeholders (
{placeholder},YOUR_*,<project>,xxx) -> skip - Example/hypothetical paths (preceded by "e.g.", "for example", "such as") -> skip
- Future-tense claims ("will add", "planned", "TODO") -> skip or LOW
- Conditional claims ("if using X, configure Y") -> verify only if X detected in tech_stack
- External service paths (URLs, external repos) -> skip
- Paths in SCOPE/comment HTML blocks describing other projects -> skip
.env.examplevalues -> skip (expected to differ from actual)
Phase 5: Cross-Document Consistency
Compare extracted claims across documents to find contradictions:
| Check | Method | Finding Type |
|---|---|---|
| Same path, different locations | Group file path claims, check if all point to same real path | CROSS_DOC_PATH_CONFLICT |
| Same entity, different version | Group version claims by entity name, compare values | CROSS_DOC_VERSION_CONFLICT |
| Same metric, different count | Group count claims by subject, compare values | CROSS_DOC_COUNT_CONFLICT |
| Endpoint in spec but not in guide | Compare endpoint claims across api_spec.md vs guides/runbook | CROSS_DOC_ENDPOINT_GAP |
Algorithm:
claim_index = {} # key: normalized(claim_type + entity), value: [{doc, line, value}]
FOR claim IN all_verified_claims WHERE claim.verified == true:
key = normalize(claim.claim_type, claim.entity_name)
claim_index[key].append({doc: claim.doc_path, line: claim.line, value: claim.claim_value})
FOR key, entries IN claim_index:
unique_values = set(entry.value for entry in entries)
IF len(unique_values) > 1:
CREATE finding(type=CROSS_DOC_*_CONFLICT, severity=HIGH,
location=entries[0].doc + ":" + entries[0].line,
issue="'" + key + "' stated as '" + val1 + "' in " + doc1 + " but '" + val2 + "' in " + doc2)
Phase 6: Score & Report
MANDATORY READ: Load references/audit_scoring.md.
Calculate score using penalty formula. Write report.
Audit Categories (for Checks table)
| ID | Check | What It Covers |
|---|---|---|
path_claims | File/Directory Paths | All path references verified against filesystem |
version_claims | Version Numbers | Package, tool, image versions against manifests |
count_claims | Counts & Statistics | Numeric assertions against actual counts |
endpoint_claims | API Endpoints | Route definitions against controllers/routers |
config_claims | Config & Env Vars | Environment variables, config keys against source |
command_claims | CLI Commands | Scripts, commands against package.json/Makefile |
entity_claims | Code Entity Names | Functions, classes against source definitions |
line_ref_claims | Line Number References | file:line against actual file content |
cross_doc | Cross-Document Consistency | Same facts across documents agree |
Severity Mapping
| Issue Type | Severity | Rationale |
|---|---|---|
| PATH_NOT_FOUND (critical file: AGENTS.md, CLAUDE.md, runbook, api_spec) | CRITICAL | Setup/onboarding |