Vault Sync Skill
Project-instruction file resolution:
CLAUDE.md and AGENTS.md (Codex CLI) are transparent aliases; see skills/_shared/instruction-file-resolution.md. The vault-marker check below treats either file as a valid marker (when it carries ## Session Config + vault-sync:); references to CLAUDE.md resolve via the SSOT precedence rule.
Status
STATUS: PHASE 1 IMPLEMENTED (2026-04-13). Session-End hard gate (section 3.1) operational. Phase 2 (wave-executor incremental, 3.2) and Phase 3 (evolve advisory, 3.3) not yet implemented.
Implementation
Phase 1 ships a self-contained validator that reads every .md file under VAULT_DIR, parses YAML frontmatter, validates against the canonical vaultFrontmatterSchema, and flags dangling wiki-links as warnings.
Files
- validator.mjs -- Node.js ESM validator. Uses the zod and yaml npm packages. Reads VAULT_DIR (env or default cwd), walks the tree, skipping node_modules/, .git/, .obsidian/, 90-archive/. For each .md file: parses frontmatter, validates against the inline Zod schema, extracts [[wiki-links]], verifies each target resolves. Emits a JSON report on stdout.
- validator.sh -- Thin POSIX wrapper. Resolves VAULT_DIR from arg 1 or env, self-bootstraps deps via pnpm install --silent on first run, execs the Node validator. Session-end and other callers use this entry point.
- package.json -- Declares zod (^3.24.0, matching projects-baseline) and yaml (^2.5.0) as deps. pnpm-lock.yaml is committed; node_modules/ is gitignored.
- tests/validator.bats -- 16 BATS cases covering clean vaults, broken frontmatter, missing required fields, dangling links, no-vault skipping, README-style files, nested directories, and archive/obsidian exclusion.
- tests/fixtures/ -- Seven fixture vaults matching each test scenario.
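The wiki-link step can be pictured with a small dependency-free sketch. It is not the code in validator.mjs: the function names are hypothetical, and it assumes that [[target|alias]] and [[target#heading]] both point at the note named target, and that a link resolves when a note with that name is known.

```javascript
// Sketch only: validator.mjs may extract and resolve links differently.
// Assumption: "[[target|alias]]" and "[[target#heading]]" both refer to "target".
function extractWikiLinks(markdown) {
  const links = [];
  const pattern = /\[\[([^\[\]]+)\]\]/g;
  for (const match of markdown.matchAll(pattern)) {
    // Drop the alias ("|...") and the heading anchor ("#...") if present.
    const target = match[1].split("|")[0].split("#")[0].trim();
    if (target) links.push(target);
  }
  return links;
}

// Assumption: a link resolves when a note with that name exists in the vault.
function findDanglingLinks(markdown, knownNoteNames) {
  const known = new Set(knownNoteNames);
  return extractWikiLinks(markdown).filter((target) => !known.has(target));
}
```

Each entry returned by findDanglingLinks would correspond to one dangling-wiki-link warning in the report, not to an error.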
Schema source
The inline Zod schema is vendored from the canonical source at projects-baseline/packages/zod-schemas/src/vault-frontmatter.ts. The skill is intentionally self-contained (no monorepo workspace dependency), so the schema is duplicated with a header comment pointing at the SSOT. Drift is to be caught by a future smoke test that imports the canonical schema and diffs the shape — NOT YET IMPLEMENTED. Until that test exists, any change to the canonical schema must be mirrored here in the same commit.
How session-end invokes it
VAULT_DIR=/path/to/vault bash ~/Projects/session-orchestrator/skills/vault-sync/validator.sh
- Exit 0 -- vault valid (or skipped because no vault exists / no .md files). Warnings may still be present in the JSON report.
- Exit 1 -- one or more validation errors. Session-end surfaces them in the quality gate report and refuses to close.
- Exit 2 -- invalid invocation or infrastructure error. Two cases: (a) VAULT_DIR is not set and cwd does not look like a Meta-Vault (no _meta/, no .obsidian/, no CLAUDE.md or AGENTS.md with ## Session Config + vault-sync: block), in which case an actionable error is printed to stderr; (b) infrastructure error (missing node, missing validator.mjs, cannot bootstrap deps). In both cases no JSON is emitted to stdout.
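A caller could map these exit codes onto a gate decision roughly as follows. This is a sketch, not the session-end implementation: describeExit and the gate labels pass, block, and error are hypothetical names chosen for illustration.

```javascript
// Sketch only: describeExit is a hypothetical helper, not part of the skill.
function describeExit(exitCode, stdout) {
  if (exitCode !== 0 && exitCode !== 1) {
    // Exit 2 (or anything unexpected): no JSON on stdout, so do not parse it.
    return { gate: "error", report: null };
  }
  const report = JSON.parse(stdout);
  return {
    gate: exitCode === 0 ? "pass" : "block",
    report,
  };
}
```

The point of the early return is that stdout is only parsed for exit codes 0 and 1, since exit 2 emits no JSON.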
JSON output shape (stdout):
{
"status": "ok|invalid|skipped",
"vault_dir": "...",
"files_checked": N,
"files_skipped_no_frontmatter": N,
"errors": [{"file": "...", "path": "frontmatter.id", "message": "..."}],
"warnings": [{"file": "...", "type": "dangling-wiki-link", "message": "..."}]
}
The opt-in --check-expires flag reports expired notes as warnings; it is off by default (Phase 1 leaves freshness for the Phase 3 evolve advisory).
CLI Flags
The validator (both validator.mjs and the validator.sh wrapper) accepts:
- --mode <hard|warn|off> -- gate severity. hard (default) exits 1 on any frontmatter/schema error. warn exits 0 but still populates the errors array in the JSON output so callers can surface them as warnings. off short-circuits to status: "skipped-mode-off", which is useful during onboarding when the gate is enabled but the vault is not yet clean.
- --exclude <glob> -- repeatable. Glob patterns (relative to VAULT_DIR, POSIX-style forward slashes) matching files to skip. Supports ** (any number of segments), * (any chars except /), and ? (single char except /). Excluded files are counted in excluded_count and contribute nothing to errors/warnings. Example: --exclude "**/_MOC.md" --exclude "**/README.md".
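The glob rules for --exclude can be illustrated by translating a pattern into a regular expression. This is a sketch with hypothetical helper names, not the matcher in validator.mjs. It assumes that a leading or embedded **/ also matches zero directories, so that **/_MOC.md matches a root-level _MOC.md.

```javascript
// Sketch only: the real matcher may differ. Assumption: "**/" matches zero or
// more whole path segments, and a bare "**" matches anything including "/".
function globToRegExp(glob) {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    const ch = glob[i];
    if (glob.startsWith("**/", i)) {
      source += "(?:[^/]+/)*";
      i += 3;
    } else if (glob.startsWith("**", i)) {
      source += ".*";
      i += 2;
    } else if (ch === "*") {
      source += "[^/]*"; // any chars except "/"
      i += 1;
    } else if (ch === "?") {
      source += "[^/]"; // exactly one char except "/"
      i += 1;
    } else {
      // Escape regex metacharacters so they match literally.
      source += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

// relativePath is relative to VAULT_DIR and uses forward slashes.
function isExcluded(relativePath, globs) {
  return globs.some((glob) => globToRegExp(glob).test(relativePath));
}
```

Under these assumptions, isExcluded("10-notes/sub/_MOC.md", ["**/_MOC.md"]) is true, while isExcluded("10-notes/README.md", ["*.md"]) is false because * does not cross a / boundary.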
Bare-invocation config loading (#329)
On every invocation — including bare validator.sh runs with no --exclude flags — the validator unconditionally reads vault-sync.exclude from <VAULT_DIR>/CLAUDE.md (or AGENTS.md on Codex CLI repos) and seeds the exclusion list with those globs first. CLI --exclude flags are additive: they extend the config-loaded list rather than replacing it. If CLAUDE.md is missing, unparseable, or has no vault-sync.exclude: block, the validator silently falls back to an empty config list and proceeds with CLI-only excludes. Required env: VAULT_DIR (resolved against cwd if absent).
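The additive merge and the silent fallback described above can be sketched as follows. Here loadConfigExcludes is a hypothetical stand-in for whatever reads vault-sync.exclude from the instruction file; how that block is parsed is deliberately left out.

```javascript
// Sketch only: loadConfigExcludes is a hypothetical callback that returns the
// globs from vault-sync.exclude, or throws / returns nothing when unavailable.
function resolveExcludes(loadConfigExcludes, cliExcludes) {
  let fromConfig = [];
  try {
    const loaded = loadConfigExcludes();
    if (Array.isArray(loaded)) fromConfig = loaded;
  } catch {
    // Missing or unparseable instruction file: fall back to an empty list.
  }
  // Config globs seed the list; CLI globs extend it and never replace it.
  return [...fromConfig, ...cliExcludes];
}
```

With this shape, a bare invocation (no CLI excludes) still gets the config-loaded globs, and a repository without a config block behaves as if only the CLI flags were given.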
- --check-expires -- flags expired notes (an expires: date in the past) as warnings. Default off.
Environment variables:
- VAULT_DIR -- directory to scan. Defaults to $PWD. Can also be passed as the first positional argument to validator.sh.
Example invocation:
VAULT_DIR=~/Projects/vault bash validator.sh \
--mode warn \
--exclude "**/_MOC.md" \
--exclude "**/_overview.md" \
--exclude "**/README.md"
The JSON output always includes the mode and excluded_count fields when the validator runs past the mode-off / no-vault short-circuits.
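The effect of --mode on the exit code can be summarized in a few lines. This is a sketch with a hypothetical helper name. It assumes that off exits 0 like the other skipped cases, and it throws on an unknown mode because this section does not say how the validator handles one.

```javascript
// Sketch only: exitCodeFor is a hypothetical helper, not taken from validator.mjs.
function exitCodeFor(mode, errorCount) {
  switch (mode) {
    case "off":
      return 0; // assumption: "skipped-mode-off" exits 0 like other skips
    case "warn":
      return 0; // errors stay in the JSON report but do not fail the gate
    case "hard":
      return errorCount > 0 ? 1 : 0;
    default:
      // Assumption: unknown modes are rejected; the real behavior is not documented here.
      throw new RangeError("unknown mode: " + mode);
  }
}
```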
Purpose
A "project vault" is a markdown-based knowledge base living under vault/ at the project root. Each file carries strict YAML frontmatter (id, title, tags, status, created, expires, sources) and uses wiki-style links to cross-reference peer notes. The vault is consumed by two audiences: humans browsing the knowledge base, and Sophie-style RAG agents that embed and retrieve notes during chat. Because both audiences depend on the same content, drift is expensive: a stale status: verified note with a dead source URL quietly poisons retrieval results, and a broken wiki-link breaks both navigation and graph traversal.
Automated validation is therefore mandatory, not optional. The vault needs four kinds of checks: frontmatter schema conformance, wiki-link integrity, source whitelist enforcement (especially for regulated content like austrian-law that must cite only approved government URLs), and freshness (expires date in the past). Session-orchestrator is the right home for the in-session layer because every project with a vault will eventually want this, and session lifecycle hooks (wave boundaries, session end, evolve) are exactly the points where drift becomes visible.
The reference architecture is 3 layers:
- Layer A: local git hooks (pre-commit, pre-push) -- IMPLEMENTED in reference project. Fast, fail-early, blocks bad commits.
- Layer B: session-orchestrator:vault-sync skill (THIS SPEC) -- PHASE 1 IMPLEMENTED (session-end hard gate, 3.1); Phases 2 and 3 PENDING. Continuous freshness inside normal session flow.
- Layer C: remote CI job -- IMPLEMENTED in reference project's .gitlab-ci.yml. Final gate, catches anything the other two miss.
Layer B is the continuous freshness layer. Its job is to run inside normal session flow without requiring developers to remember to validate. If Layers A and C are the bookends, Layer B is the spine.
Invocation Points
3.1 Session-End Hard Gate
- Trigger: called by the session-orchestrator:session-end skill as part of Phase 1 (quality gates), alongside typecheck / lint / test.
- Behavior: full validation run over the entire vault. No incremental mode here -- a clean session close must prove the whole vault is valid.
- Error handling: validation errors block the session close. The session-end skill surfaces them in the quality g