wiki-ingest: Source Ingestion
Read the source. Write the wiki. Cross-reference everything. A single source typically touches 8-15 wiki pages.
Syntax standard: Write all Obsidian Markdown using proper Obsidian Flavored Markdown. Wikilinks as [[Note Name]], callouts as > [!type] Title, embeds as ![[file]], properties as YAML frontmatter. If the kepano/obsidian-skills plugin is installed, prefer its canonical obsidian-markdown skill for Obsidian syntax reference. Otherwise, follow the guidance in this skill.
Transport (v1.7+)
Before mutating any vault file, consult .vault-meta/transport.json (auto-created by bash scripts/detect-transport.sh). Use the preferred transport per the fallback chain:
- cli: obsidian-cli write "$VAULT" "$NOTE" < content.md (or append, property:set); see skills/wiki-cli/SKILL.md
- mcp-obsidian / mcpvault: mcp__obsidian-vault__write_note and friends; see skills/wiki/references/mcp-setup.md
- filesystem: Claude's Write/Edit tools with absolute vault-rooted paths (final floor; always works)
Full decision tree: wiki/references/transport-fallback.md.
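The fallback order can be sketched as a small selector. This is an illustration only: pick_transport is a hypothetical helper, the transport names passed to it are assumed labels, and the real layout of .vault-meta/transport.json is not shown in this skill, so the sketch takes the available transports as arguments instead of parsing that file.

```shell
# Hypothetical helper (not part of the plugin): print the first transport in
# the fallback chain that appears among the arguments.
# Assumed labels: cli, mcp, filesystem.
pick_transport() {
  local candidate available
  for candidate in cli mcp filesystem; do
    for available in "$@"; do
      if [ "$candidate" = "$available" ]; then
        printf '%s\n' "$candidate"
        return 0
      fi
    done
  done
  # Filesystem is the final floor, so fall back to it when nothing else matched.
  printf 'filesystem\n'
}
```

For example, pick_transport mcp filesystem prints mcp, and calling it with no arguments prints filesystem.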
Mode awareness (v1.8+)
Before creating any new wiki page, consult the vault's methodology mode via python3 scripts/wiki-mode.py route <type> "<name>". The router returns the vault-relative path where the page should be filed.
SRC_PATH=$(python3 scripts/wiki-mode.py route source "Karpathy 2025 LLM Wiki essay")
# generic: wiki/sources/Karpathy-2025-LLM-Wiki-essay.md
# lyt: wiki/notes/Karpathy-2025-LLM-Wiki-essay.md (also update relevant MOC)
# para: wiki/resources/incoming/Karpathy-2025-LLM-Wiki-essay.md
# zettelkasten: wiki/20260517123456-Karpathy-2025-LLM-Wiki-essay.md
ENT_PATH=$(python3 scripts/wiki-mode.py route entity "Andrej Karpathy")
CON_PATH=$(python3 scripts/wiki-mode.py route concept "Compounding Vault Pattern")
If .vault-meta/mode.json is absent, the router returns mode=generic paths (identical to v1.7 behavior). No special-casing needed in this skill.
Mode-specific follow-up:
- LYT: after filing the atomic note, update the relevant MOC (wiki/mocs/<topic>-moc.md) to link the new note. If no MOC exists for the topic, create one using skills/wiki-mode/templates/lyt/moc-template.md.
- Zettelkasten: filename already includes the timestamp ID. Populate the id: frontmatter field to match.
- PARA: new ingests land in wiki/resources/incoming/ by default. Do NOT auto-guess the topic; leave in incoming/ for user review.
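For the Zettelkasten follow-up, the id: value can be read back from the routed filename. A minimal sketch, assuming the ID is the 14-digit YYYYMMDDHHMMSS prefix seen in the router example above; zettel_id_from_path is a hypothetical helper, not something the plugin ships.

```shell
# Hypothetical helper: extract the timestamp ID from a routed Zettelkasten path.
# Assumption: the ID is the leading 14 digits before the first hyphen.
zettel_id_from_path() {
  local name id
  name="$(basename "$1" .md)"
  id="${name%%-*}"            # keep everything before the first hyphen
  case "$id" in
    ''|*[!0-9]*) return 1 ;;  # empty or non-numeric prefix: not a Zettel ID
  esac
  if [ "${#id}" -ne 14 ]; then
    return 1
  fi
  printf '%s\n' "$id"
}
```

Called on wiki/20260517123456-Karpathy-2025-LLM-Wiki-essay.md, this prints 20260517123456; on a path without a numeric prefix it returns non-zero and prints nothing.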
Concurrency (v1.7+)
Multi-writer is safe in v1.7. The latent corruption bug from v1.6 — where two parallel sub-agents writing to the same page could silently trample each other — is closed by per-file advisory locking. Every wiki page write MUST be preceded by wiki-lock acquire <path>.
# Acquire: returns 75 (EX_TEMPFAIL) if another writer holds the lock
if bash scripts/wiki-lock.sh acquire wiki/concepts/Foo.md; then
  # ... do the write via the §Transport-selected method ...
  bash scripts/wiki-lock.sh release wiki/concepts/Foo.md
else
  # rc=75: another writer is in flight. Retry once after 2s; if still held,
  # log to wiki/log.md and skip this page rather than overwrite.
  sleep 2
  bash scripts/wiki-lock.sh acquire wiki/concepts/Foo.md && {
    # write …
    bash scripts/wiki-lock.sh release wiki/concepts/Foo.md
  } || echo "skipped wiki/concepts/Foo.md (locked); logged to wiki/log.md"
fi
Properties:
- Per-file granularity. Locks key on sha1(<vault-relative-path>); concurrent writes to DIFFERENT pages run in parallel.
- Age-based staleness. Default STALE_AFTER_SEC=60. A crashed holder unblocks in ≤60 seconds without manual intervention. See scripts/wiki-lock.sh header for the full semantics.
- Cross-process release. Release is rm -f (no PID match required). Skill authors are trusted to release locks they acquire; cross-skill release is allowed by design (a janitor running wiki-lock clear-stale --max-age 0 is the canonical recovery path).
- The PostToolUse hook now defers git add if any locks are currently held, so the auto-commit doesn't fire mid-ingest and produce torn commits. See hooks/hooks.json.
wiki-lock is unconditional in v1.7+ — there is no feature gate, no fallback. Skills that don't acquire locks are racing against any other writer. The script is in core, not opt-in.
Sub-agent rule from v1.6 — "Sub-agents MUST NOT call scripts/allocate-address.sh" — is preserved (orchestrator still backfills addresses to keep the counter monotonic). The NEW rule is: sub-agents MAY now write pages, but MUST acquire locks first. See agents/wiki-ingest.md.
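The acquire, retry once, write, release pattern can be wrapped in a single function so each page write stays a one-liner. This is a sketch under assumptions: with_wiki_lock, LOCK_CMD, and RETRY_DELAY_SEC are hypothetical names introduced here; only the acquire and release subcommands and the rc=75 convention come from this skill.

```shell
# Hypothetical wrapper: run a write command while holding the per-file lock.
# LOCK_CMD and RETRY_DELAY_SEC are assumptions that make the sketch testable.
LOCK_CMD="${LOCK_CMD:-bash scripts/wiki-lock.sh}"
RETRY_DELAY_SEC="${RETRY_DELAY_SEC:-2}"

with_wiki_lock() {
  local page="$1"
  shift
  # $LOCK_CMD is intentionally unquoted so "bash scripts/wiki-lock.sh" splits
  # into a command and its argument.
  if ! $LOCK_CMD acquire "$page"; then
    sleep "$RETRY_DELAY_SEC"            # retry once, as described above
    if ! $LOCK_CMD acquire "$page"; then
      echo "skipped $page (locked)" >&2 # caller should also note this in wiki/log.md
      return 75
    fi
  fi
  local rc=0
  "$@" || rc=$?                          # run the write, remember its exit code
  $LOCK_CMD release "$page"              # release even if the write failed
  return "$rc"
}
```

A call such as with_wiki_lock wiki/concepts/Foo.md cp draft.md wiki/concepts/Foo.md (draft.md being a placeholder) runs the copy only while the lock is held and returns 75 when the page stays locked after the retry.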
Delta Tracking
Before ingesting any file, check .raw/.manifest.json to avoid re-processing unchanged sources.
# Check if manifest exists
[ -f .raw/.manifest.json ] && echo "exists" || echo "no manifest yet"
Manifest format (create if missing):
{
  "sources": {
    ".raw/articles/article-slug-2026-04-08.md": {
      "hash": "abc123",
      "ingested_at": "2026-04-08",
      "pages_created": ["wiki/sources/article-slug.md", "wiki/entities/Person.md"],
      "pages_updated": ["wiki/index.md"]
    }
  }
}
Before ingesting a file:
- Compute a hash: md5sum [file] | cut -d' ' -f1 (or sha256sum on Linux).
- Check if the path exists in .manifest.json with the same hash.
- If hash matches, skip. Report: "Already ingested (unchanged). Use force to re-ingest."
- If missing or hash differs, proceed with ingest.
After ingesting a file:
- Record {hash, ingested_at, pages_created, pages_updated} in .manifest.json.
- Write the updated manifest back.
Skip delta checking if the user says "force ingest" or "re-ingest".
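The skip-or-ingest decision above reduces to one hash comparison plus the force override. A minimal sketch, assuming the stored hash has already been read from .raw/.manifest.json by the caller (for instance with a JSON tool of your choice); file_hash and needs_ingest are hypothetical helper names.

```shell
# Hypothetical helpers for the delta check.
file_hash() {
  md5sum "$1" | cut -d' ' -f1
}

# Returns 0 when the file should be ingested, 1 when it can be skipped.
# $1 = source file, $2 = hash stored in the manifest (empty if the path is
# missing from it), $3 = "force" to bypass the check.
needs_ingest() {
  local file="$1" stored_hash="$2" force="${3:-}"
  if [ "$force" = "force" ]; then
    return 0
  fi
  if [ -z "$stored_hash" ]; then
    return 0
  fi
  [ "$(file_hash "$file")" != "$stored_hash" ]
}
```

A caller might write: if needs_ingest "$src" "$stored"; then ... ingest ...; else echo "Already ingested (unchanged). Use force to re-ingest."; fi.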
URL Ingestion
Trigger: user passes a URL starting with https://.
Steps:
- Fetch the page using WebFetch.
- Clean (optional): if defuddle is available (which defuddle 2>/dev/null), run defuddle [url] to strip ads, nav, and clutter. Typically saves 40-60% tokens. Fall back to raw WebFetch output if not installed.
- Derive slug from the URL path (last segment, lowercased, spaces→hyphens, strip query strings).
- Save to .raw/articles/[slug]-[YYYY-MM-DD].md with a frontmatter header:
  ---
  source_url: [url]
  fetched: [YYYY-MM-DD]
  ---
- Proceed with Single Source Ingest starting at step 2 (file is now in .raw/).
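The slug rule and the save path can be sketched in shell as follows. url_slug and raw_article_path are hypothetical helpers; dropping a #fragment and a trailing slash are assumptions beyond the stated rule, and percent-encoded spaces (%20) are not handled.

```shell
# Hypothetical helper: derive a slug from the last URL path segment.
url_slug() {
  local url="$1" seg
  url="${url%%\#*}"   # assumption: drop any #fragment
  url="${url%%\?*}"   # strip the query string
  url="${url%/}"      # assumption: ignore one trailing slash
  seg="${url##*/}"    # last path segment
  printf '%s\n' "$seg" | tr ' ' '-' | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: build the .raw/ path; the date defaults to today.
raw_article_path() {
  local slug fetched
  slug="$(url_slug "$1")"
  fetched="${2:-$(date +%F)}"
  printf '.raw/articles/%s-%s.md\n' "$slug" "$fetched"
}
```

With the placeholder URL https://example.com/blog/My-Post?utm=1 and the date 2026-04-08, raw_article_path prints .raw/articles/my-post-2026-04-08.md.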
Image / Vision Ingestion
Trigger: user passes an image file path (.png, .jpg, .jpeg, .gif, .webp, .svg, .avif).
Steps:
- Read the image file using the Read tool. Claude can process images natively.
- Describe the image contents: extract all text (OCR), identify key concepts, entities, diagrams, and data visible in the image.
- Save the description to .raw/images/[slug]-[YYYY-MM-DD].md:
  ---
  source_type: image
  original_file: [original path]
  fetched: YYYY-MM-DD
  ---
  # Image: [slug]
  [Full description of image contents, transcribed text, entities visible, etc.]
- Copy the image to _attachments/images/[slug].[ext] if it's not already in the vault.
- Proceed with Single Source Ingest on the saved description file.
Use cases: whiteboard photos, screenshots, diagrams, infographics, document scans.
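The URL and image triggers can be told apart with a single case statement. A sketch only: classify_source is a hypothetical helper, and matching extensions case-insensitively is an assumption (the trigger list above is lowercase).

```shell
# Hypothetical helper: decide which ingest path a user-supplied source takes.
# Prints one of: url, image, file.
classify_source() {
  local src="$1" lower
  # Assumption: compare case-insensitively so PHOTO.PNG counts as an image.
  lower="$(printf '%s' "$src" | tr '[:upper:]' '[:lower:]')"
  case "$lower" in
    https://*) echo url ;;
    *.png|*.jpg|*.jpeg|*.gif|*.webp|*.svg|*.avif) echo image ;;
    *) echo file ;;
  esac
}
```

Because the https:// pattern is tested first, a URL that happens to end in .png is still routed through URL Ingestion rather than Image / Vision Ingestion.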
Single Source Ingest
Trigger: user drops a file into .raw/ or pastes content.
Steps:
- Read the source completely. Do not skim.
- Discuss key takeaways with the user. Ask: "What should I emphasize? How granular?" Skip this if the user says "just ingest it."
- Create source summary in wiki/sources/. Use the source frontmatter schema from references/frontmatter.md. Assign an address per the Address Assignment section below.
- Create or update e