Wiki Enrich: Fill Paper TODO Sections (Karpathy LLM-Wiki)

Target: $ARGUMENTS

Why this skill exists

ingest_paper (called by /research-lit, /arxiv, /alphaxiv, /deepxiv, /semantic-scholar, /exa-search) only renders the per-paper scaffold: frontmatter, abstract, and 10 fillable _TODO._ placeholder sections, plus two protected sections (## Connections is a graph summary, and ## Abstract (original) is auto-populated when --arxiv-id is given). No downstream skill in ARIS fills those 10 sections, so every page stays a TODO stub until someone reads the paper and fills it in by hand.

This contradicts the Karpathy LLM-wiki design (https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f):

"You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. … The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. … LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass."

/wiki-enrich is the missing back half of ingest_paper: it reads each scaffolded paper page, fetches paper content from external sources via a graceful fallback chain (see Step 2.3 for the full 5-source chain), and rewrites each of the 10 fillable TODO sections as a prose summary of one to three sentences.

Constants

WIKI_ROOT = research-wiki/ — Resolved relative to git root. Skill hard-fails if not a directory.
TARGET_DEFAULT = missing — When no target is given, enrich only papers with ≥1 TODO section. Other targets: <slug> (one paper) or all (every paper, even ones already enriched — usually combined with --force to overwrite).
SOURCE_DEFAULT = auto — Fetch order: alphaxiv overview → alphaxiv abs → deepxiv brief → arXiv API abstract → page abstract fallback. First non-empty wins (full chain documented in Phase 2.3 table). Override with --source to pin one source.
MAX_PAPERS = 20 — Hard cap per invocation; LLMs touch many files but token budgets are real. Override with --max N.
FORCE = false — When false (default), skip sections that already have non-TODO content. When true, overwrite every fillable section, but never touch the two protected sections: ## Connections (auto-generated from edges.jsonl) and ## Abstract (original) (immutable arXiv-fetched source data).
SECTIONS_TO_FILL — 10 fillable sections + 2 protected. ingest_paper (research_wiki.py:436-473) scaffolds 11 section headers unconditionally and a 12th — ## Abstract (original) — only when arXiv returns an abstract for the given --arxiv-id (research_wiki.py:469-473). Of these, 10 carry a _TODO._ (or _TODO: fill in after reading._) marker and need filling. The other 2 — ## Connections (position 10 in the enumeration below) and ## Abstract (original) (position 12, conditional) — are protected by construction: Connections is auto-generated from graph/edges.jsonl, Abstract (original) is immutable source data from the arXiv API. This skill writes to the 10, never the 2.
1. One-line thesis (marker: _TODO: fill in after reading._)
2. Problem / Gap (marker: _TODO._)
3. Method (marker: _TODO._)
4. Key Results (marker: _TODO._)
5. Assumptions (marker: _TODO._)
6. Limitations / Failure Modes (marker: _TODO._)
7. Reusable Ingredients (marker: _TODO._)
8. Open Questions (marker: _TODO._)
9. Claims (marker: _TODO._) — fill with _No claims tracked yet._ if no claim: edges point to this paper; otherwise list them.
10. Connections — NEVER edit (auto-generated from graph/edges.jsonl).
11. Relevance to This Project (marker: _TODO._) — use RESEARCH_BRIEF.md, AGENTS.md (or legacy CLAUDE.md), or gap_map.md for project context. If no project context exists, leave as TODO and report it.
12. Abstract (original) — leave alone (already populated by ingest_paper when --arxiv-id was used).

💡 Examples:

/wiki-enrich — enrich every paper with ≥1 TODO section (most common usage)

/wiki-enrich vllm — enrich a single paper by slug

/wiki-enrich all --force — rewrite every paper from scratch (use when you've adopted a new style)

/wiki-enrich --source alphaxiv --max 5 — only use alphaxiv, only do 5 papers

/wiki-enrich missing --max 50 — bigger batch (watch token budget)

Pre-flight

Resolve $WIKI_ROOT and $WIKI_SCRIPT (canonical chain — see shared-references/wiki-helper-resolution.md):

cd "$(git rev-parse --show-toplevel 2>/dev/null || pwd)" || exit 1
[ -d research-wiki/ ] || { echo "ERROR: research-wiki/ not found. Run /research-wiki init first." >&2; exit 1; }

ARIS_REPO="${ARIS_REPO:-$(awk -F'\t' '$1=="repo_root"{print $2; exit}' .aris/installed-skills.txt 2>/dev/null)}"
WIKI_SCRIPT=".aris/tools/research_wiki.py"
[ -f "$WIKI_SCRIPT" ] || WIKI_SCRIPT="tools/research_wiki.py"
[ -f "$WIKI_SCRIPT" ] || { [ -n "${ARIS_REPO:-}" ] && WIKI_SCRIPT="$ARIS_REPO/tools/research_wiki.py"; }
[ -f "$WIKI_SCRIPT" ] || { echo "ERROR: research_wiki.py not found." >&2; exit 1; }

If either fails, hard-fail — this skill manipulates wiki state and must not run blind.

Workflow

Phase 1: Parse target + discover candidates

Parse $ARGUMENTS for the first positional (target) and flags (--source, --force, --max).
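
The parsing step above could be sketched as follows. This is an illustration under assumptions, not the skill's actual parser: the helper name parse_args and the variable names SOURCE and FORCE are hypothetical, while TARGET, MAX_PAPERS, the flag names, and the defaults come from this document.

```bash
#!/usr/bin/env bash
# Sketch only: parse_args is a hypothetical helper, not part of research_wiki.py.
parse_args() {
  TARGET=""          # first positional: missing | all | <slug>
  SOURCE="auto"      # SOURCE_DEFAULT
  FORCE=false        # FORCE default
  MAX_PAPERS=20      # MAX_PAPERS default

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --source|--max)
        # Both flags take exactly one value.
        if [ "$#" -lt 2 ]; then
          echo "ERROR: $1 needs a value" >&2
          return 1
        fi
        if [ "$1" = "--source" ]; then SOURCE="$2"; else MAX_PAPERS="$2"; fi
        shift 2
        ;;
      --force)
        FORCE=true
        shift
        ;;
      --*)
        echo "ERROR: unknown flag: $1" >&2
        return 1
        ;;
      *)
        # Only the first positional is the target; later ones are ignored.
        if [ -z "$TARGET" ]; then TARGET="$1"; fi
        shift
        ;;
    esac
  done

  # --max must consist of digits only and must not be a plain 0.
  case "$MAX_PAPERS" in
    ''|*[!0-9]*|0)
      echo "ERROR: --max expects a positive integer, got: $MAX_PAPERS" >&2
      return 1
      ;;
  esac

  TARGET="${TARGET:-missing}"   # TARGET_DEFAULT
}
```

A call such as `parse_args $ARGUMENTS || exit 1` (left unquoted on purpose so the argument string splits into words) would then set the variables that the candidate-list step reads.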

Build the candidate paper list:

case "$TARGET" in
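  all)
    # nullglob: an empty papers/ directory yields an empty array, not the literal pattern
    shopt -s nullglob
    PAPERS=( research-wiki/papers/*.md )
    shopt -u nullglob
    ;;
  missing|"")
    # only papers with at least one TODO marker line; read line by line so
    # paths containing spaces are not split into separate array entries
    PAPERS=()
    while IFS= read -r f; do
      PAPERS+=( "$f" )
    done < <(grep -lE '^_TODO(\._?|: fill in after reading\._?)$' research-wiki/papers/*.md 2>/dev/null)
    ;;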
  *)
    P="research-wiki/papers/${TARGET}.md"
    [ -f "$P" ] || { echo "ERROR: paper not found: $P" >&2; exit 1; }
    PAPERS=( "$P" )
    ;;
esac
echo "Candidate papers: ${#PAPERS[@]} (cap ${MAX_PAPERS})"
PAPERS=( "${PAPERS[@]:0:${MAX_PAPERS}}" )

If the candidate list is empty, print "✓ Nothing to enrich." and exit 0. Do not error.

Phase 2: For each paper — read, fetch, fill

Iterate one paper at a time. For each $PAPER in $PAPERS:

Step 2.1 — Read the page and project context. Use the Read tool on the full paper file. Extract from the YAML frontmatter:

node_id (e.g. paper:vllm) — slug = part after paper:
arxiv from external_ids.arxiv — empty string if absent
title
existing ## Abstract (original) blockquote (if present) — fallback content source
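
For readers who want to script the frontmatter lookup rather than rely on the Read tool, here is a minimal sketch. It assumes the frontmatter is the block between the first two `---` lines and that each field sits on its own `key: value` line (with `arxiv` nested under `external_ids`); the helper names frontmatter, fm_value, and paper_slug are hypothetical. It is not a YAML parser and will miss multi-line values.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, assuming simple one-line "key: value" fields.

# Print the frontmatter: everything between the first two '---' lines.
frontmatter() {
  awk 'NR==1 && $0=="---" {inside=1; next}
       inside && $0=="---" {exit}
       inside {print}' "$1"
}

# Print the value of the first "key: value" line, with surrounding quotes removed.
# Leading whitespace is ignored, so nested keys such as external_ids.arxiv match.
fm_value() {
  local file="$1" key="$2"
  frontmatter "$file" \
    | sed -n "s/^[[:space:]]*${key}:[[:space:]]*//p" \
    | head -n 1 \
    | sed -e 's/^"//' -e 's/"$//' -e "s/^'//" -e "s/'\$//"
}

# Print the slug: the part of node_id after the "paper:" prefix.
paper_slug() {
  local node_id
  node_id="$(fm_value "$1" node_id)"
  printf '%s\n' "${node_id#paper:}"
}
```

For example, `arxiv="$(fm_value "$PAPER" arxiv)"` yields an empty string when the field is absent, which matches the "empty string if absent" rule above.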

Additionally, on the FIRST paper of the batch (cache for the rest), read project-context files needed for the Claims and Relevance to This Project sections:

research-wiki/graph/edges.jsonl — scan for claim: edges pointing to the current paper's node_id
RESEARCH_BRIEF.md (project root) — if present, source for project goals
AGENTS.md (project root, Codex CLI primary) or legacy CLAUDE.md — if present, fallback for project context
research-wiki/gap_map.md — if non-empty, source for gap framing

If none of the project-context files exist, the Relevance to This Project section will be filled with the literal "context not yet set" line (see Step 2.4 table).
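
The edges.jsonl scan for the Claims section can be approximated without knowing the edge schema. The sketch below assumes only that node IDs appear as quoted JSON string values (for example "paper:vllm" and "claim:..."); the field names in real edges are not specified here, and claim_edges_for is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: schema-agnostic text match, not a JSON parser.

# Print every edge line that mentions both the given paper node and a claim: node.
# Prints nothing (and still returns 0) when the file is missing or nothing matches.
claim_edges_for() {
  local edges="$1" node_id="$2"
  [ -f "$edges" ] || return 0
  grep -F "\"${node_id}\"" "$edges" | grep -F '"claim:' || true
}
```

If `claim_edges_for research-wiki/graph/edges.jsonl "paper:vllm"` prints nothing, the Claims section gets the _No claims tracked yet._ line; otherwise the printed edges are the ones to list.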

Step 2.2 — Identify which sections are TODO.

Match each section header against its marker:

A header followed by exactly _TODO._ → fill
A header followed by _TODO: fill in after reading._ → fill (One-line thesis)
A header followed by any other content → skip (unless --force)
## Connections → always skip (auto-generated)
## Abstract (original) → always skip (immutable source data)

If no fillable sections remain, log "skip: <slug> (already enriched)" and continue.
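
The matching rules above can be sketched as a small awk filter. It assumes section headers are lines starting with `## ` and that the marker is the first non-blank line after the header; todo_sections is a hypothetical helper name, and the --force case (treat every non-protected section as fillable) is not handled.

```bash
#!/usr/bin/env bash
# Sketch only: lists the sections of one paper page that still carry a TODO marker.

todo_sections() {
  awk '
    # Remember the most recent section header (text after "## ").
    /^## / { header = substr($0, 4); next }

    # The first non-blank line after a header decides whether the section is fillable.
    header != "" && NF {
      if ($0 ~ /^_TODO(\.|: fill in after reading\.)_$/ &&
          header != "Connections" && header != "Abstract (original)")
        print header
      header = ""
    }
  ' "$1"
}
```

An empty result from `todo_sections "$PAPER"` corresponds to the "skip: <slug> (already enriched)" case.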

Step 2.3 — Fetch source content.

The fetch chain runs in order until one returns usable content (>200 chars of text):

| Order | Source | How |
|---|---|---|
| 1 | alphaxiv overview (`auto` default; `--source alphaxiv` to pin) | `WebFetch https://alphaxiv.org/overview/<arxiv_id>.md`: LLM-optimized summary, often best for filling sections |
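
The "first usable source wins" rule can be sketched independently of how each source is fetched. In the skill itself the fetches go through tools such as WebFetch, so the shell functions below are stand-ins: first_usable and the demo_* stubs are hypothetical names, and only the 200-character threshold comes from this document.

```bash
#!/usr/bin/env bash
# Sketch only: models the fallback chain with plain shell functions as fetchers.
MIN_CHARS=200   # "usable" threshold from Step 2.3

# Run each fetcher in order and print the first output longer than MIN_CHARS.
# Returns 1 if every fetcher fails or returns too little text.
first_usable() {
  local fetcher content
  for fetcher in "$@"; do
    content="$("$fetcher" 2>/dev/null)" || continue
    if [ "${#content}" -gt "$MIN_CHARS" ]; then
      printf '%s\n' "$content"
      return 0
    fi
  done
  return 1
}

# Demo stubs standing in for real sources.
demo_empty() { :; }                                   # source returned nothing
demo_fail()  { return 1; }                            # source errored
demo_short() { echo "too short to be usable"; }       # under the threshold
demo_long()  { printf '%250s' '' | tr ' ' 'x'; }      # 250 characters of text
```

With `--source` pinned, the argument list would hold a single fetcher; with `auto`, it would hold the sources in chain order, and a return status of 1 means the whole chain came up empty.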
