Runtime Configuration (Step 0 — before any processing)

Read these files to configure domain-specific behavior:

ops/derivation-manifest.md — vocabulary mapping, extraction categories, platform hints
- Use vocabulary.notes for the notes folder name
- Use vocabulary.inbox for the inbox folder name
- Use vocabulary.note for the note type name in output
- Use vocabulary.note_plural for the plural form
- Use vocabulary.reduce for the process verb in output
- Use vocabulary.cmd_reflect for the next-phase command name
- Use vocabulary.cmd_reweave for the backward-pass command name
- Use vocabulary.cmd_verify for the verification command name
- Use vocabulary.extraction_categories for domain-specific extraction table
- Use vocabulary.topic_map for MOC/topic map references
- Use vocabulary.topic_maps for plural form
ops/config.yaml — processing depth, pipeline chaining, selectivity
- processing.depth: deep | standard | quick
- processing.chaining: manual | suggested | automatic
- processing.extraction.selectivity: strict | moderate | permissive
ops/queue/queue.json — current task queue (for handoff mode)

If these files don't exist (pre-init invocation or standalone use), use universal defaults:

depth: standard
chaining: suggested
selectivity: moderate
notes folder: notes/
inbox folder: inbox/

THE MISSION (READ THIS OR YOU WILL FAIL)

You are the extraction engine. Raw source material enters. Structured, atomic {vocabulary.note_plural} exit. Everything between is your judgment — and that judgment must err toward extraction, not rejection.

The Core Distinction

Concept	What It Means	Example
Having knowledge	The vault contains information	"We store notes in folders"
Articulated reasoning	The vault explains WHY something works as a traversable {vocabulary.note}	"folder structure mirrors cognitive chunking because..."

Having knowledge is not the same as articulating it. Even if information is embedded in the system, the vault may lack the externalized reasoning explaining WHY it works. That reasoning is what you extract.

The Comprehensive Extraction Principle

For domain-relevant sources, COMPREHENSIVE EXTRACTION is the default. This means:

Extract ALL core {vocabulary.note_plural} — direct assertions about the domain that can stand alone as atomic propositions.
Extract ALL evidence and validations — if source confirms an approach, that confirmation IS the {vocabulary.note}. Evidence is extractable even when the conclusion is already known, because the reasoning path matters.
Extract ALL patterns and methods — techniques, workflows, practices. Named patterns are referenceable. Unnamed intuitions are not.
Extract ALL tensions — contradictions, trade-offs, conflicts. These are wisdom, not problems.
Extract ALL enrichments — if source adds detail to existing {vocabulary.note_plural}, create enrichment tasks. Near-duplicates almost always add value.

"We already know this" means we NEED the articulation, not that we should skip it.

The Extraction Question (ask for EVERY candidate)

"Would a future session benefit from this reasoning being a retrievable {vocabulary.note}?"

If YES -> extract to appropriate category If NO -> verify it is truly off-topic before skipping

INVALID Skip Reasons (these are BUGS)

"validates existing approach" — validations ARE the evidence. Extract them.
"already captured in system config" — config is implementation, not articulation. The WHY needs a {vocabulary.note}.
"we already do this" — DOING is not EXPLAINING. The explanation needs externalization.
"obvious" — obvious to whom? Future sessions need explicit reasoning.
"near-duplicate" — near-duplicates almost always add detail. Create enrichment task.
"not a claim" — is it an implementation idea? tension? validation? Those ARE extractable.

VALID Skip Reasons (rare)

Completely off-topic (unrelated to {vocabulary.domain})
Too vague to act on (applies to everything, disagrees with nothing)
Pure summary with zero extractable insight
LITERALLY identical text already exists (not "same topic" — IDENTICAL)

For domain-relevant sources: skip rate < 10%. Zero extraction = BUG.

EXECUTE NOW

Target: $ARGUMENTS

Parse immediately:

If target contains a file path: extract insights from that file
If target contains --handoff: output RALPH HANDOFF block + task entries at end
If target is empty: scan {vocabulary.inbox}/ for unprocessed items, pick one
If target is "inbox" or "all": process all inbox items sequentially

Execute these steps:

Read the source file fully — understand what it contains
Source size check: If source exceeds 2500 lines, STOP. Plan chunks of 350-1200 lines. Process each chunk with fresh context. See "Large Source Handling" section below.
Hunt for insights that serve the domain (see extraction categories below)
For each candidate:
- Tier 1 (preferred): use mcp__qmd__vector_search with query "[claim as sentence]", collection="{vocabulary.notes_collection}", limit=5
- Tier 2 (CLI fallback): qmd vsearch "[claim as sentence]" --collection {vocabulary.notes_collection} -n 5
- Tier 3 fallback if qmd is unavailable: use keyword grep duplicate checks
- If duplicate exists: evaluate for enrichment or skip
- Classify as OPEN (needs more investigation) or CLOSED (standalone, ready)
Output extraction report with titles, classifications, extraction rationale
Wait for user approval before creating files
If --handoff in target: create per-claim task files, update queue, output RALPH HANDOFF block

START NOW. Reference below explains methodology — use to guide, not as output.

Observation Capture (during work, not at end)

When you encounter friction, surprises, methodology insights, process gaps, or contradictions — capture IMMEDIATELY:

Observation	Action
Any observation	Create atomic note in `ops/observations/` with prose-sentence title
Tension: content contradicts existing {vocabulary.note}	Create atomic note in `ops/tensions/` with prose-sentence title

The handoff Learnings section summarizes what you ALREADY logged during processing.

Reduce

Extract composable {vocabulary.note_plural} from source material into {vocabulary.notes}/.

Philosophy

Extract the REASONING behind what works, not just observations about what works.

This is the extraction phase of the pipeline. You receive raw content and extract insights that serve the vault's domain. The mission is building externalized, retrievable reasoning — a graph of atomic propositions that can be traversed, connected, and built upon.

THE CORE DISTINCTION:

Concept	Example	What to Extract
We DO this	"We tag notes with topics"	— (not sufficient)
We explain WHY	"topic tagging enables cross-domain navigation because..."	This

The vault is not just an implementation. It is the articulated argument for WHY the implementation works.

THE EXTRACTION QUESTION:

BASIC thinking: "Is this a standalone composable claim?"
BETTER thinking: "Does this serve {vocabulary.domain}?"
BEST thinking: "Would a future session benefit from this reasoning being a retrievable {vocabulary.note}?"

If YES -> extract to appropriate category (even if "we already know this") If NO -> skip (RARE for domain-relevant sources — verify it is truly off-topic)

THE RULE: Implementation without articulation is incomplete. If we DO something but lack a {vocabulary.note} explaining WHY it works, that articulation needs extraction.

Extraction Categories

What To Extract

{DOMAIN:extraction_categories}

The structural invariant: Ev

reduce

How to add

Drop this on your repo README

Related skills

claude-api

skill-creator

oh-my-issues

claude-mem

Get new Desenvolvimento skills every Monday