/bedrock:learn — External Source Ingestion into the Second Brain
Plugin Paths
Entity definitions and templates are in the plugin directory, not at the vault root. Use the "Base directory for this skill" provided at invocation to resolve paths:
- Entity definitions: <base_dir>/../../entities/
- Templates: <base_dir>/../../templates/{type}/_template.md
- Plugin CLAUDE.md: <base_dir>/../../CLAUDE.md (already injected automatically into context)
Where <base_dir> is the path provided in "Base directory for this skill".
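As an illustration of the path resolution above, here is a minimal sketch. The helper names `plugin_root` and `template_path` are hypothetical, and so are the example base directory and the `topic` type; only the `../..` layout comes from this document.

```shell
# Hypothetical helpers: derive plugin paths from the skill's base directory.
plugin_root() {
  local base_dir="${1%/}"          # tolerate a trailing slash
  printf '%s\n' "$base_dir/../.."
}

template_path() {
  local base_dir="$1" type="$2"
  printf '%s/templates/%s/_template.md\n' "$(plugin_root "$base_dir")" "$type"
}

# Example call with made-up values.
template_path "/opt/plugin/skills/learn" "topic"
```

The result still contains `../..`; that is fine for reading files, and it avoids assuming a `realpath` binary is available.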
Vault Resolution
Resolve which vault this run targets. This skill can be invoked from any directory.
Step 1 — Parse --vault flag:
Check if the input arguments include --vault <name>. If found, extract the vault name and remove it from the arguments (the remaining text is the source URL/path).
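One way to do this split, sketched in plain shell. The function name `parse_vault_flag` and the variables `VAULT_FLAG` and `SOURCE_ARG` are hypothetical names for this example, and it assumes the arguments arrive as separate words.

```shell
# Sketch: split "--vault <name>" out of the skill arguments.
# Sets VAULT_FLAG (empty if the flag is absent) and SOURCE_ARG (remaining text).
parse_vault_flag() {
  VAULT_FLAG=""
  SOURCE_ARG=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --vault)
        # Take the next word as the vault name, if there is one.
        if [ "$#" -ge 2 ]; then
          VAULT_FLAG="$2"
          shift
        fi
        ;;
      *)
        # Everything else belongs to the source URL/path.
        SOURCE_ARG="${SOURCE_ARG:+$SOURCE_ARG }$1"
        ;;
    esac
    shift
  done
}

parse_vault_flag --vault work https://example.com/spec.pdf
echo "vault=$VAULT_FLAG source=$SOURCE_ARG"
```

The flag may appear before or after the source; in both cases the source text is left without it.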
Step 2 — Resolve vault path:
- If --vault <name> was provided: Read the vault registry at <base_dir>/../../vaults.json. Find the entry matching the name. If not found: error: "Vault <name> is not registered. Run /bedrock:vaults to see available vaults." If found: set VAULT_PATH to the entry's path value. Store the resolved vault name as VAULT_NAME.
- If no --vault flag (CWD detection): Read <base_dir>/../../vaults.json. Check if the current working directory is inside any registered vault path (CWD starts with a registered vault's absolute path). If multiple match, use the longest path (most specific). If found: set VAULT_PATH to the matching vault's path. Store its name as VAULT_NAME.
- If CWD detection fails (default vault): From the registry, find the vault with "default": true. If found: set VAULT_PATH to the default vault's path. Store its name as VAULT_NAME.
- If no resolution: Error: "No vault resolved. Available vaults:" followed by the registry listing. "Use --vault <name> to specify, or run /bedrock:setup to register a vault."
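The CWD detection rule (longest matching registered path wins) can be sketched as follows. This is an illustration only: `match_vault_by_cwd` is a hypothetical name, the vault paths are passed as plain arguments instead of being read from vaults.json, and the match is made at a path-component boundary, which is an assumption slightly stricter than a raw "starts with" check.

```shell
# Sketch: pick the most specific registered vault that contains a directory.
# Usage: match_vault_by_cwd <cwd> <vault_path>...
# Prints the longest matching vault path, or nothing if none match.
match_vault_by_cwd() {
  local cwd="${1%/}" best="" vault
  shift
  for vault in "$@"; do
    vault="${vault%/}"
    case "$cwd/" in
      "$vault"/*)
        # Keep the longest match so nested vaults win over their parents.
        if [ "${#vault}" -gt "${#best}" ]; then
          best="$vault"
        fi
        ;;
    esac
  done
  if [ -n "$best" ]; then
    printf '%s\n' "$best"
  fi
  return 0
}

# Made-up paths: the nested vault is chosen over its parent.
match_vault_by_cwd "/home/u/vaults/work/projects" "/home/u/vaults" "/home/u/vaults/work"
```

Matching on `"$vault"/*` keeps a directory such as /v/workshop from being treated as inside a vault at /v/work.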
Step 3 — Validate vault path:
test -d "<VAULT_PATH>" && echo "exists" || echo "missing"
If missing: error — "Vault path <VAULT_PATH> does not exist on disk. Run /bedrock:setup to re-register."
Step 4 — Read vault config:
cat "<VAULT_PATH>/.bedrock/config.json" 2>/dev/null
Extract language and other relevant fields for use in later phases.
From this point forward, ALL vault file operations use <VAULT_PATH> as the root.
- Graphify output: <VAULT_PATH>/graphify-out/
- When delegating to /bedrock:preserve, pass --vault <VAULT_NAME>
Overview
This skill receives an external source (URL or local path), fetches its content to a temporary
directory, converts non-markdown files to markdown via docling, runs the /graphify extraction
pipeline on the tmp content, and delegates entity persistence (plus graphify-output merge) to
/bedrock:preserve.
You are a fetcher and orchestrator agent. Your job is to:
- Ensure docling is installed (auto-install if missing)
- Classify the input and fetch content to /tmp
- Convert fetched files to markdown via docling (when applicable)
- Invoke /graphify to extract a knowledge graph into a per-run temp directory
- Delegate graph merge and entity writes to /bedrock:preserve
- Clean up temporary files
You do NOT classify entities, create vault files, write to the vault directly, or merge graph state.
All extraction is done by /graphify. All writes (including the graphify-output merge into the
vault's cumulative graphify-out/) are done by /bedrock:preserve.
Follow the phases below in order, without skipping steps.
Phase 0 — Ensure docling is installed
Before any fetch or conversion, verify that the docling CLI is available. If missing, install
it silently using the same fallback chain /bedrock:setup uses for graphify, emitting a single
status line before proceeding.
if command -v docling >/dev/null 2>&1; then
  echo "Phase 0: docling already installed — proceeding."
else
  echo "Phase 0: docling not found — installing silently (one-time setup, may take a few minutes for model download)."
  # Step 1: pipx (preferred, isolated)
  if command -v pipx >/dev/null 2>&1; then
    pipx install docling >/dev/null 2>&1 || true
  fi
  # Step 2: pip (fallback if pipx unavailable or failed)
  if ! command -v docling >/dev/null 2>&1; then
    if command -v pip3 >/dev/null 2>&1; then
      pip3 install --user docling >/dev/null 2>&1 || true
    elif command -v pip >/dev/null 2>&1; then
      pip install --user docling >/dev/null 2>&1 || true
    fi
  fi
  # Final re-probe
  if ! command -v docling >/dev/null 2>&1; then
    echo "ERROR: docling install failed. Run /bedrock:setup to install it, or install manually: pipx install docling"
    exit 1
  fi
  echo "Phase 0: docling installed."
fi
Failure mode: If install fails (no pipx/pip, network outage, permission denied), abort
the skill with the error above. Do NOT fetch or mutate anything. Direct the user to /bedrock:setup.
No user prompt: this step is silent — one status line on success, one error line on failure.
Phase 1 — Fetch
1.1 Classify the input
The user provides an argument. Classify it in the following priority order. Local files have no extension allowlist: any existing file is accepted, and Phase 1.5 decides whether to run docling on it.
| Input | Detected type | Fetch method |
|---|---|---|
| URL containing confluence or atlassian.net | confluence | Read skills/confluence-to-markdown/SKILL.md, follow instructions, save output to tmp |
| URL containing docs.google.com | gdoc | Read skills/gdoc-to-markdown/SKILL.md, follow instructions, save output to tmp |
| URL containing github.com | github-repo | git clone --depth 1 to tmp + GitHub MCP enrichment (docling never runs on GitHub repos) |
| URL starting with http:// or https:// (any other) | remote-binary | Download raw bytes to tmp via curl/WebFetch; Phase 1.5 decides conversion |
| Local file path (any existing file) | local-file | Copy to tmp; Phase 1.5 decides conversion |
| Local directory path | local-dir | Copy directory to tmp |
| No match above | manual | Ask the user: "Could not identify the source type. Paste the content or provide a valid URL/path." |
If no argument was provided: ask the user "What source do you want to ingest? Provide a URL (Confluence, Google Docs, GitHub, or any HTTP(S) URL) or a local file path (any file type — docling will convert it to markdown if supported)."
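The priority order in the table can be sketched as a nested `case`. This is a rough illustration: `classify_source` is a hypothetical name, and treating "URL" as "starts with http:// or https://" is an assumption made so that a local path that merely contains the word confluence is not misrouted.

```shell
# Sketch of the priority-ordered classification in the table above.
# Prints one of: confluence, gdoc, github-repo, remote-binary,
# local-file, local-dir, manual.
classify_source() {
  local input="$1"
  case "$input" in
    http://*|https://*)
      # URL rules, checked in table order; the generic rule comes last.
      case "$input" in
        *confluence*|*atlassian.net*) echo "confluence" ;;
        *docs.google.com*)            echo "gdoc" ;;
        *github.com*)                 echo "github-repo" ;;
        *)                            echo "remote-binary" ;;
      esac
      ;;
    *)
      # Local paths: any existing file is accepted, with no extension check.
      if [ -f "$input" ]; then
        echo "local-file"
      elif [ -d "$input" ]; then
        echo "local-dir"
      else
        echo "manual"
      fi
      ;;
  esac
}

classify_source "https://github.com/acme-corp/billing-api"
```

A `manual` result means the skill should ask the user for the content or a valid URL/path instead of guessing.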
1.2 Create temporary directory
All content is fetched to a temporary directory. This is the single input path for /graphify.
LEARN_TMP="/tmp/bedrock-learn-$(date +%s)"
mkdir -p "$LEARN_TMP"
echo "Temporary directory: $LEARN_TMP"
Store the path for use in subsequent phases.
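If two runs could start within the same second, a timestamp-based name can collide. A possible variant, shown here only as a sketch, lets `mktemp -d` pick a unique suffix and pairs it with a guarded cleanup for the final phase. The function names `make_learn_tmp` and `cleanup_learn_tmp` are hypothetical, and so is the use of `TMPDIR` as an override.

```shell
# Sketch: create a unique per-run directory and remove it safely later.
make_learn_tmp() {
  mktemp -d "${TMPDIR:-/tmp}/bedrock-learn-XXXXXX"
}

cleanup_learn_tmp() {
  # Only delete paths that look like one of our run directories.
  case "$1" in
    */bedrock-learn-*) rm -rf -- "$1" ;;
    *) echo "refusing to remove unexpected path: $1" >&2; return 1 ;;
  esac
}

LEARN_TMP="$(make_learn_tmp)"
echo "Temporary directory: $LEARN_TMP"

# ... later, once /bedrock:preserve has finished:
cleanup_learn_tmp "$LEARN_TMP"
```

The sketch avoids an EXIT trap on purpose: when each command runs in its own shell, a trap would delete the directory before the later phases could read it.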
1.3 Fetch content
Execute the fetch strategy for the detected type. All content lands in $LEARN_TMP/.
1.3.1 GitHub repository
For GitHub URLs (e.g.: https://github.com/acme-corp/billing-api):
- Extract owner/repo and repo-name from the URL
- Clone the repository (shallow): git clone --depth 1 <url> "$LEARN_TMP/<repo-name>"
- GitHub MCP enrichment: call directly in main context (NOT via subagent, because MCP permissions are not inherited):
  - mcp__plugin_github_github__get_file_contents → read the repo's README.md
  - mcp__plugin_github_github__list_commits → last 10 commits
  - mcp__plugin_github_github__list_pull_requests → last 5 PRs (state=all, sort=updated)
- Compile MCP results into a single markdown file and save as $LEARN_TMP/<repo-name>/_github_metadata.md
Best-effort: If any MCP call fails, continue with what was obtained. Do NOT block ingestion.
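The first step, extracting owner/repo and repo-name, can be done with shell parameter expansion alone. The sketch below uses the hypothetical helpers `github_slug` and `github_repo_name` and assumes the URL has at least an owner and a repository segment after github.com/.

```shell
# Sketch: derive owner/repo and repo-name from a GitHub URL.
github_slug() {
  local url="$1" rest owner repo
  rest="${url#*github.com/}"   # drop scheme and host
  rest="${rest%%[?#]*}"        # drop query string or fragment
  rest="${rest%/}"             # drop a trailing slash
  owner="${rest%%/*}"          # first path segment
  rest="${rest#*/}"
  repo="${rest%%/*}"           # second path segment
  repo="${repo%.git}"          # tolerate clone-style URLs
  printf '%s/%s\n' "$owner" "$repo"
}

github_repo_name() {
  local slug
  slug="$(github_slug "$1")"
  printf '%s\n' "${slug#*/}"
}

github_slug "https://github.com/acme-corp/billing-api"
github_repo_name "https://github.com/acme-corp/billing-api"
```

Deeper URLs such as a /tree/main link reduce to the same owner/repo pair, so the clone target directory stays stable.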
1.3.2 Confluence
For Confluence URLs:
- Read the internal skill at <base_dir>/../confluence-to-markdown/SKILL.md
- Follow its instructions to parse the URL, choose layer (M