Grimoire · 魔典 — Document → Notes + Skill Pack

One source, parsed once, woven into typed reading notes (→ Obsidian) and a per-source skill pack. Primary entry point: scripts/grimoire.sh. Full product overview and bilingual docs: see README.md.

The sections below are the underlying MinerU parsing API reference that the grimoire toolchain is built on.

Flow: Markdown first, the grimoire is opt-in

The pipeline is gated, not automatic:

Always first → Markdown. A source becomes Markdown via pdf2md / mineru-local (or any MinerU parse). For many requests this is the whole job — stop here. Do not auto-run notes/skill mining.
Opt-in → the grimoire. Only when the user actually wants reading notes and/or a packaged skill, continue from that Markdown with grimoire.sh --from-markdown <md>. Nothing is re-uploaded; the parse is skipped. --only notes|skills|both is the second opt-in: notes only, the skill (engineering-prompt) packaging only, or both.
--only both is two-stage, not parallel. Stage 1 writes the type-specific notes from the source. Stage 2 is a deliberate re-learning pass: the agent re-reads the notes it just wrote and mines the skill pack from the knowledge points in those notes, not from the raw source (source-markdown/ is then used only for evidence anchors). --only skills (without notes) still mines directly from the source.

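In short: the default stops at Markdown. Notes and packaging the content as a skill (engineering prompt) are downstream steps the user chooses explicitly; continue from existing Markdown with --from-markdown, with no re-upload and no repeated parse. --only both is two-stage: write the notes first, then re-learn from them to mine the skill pack, rather than running both in the same pass.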

This keeps the live pdf2md / mineru-local path unchanged — Grimoire does not steal its triggers; it is the deliberate next step after Markdown.
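The gate described above can be sketched as a small wrapper. This is only an illustration: grimoire_next_step and the GRIMOIRE variable are hypothetical names, and passing --from-markdown together with --only in one call is an assumption drawn from the flag descriptions above, not a documented invocation.

```shell
#!/bin/sh
# Hypothetical helper: decide whether to continue past Markdown.
# GRIMOIRE (assumed override) defaults to the entry point named in this document.
grimoire_next_step() {
  local md="$1" want="${2:-none}"
  case "$want" in
    none)
      # Markdown was the whole job: stop here, run nothing.
      return 0 ;;
    notes|skills|both)
      # Opt-in: continue from the existing Markdown, no re-upload, no re-parse.
      "${GRIMOIRE:-scripts/grimoire.sh}" --from-markdown "$md" --only "$want" ;;
    *)
      echo "unknown --only value: $want" >&2
      return 2 ;;
  esac
}
```

Calling grimoire_next_step out/book.md does nothing, while grimoire_next_step out/book.md both hands the same Markdown to the two-stage run.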

Overview

MinerU converts PDF, DOC, DOCX, PPT, PPTX, PNG, JPG, JPEG, and HTML files into machine-readable Markdown/JSON. It supports OCR (109 languages), formula and table recognition, cross-page table merging, and batch processing.

This skill can also stage a parsed long-form source as a source skill pack: a workspace that a large language model can read to extract candidate agent skills, grouped by source first and then by chapter, lesson, section, or note. The workflow does not call an LLM and does not install generated skills automatically.

It can additionally run a source-to-notes pipeline: parse a document, auto-classify it as book / paper / document (hybrid heuristic + AI confirmation), scaffold the matching note discipline, and stage the final note for the Obsidian Knowledge-Hub vault. The agent does the reading and writing; the scripts never call an LLM and never write into the vault directly.

Two modes:

Cloud API — https://mineru.net/api/v4 (no GPU required, token-based)
Local API — mineru-api --port 8000 (self-hosted, requires GPU or CPU backend)

Authentication (Cloud API)

Token file: ~/.config/mineru/token
Header: Authorization: Bearer <token>
Get token: https://mineru.net/apiManage/token

mkdir -p ~/.config/mineru
echo "YOUR_TOKEN" > ~/.config/mineru/token
chmod 600 ~/.config/mineru/token
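As a minimal sketch of how the token file and header fit together, the function below reads the token and prints the Authorization header. mineru_auth_header and the MINERU_TOKEN_FILE override are hypothetical names introduced for illustration; only the file path and header format come from this document.

```shell
#!/bin/sh
# Hypothetical helper: build the Authorization header from the token file.
# MINERU_TOKEN_FILE is an assumed override, useful for testing.
mineru_auth_header() {
  local file="${MINERU_TOKEN_FILE:-$HOME/.config/mineru/token}" token
  [ -r "$file" ] || { echo "token file not readable: $file" >&2; return 1; }
  # Strip the trailing newline (and any stray whitespace) written by echo.
  token="$(tr -d '[:space:]' < "$file")"
  [ -n "$token" ] || { echo "token file is empty: $file" >&2; return 1; }
  printf 'Authorization: Bearer %s\n' "$token"
}
```

The result can be passed straight to curl, for example curl -H "$(mineru_auth_header)" followed by an endpoint URL.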

Limits (Cloud API)

Item	Limit
Single file size	200MB max
Single file pages	600 pages max
Daily priority pages	2000 pages/account
Batch upload	200 files/request
Token validity	90 days
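A pre-flight check against these limits might look like the sketch below. Both function names are hypothetical, and reading "200MB" as 200 × 1024 × 1024 bytes is an assumption; the table does not say whether the limit is binary or decimal. The page limit is not checked because it needs a PDF tool.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers for the cloud limits in the table above.

# Succeeds when the file is within the size limit (default: 200 MiB, assumed).
mineru_size_ok() {
  local file="$1" limit="${2:-209715200}" size
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 2; }
  size="$(wc -c < "$file" | tr -d '[:space:]')"
  [ "$size" -le "$limit" ]
}

# Number of batch requests needed for N files at 200 files per request.
mineru_batch_count() {
  echo $(( ($1 + 199) / 200 ))
}
```

For example, 450 local files would need mineru_batch_count 450 requests, which is 3.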

Model Versions

Model	Use Case	Speed	Notes
`vlm`	Default. MinerU2.5, complex layouts, highest accuracy	Slower	Cloud-recommended; needs GPU locally
`pipeline`	General documents, CPU-friendly	Fast	Pure CPU support, lower accuracy
`MinerU-HTML`	HTML output, preserves formatting	Medium	For web content
`hybrid`	~~Default pre-2026-04~~	—	RETIRED on the cloud API — returns `code -10002 "version field invalid"`. Do not use against `mineru.net`.

API Endpoints (Cloud)

Base URL: https://mineru.net/api/v4

1. Create Extraction Task (Single File)

POST /extract/task

Param	Type	Required	Default	Description
url	string	yes	-	File URL (no direct upload)
model_version	string	no	vlm	`vlm` / `pipeline` / `MinerU-HTML` (`hybrid` retired, returns `-10002`)
is_ocr	bool	no	false	Enable OCR
enable_formula	bool	no	true	Formula recognition
enable_table	bool	no	true	Table recognition
language	string	no	ch	Document language
data_id	string	no	-	Custom identifier
page_ranges	string	no	-	e.g. "2,4-6"
callback	string	no	-	Callback URL for async results
extra_formats	array	no	-	`["docx"]`, `["html"]`, `["latex"]`

Response:

{"code": 0, "data": {"task_id": "xxx"}, "msg": "ok"}
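The request can be sketched as a small payload builder plus a curl call. mineru_task_payload is a hypothetical helper; it only sets url and model_version from the parameter table, rejects the retired hybrid value before any request is sent, and assumes the URL contains no double quotes or backslashes (it does no JSON escaping).

```shell
#!/bin/sh
# Hypothetical helper: build the JSON body for POST /extract/task.
# Assumes the URL needs no JSON escaping.
mineru_task_payload() {
  local url="$1" model="${2:-vlm}"
  case "$model" in
    vlm|pipeline|MinerU-HTML) ;;
    *)
      # hybrid is retired on the cloud API and would come back as -10002.
      echo "unsupported model_version: $model" >&2
      return 2 ;;
  esac
  printf '{"url": "%s", "model_version": "%s"}\n' "$url" "$model"
}

# Usage (not executed here; needs a valid token and network access):
#   curl -sS -X POST https://mineru.net/api/v4/extract/task \
#     -H "Authorization: Bearer $(cat ~/.config/mineru/token)" \
#     -H 'Content-Type: application/json' \
#     -d "$(mineru_task_payload https://example.com/doc.pdf)"
```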

2. Get Task Results

GET /extract/task/{task_id}

States: pending → running → done or failed. converting means a format conversion is still in progress; only done and failed are terminal.

Done response: includes full_zip_url (download link)
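Because extraction is asynchronous, a caller usually polls this endpoint until the task reaches a terminal state. The loop below is a sketch under stated assumptions: mineru_wait is a hypothetical name, the status is assumed to arrive in a "state" field of the response JSON, and the fetch step is passed in as a command so the loop itself makes no network calls.

```shell
#!/bin/sh
# Hypothetical polling loop. $1 is any command that prints the task JSON,
# e.g. a function wrapping: curl -sS -H "Authorization: Bearer ..." \
#   https://mineru.net/api/v4/extract/task/<task_id>
# Assumption: the response carries the status as "state": "<value>".
mineru_wait() {
  local fetch_cmd="$1" interval="${2:-5}" max_tries="${3:-120}" body state i=0
  while [ "$i" -lt "$max_tries" ]; do
    body="$($fetch_cmd)" || return 1
    state="$(printf '%s\n' "$body" |
      sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
    case "$state" in
      "done")   printf '%s\n' "$body"; return 0 ;;      # body includes full_zip_url
      "failed") printf '%s\n' "$body" >&2; return 1 ;;
      *)        sleep "$interval" ;;                    # still in progress
    esac
    i=$((i + 1))
  done
  return 3   # gave up after max_tries polls
}
```

Passing the fetch step as a command keeps authentication and URL handling out of the loop and makes it easy to exercise with a stub.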

3. Batch Upload Local Files

POST /file-urls/batch

Returns presigned upload URLs (valid for 24 hours). Once a file has been uploaded to its URL, the system submits the extraction task automatically; no separate submit call is needed.
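The upload flow can be sketched in three steps: request upload URLs, PUT each file, then poll the batch results. The request body shape (a "files" array of objects with a "name" field) is an assumption, not something this document specifies, and mineru_batch_payload is a hypothetical helper that assumes file names need no JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: build the body for POST /file-urls/batch.
# Assumption: the API expects {"files": [{"name": "<file name>"}, ...]}.
mineru_batch_payload() {
  local sep="" f
  printf '{"files": ['
  for f in "$@"; do
    printf '%s{"name": "%s"}' "$sep" "$(basename "$f")"
    sep=", "
  done
  printf ']}\n'
}

# Usage (not executed here; needs a valid token and network access):
#   1. curl -sS -X POST https://mineru.net/api/v4/file-urls/batch \
#        -H "Authorization: Bearer $(cat ~/.config/mineru/token)" \
#        -H 'Content-Type: application/json' \
#        -d "$(mineru_batch_payload a.pdf b.pdf)"
#   2. curl -sS -T a.pdf "<presigned URL for a.pdf>"    # -T uploads with PUT
#   3. Poll GET /extract-results/batch/{batch_id} for the results.
```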

4. Batch URL Extraction

POST /extract/task/batch

Submits multiple URLs in one request and returns a batch_id.

5. Batch Results

GET /extract-results/batch/{batch_id}

Local API (Self-Hosted)

Start Server

# FastAPI server
mineru-api --host 0.0.0.0 --port 8000

# Gradio WebUI
mineru-gradio --server-name 0.0.0.0 --server-port 7860

# OpenAI-compatible server (for remote VLM inference)
mineru-openai-server --port 30000

Environment Variables

Variable	Description	Default
`MINERU_MODEL_SOURCE`	Model source: `modelscope` / `huggingface`	huggingface
`MINERU_API_MAX_CONCURRENT_REQUESTS`	Max concurrent API requests	Unlimited
`MINERU_API_ENABLE_FASTAPI_DOCS`	Enable /docs page	true

Local API Docs

Access at http://127.0.0.1:8000/docs after starting.

Use CLI with Remote Server

mineru -p input.pdf -o output/ -b hybrid-http-client -u http://server:30000

Error Codes

Code	Issue	Fix
A0202	Token invalid	Check Bearer prefix and token
A0211	Token expired	Create a new token at https://mineru.net/apiManage/token (tokens are valid for 90 days)
-60002	Unrecognized format	Check file extension
-60005	File too large	Max 200MB
-60006	Too many pages	Max 600, split document
-60008	URL timeout	Check URL accessibility
-60012	Task not found	Verify task_id
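A wrapper script could turn these codes into actionable hints. The lookup below is a sketch: mineru_error_hint is a hypothetical name, and the messages restate the table above plus the -10002 code mentioned under Model Versions.

```shell
#!/bin/sh
# Hypothetical lookup: map a MinerU error code to the fix from the table above.
mineru_error_hint() {
  case "$1" in
    A0202)  echo "Token invalid: check the Bearer prefix and the token" ;;
    A0211)  echo "Token expired: create a new one at mineru.net" ;;
    -10002) echo "version field invalid: hybrid is retired on the cloud API" ;;
    -60002) echo "Unrecognized format: check the file extension" ;;
    -60005) echo "File too large: max 200MB" ;;
    -60006) echo "Too many pages: max 600, split the document" ;;
    -60008) echo "URL timeout: check that the URL is reachable" ;;
    -60012) echo "Task not found: verify the task_id" ;;
    *)      echo "Unknown code: $1"; return 1 ;;
  esac
}
```

Returning non-zero for unknown codes lets a caller fall back to printing the raw msg field from the response.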

Helper Script

~/.claude/skills/mineru/scripts/mineru-parse.sh — full-featured CLI wrapper.

# URL mode
mineru-parse.sh https://example.com/doc.pdf

# Local file with options
mineru-parse.sh /path/to/file.pdf --model vlm --ocr --output /tmp/result

# Extra formats
mineru-parse.sh doc.pdf --format docx --format latex

# Page ranges
mineru-parse.sh doc.pdf --pages "1-5,8" --output ./results

# Auto-extract markdown from zip
mineru-parse.sh doc.pdf --output ./results --extract

# Extract without printing book-sized markdown and write a local manifest
mineru-parse.sh book.pdf --output ./results --extract --no-print-md --manifest ./results/parse_manifest.json

Source-to-Skill Workflow

Use this when the user uploads a book, course, paper, manual, article collection, or other long-form text and wants the agent to learn which reusable skills can be extracted from it.

Commands

# Local files are uploaded to the MinerU cloud API; --cloud-ok is required.
~/.claude/skills/mineru/scripts/mineru-source-to-skill.sh /path/to/book.pdf \
  --title "Book Title" \
  --type auto \
  --output ./source-workspaces \
  --cloud-ok

# If the source is already parsed to Markdown, stage a pack directly.
~/.claude/skills/mineru/scripts/source-skill-pack.sh ./mineru-extracted/book \
  --title "Book Title" \
  --type auto \
  --output ./source-skill-packs

Comp

grimoire