Grimoire · 魔典 — Document → Notes + Skill Pack
One source, parsed once, woven into typed reading notes (→ Obsidian) and a per-source skill pack. Primary entry point:
scripts/grimoire.sh. Full product overview and bilingual docs: see README.md. The sections below are the underlying MinerU parsing API reference that the
grimoire toolchain is built on.
Flow: Markdown first, the grimoire is opt-in
The pipeline is gated, not automatic:
- Always first → Markdown. A source becomes Markdown via pdf2md / mineru-local (or any MinerU parse). For many requests this is the whole job: stop here. Do not auto-run notes/skill mining.
- Opt-in → the grimoire. Only when the user actually wants reading notes and/or a packaged skill, continue from that Markdown with grimoire.sh --from-markdown <md>. Nothing is re-uploaded; the parse is skipped.
- --only notes|skills|both is the second opt-in: notes only, the skill (engineering-prompt) packaging only, or both.
- --only both is two-stage, not parallel. Stage 1 writes the type-specific notes from the source. Stage 2 is a deliberate re-learning pass (重复学习): the agent re-reads the notes it just wrote and mines the skill pack from those notes' knowledge points, not the raw source (source-markdown/ is then only for evidence anchors). --only skills (no notes) keeps mining from the source.
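In short: by default the pipeline stops at Markdown. Notes and packaging the content as a skill (engineering prompt) are downstream steps the user chooses explicitly. Use --from-markdown to continue from existing Markdown, with no re-upload and no repeated parse. --only both is two-stage (write the notes first, then "re-learn" by mining the skill pack from those notes), not a single parallel pass.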
This keeps the live pdf2md / mineru-local path unchanged — Grimoire does
not steal its triggers; it is the deliberate next step after Markdown.
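The gating described above can be sketched as a small decision helper. This is only an illustration: the function names grimoire_command and stages are hypothetical, and whether grimoire.sh applies a default when --only is omitted is not stated here, so the sketch always passes it explicitly.

```python
from typing import List, Optional

# The three values the flow section lists for --only.
VALID_ONLY = ("notes", "skills", "both")


def grimoire_command(markdown_path: str, only: Optional[str] = None) -> Optional[List[str]]:
    """Return the grimoire.sh argv for an opt-in run, or None to stop at Markdown."""
    if only is None:
        # Default path: Markdown is the whole job, nothing else runs.
        return None
    if only not in VALID_ONLY:
        raise ValueError(f"--only must be one of {VALID_ONLY}, got {only!r}")
    return ["scripts/grimoire.sh", "--from-markdown", markdown_path, "--only", only]


def stages(only: str) -> List[str]:
    """Describe the order of work for each --only value."""
    if only == "notes":
        return ["notes from source"]
    if only == "skills":
        return ["skills from source"]
    if only == "both":
        # Sequential, not parallel: stage 2 mines the notes written in stage 1.
        return ["notes from source", "skills from notes"]
    raise ValueError(f"unknown --only value: {only!r}")
```

A caller would pass the returned list to a process launcher only when it is not None, which keeps the Markdown-only path free of side effects.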
Overview
MinerU converts PDF, DOC, DOCX, PPT, PPTX, PNG, JPG, JPEG, HTML into machine-readable Markdown/JSON. Supports OCR (109 languages), formula/table recognition, cross-page table merging, and batch processing.
This skill can also stage a parsed long-form source as a source skill pack: a workspace that a large language model can read to extract candidate agent skills, grouped by source first and then by chapter, lesson, section, or note. The workflow does not call an LLM and does not install generated skills automatically.
It can additionally run a source-to-notes pipeline: parse a document,
auto-classify it as book / paper / document (hybrid heuristic + AI
confirmation), scaffold the matching note discipline, and stage the final note
for the Obsidian Knowledge-Hub vault. The agent does the reading and writing;
the scripts never call an LLM and never write into the vault directly.
Two modes:
- Cloud API: https://mineru.net/api/v4 (no GPU required, token-based)
- Local API: mineru-api --port 8000 (self-hosted, requires GPU or CPU backend)
Authentication (Cloud API)
- Token file: ~/.config/mineru/token
- Header: Authorization: Bearer <token>
- Get token: https://mineru.net/apiManage/token
mkdir -p ~/.config/mineru
echo "YOUR_TOKEN" > ~/.config/mineru/token
chmod 600 ~/.config/mineru/token
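For callers that talk to the cloud API from Python rather than the shell, reading the token file and building the header might look like the sketch below. The helper names read_token and bearer_header are hypothetical; only the file location and the header format come from the text above.

```python
from pathlib import Path
from typing import Dict

# Token location described in the authentication section.
DEFAULT_TOKEN_FILE = Path.home() / ".config" / "mineru" / "token"


def read_token(token_file: Path = DEFAULT_TOKEN_FILE) -> str:
    """Read the token from disk, dropping the trailing newline left by echo."""
    return Path(token_file).read_text(encoding="utf-8").strip()


def bearer_header(token: str) -> Dict[str, str]:
    """Build the Authorization header from a raw token string."""
    token = token.strip()
    if not token:
        raise ValueError("MinerU token is empty")
    return {"Authorization": f"Bearer {token}"}
```

Splitting file access from header construction keeps the header logic easy to check without touching the real token file.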
Limits (Cloud API)
| Item | Limit |
|---|---|
| Single file size | 200MB max |
| Single file pages | 600 pages max |
| Daily priority pages | 2000 pages/account |
| Batch upload | 200 files/request |
| Token validity | 90 days |
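A pre-flight check against these limits can save a failed upload. The sketch below is an assumption-laden illustration: check_cloud_limits is a hypothetical helper, the page count is supplied by the caller (no PDF library is used), and "200MB" is interpreted as 200 * 1024 * 1024 bytes, which the table does not specify.

```python
from typing import List

MAX_FILE_BYTES = 200 * 1024 * 1024  # assumption: "200MB" read as 200 MiB
MAX_PAGES = 600                     # single-file page limit from the table


def check_cloud_limits(size_bytes: int, page_count: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    if page_count > MAX_PAGES:
        problems.append(f"file has {page_count} pages, limit is {MAX_PAGES}; split the document")
    return problems
```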
Model Versions
| Model | Use Case | Speed | Notes |
|---|---|---|---|
| vlm | Default. MinerU2.5, complex layouts, highest accuracy | Slower | Cloud-recommended; needs GPU locally |
| pipeline | General documents, CPU-friendly | Fast | Pure CPU support, lower accuracy |
| MinerU-HTML | HTML output, preserves formatting | Medium | For web content |
| hybrid | - | - | RETIRED on the cloud API: returns code -10002 "version field invalid". Do not use against mineru.net. |
API Endpoints (Cloud)
Base URL: https://mineru.net/api/v4
1. Create Extraction Task (Single File)
POST /extract/task
| Param | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | yes | - | File URL (no direct upload) |
| model_version | string | no | vlm | vlm / pipeline / MinerU-HTML (hybrid retired, returns -10002) |
| is_ocr | bool | no | false | Enable OCR |
| enable_formula | bool | no | true | Formula recognition |
| enable_table | bool | no | true | Table recognition |
| language | string | no | ch | Document language |
| data_id | string | no | - | Custom identifier |
| page_ranges | string | no | - | e.g. "2,4-6" |
| callback | string | no | - | Callback URL for async results |
| extra_formats | array | no | - | ["docx"], ["html"], ["latex"] |
Response:
{"code": 0, "data": {"task_id": "xxx"}, "msg": "ok"}
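The parameter table and response shape above can be mirrored in a pair of small helpers. This is a sketch under assumptions: build_extract_payload and parse_task_id are hypothetical names, and the payload is assumed to be sent as a JSON request body, which the table implies but does not state.

```python
import json
from typing import Any, Dict, Optional, Sequence

# Model versions the table lists as accepted by the cloud API (hybrid is retired).
CLOUD_MODELS = ("vlm", "pipeline", "MinerU-HTML")


def build_extract_payload(
    url: str,
    model_version: str = "vlm",
    is_ocr: bool = False,
    enable_formula: bool = True,
    enable_table: bool = True,
    language: str = "ch",
    data_id: Optional[str] = None,
    page_ranges: Optional[str] = None,
    callback: Optional[str] = None,
    extra_formats: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Assemble the body for POST /extract/task using the documented defaults."""
    if model_version not in CLOUD_MODELS:
        raise ValueError(f"model_version {model_version!r} is not accepted by the cloud API")
    payload: Dict[str, Any] = {
        "url": url,
        "model_version": model_version,
        "is_ocr": is_ocr,
        "enable_formula": enable_formula,
        "enable_table": enable_table,
        "language": language,
    }
    # Optional fields are left out entirely when unset.
    optional = {"data_id": data_id, "page_ranges": page_ranges, "callback": callback}
    for key, value in optional.items():
        if value is not None:
            payload[key] = value
    if extra_formats:
        payload["extra_formats"] = list(extra_formats)
    return payload


def parse_task_id(response_body: str) -> str:
    """Extract task_id from a response, raising if code is not 0."""
    body = json.loads(response_body)
    if body.get("code") != 0:
        raise RuntimeError(f"MinerU error {body.get('code')}: {body.get('msg')}")
    return body["data"]["task_id"]
```

Rejecting hybrid on the client side avoids a round trip that would only return -10002.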
2. Get Task Results
GET /extract/task/{task_id}
States: pending, running, converting (format conversion in progress), done, failed. Only done and failed are terminal.
Done response: includes full_zip_url (download link)
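Because results arrive asynchronously, a client typically polls until the task settles. The loop below is a minimal sketch: wait_for_task is a hypothetical helper, the HTTP call is injected as a fetch callable so no client library is assumed, and the response field name "state" is an assumption. It treats done and failed as the only terminal states and everything else, including converting, as still in progress.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch() until the task is done; raise on failure or timeout.

    fetch is expected to return the task data for GET /extract/task/{task_id}.
    """
    for attempt in range(max_polls):
        data = fetch()
        state = data.get("state")  # assumed field name
        if state == "done":
            return data  # expected to include full_zip_url
        if state == "failed":
            raise RuntimeError(f"extraction failed: {data}")
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"task still not finished after {max_polls} polls")
```

Injecting sleep as a parameter lets the loop be exercised without real delays.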
3. Batch Upload Local Files
POST /file-urls/batch
Returns presigned upload URLs (valid 24h). System auto-submits extraction after upload.
4. Batch URL Extraction
POST /extract/task/batch
Submits multiple URLs at once and returns a batch_id.
5. Batch Results
GET /extract-results/batch/{batch_id}
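Large jobs have to be split so that no single batch request exceeds the 200-file limit from the limits table. The generator below is a sketch with a hypothetical name, chunk_for_batch. Applying the same limit to batch URL extraction is an assumption, since the table only names batch upload.

```python
from typing import Iterator, List, Sequence


def chunk_for_batch(items: Sequence[str], max_per_request: int = 200) -> Iterator[List[str]]:
    """Yield consecutive groups of at most max_per_request items."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(items), max_per_request):
        yield list(items[start:start + max_per_request])
```

Each yielded group would become one batch request and produce its own batch_id to poll.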
Local API (Self-Hosted)
Start Server
# FastAPI server
mineru-api --host 0.0.0.0 --port 8000
# Gradio WebUI
mineru-gradio --server-name 0.0.0.0 --server-port 7860
# OpenAI-compatible server (for remote VLM inference)
mineru-openai-server --port 30000
Environment Variables
| Variable | Description | Default |
|---|---|---|
| MINERU_MODEL_SOURCE | Model source: modelscope / huggingface | huggingface |
| MINERU_API_MAX_CONCURRENT_REQUESTS | Max concurrent API requests | Unlimited |
| MINERU_API_ENABLE_FASTAPI_DOCS | Enable /docs page | true |
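When the server is launched from a wrapper, these variables can be assembled into an environment mapping first. The sketch below uses a hypothetical helper, mineru_api_env. Writing the boolean as the strings "true" and "false" is an assumption based on the default shown in the table.

```python
import os
from typing import Dict, Mapping, Optional


def mineru_api_env(
    model_source: str = "huggingface",
    max_concurrent: Optional[int] = None,
    enable_docs: bool = True,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the MinerU variables applied."""
    env = dict(os.environ if base is None else base)
    env["MINERU_MODEL_SOURCE"] = model_source
    if max_concurrent is not None:
        # Left unset by default, which the table describes as unlimited.
        env["MINERU_API_MAX_CONCURRENT_REQUESTS"] = str(max_concurrent)
    env["MINERU_API_ENABLE_FASTAPI_DOCS"] = "true" if enable_docs else "false"
    return env


# Possible use (not executed here):
#   subprocess.run(["mineru-api", "--host", "0.0.0.0", "--port", "8000"],
#                  env=mineru_api_env(model_source="modelscope"))
```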
Local API Docs
Access at http://127.0.0.1:8000/docs after starting.
Use CLI with Remote Server
mineru -p input.pdf -o output/ -b hybrid-http-client -u http://server:30000
Error Codes
| Code | Issue | Fix |
|---|---|---|
| A0202 | Token invalid | Check Bearer prefix and token |
| A0211 | Token expired | Recreate at mineru.net |
| -60002 | Unrecognized format | Check file extension |
| -60005 | File too large | Max 200MB |
| -60006 | Too many pages | Max 600, split document |
| -60008 | URL timeout | Check URL accessibility |
| -60012 | Task not found | Verify task_id |
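The table above translates directly into a lookup that turns a raw code into an actionable message. This is a sketch: ERROR_FIXES and explain_error are hypothetical names, and the fallback text for unlisted codes is an assumption.

```python
from typing import Dict, Tuple, Union

# (issue, fix) pairs copied from the error code table.
ERROR_FIXES: Dict[str, Tuple[str, str]] = {
    "A0202": ("Token invalid", "Check Bearer prefix and token"),
    "A0211": ("Token expired", "Recreate at mineru.net"),
    "-60002": ("Unrecognized format", "Check file extension"),
    "-60005": ("File too large", "Max 200MB"),
    "-60006": ("Too many pages", "Max 600, split document"),
    "-60008": ("URL timeout", "Check URL accessibility"),
    "-60012": ("Task not found", "Verify task_id"),
}


def explain_error(code: Union[int, str]) -> str:
    """Map an API error code (int or str) to an issue and suggested fix."""
    key = str(code)
    issue, fix = ERROR_FIXES.get(key, ("Unknown error", "Check the response msg field"))
    return f"{key}: {issue}. Fix: {fix}"
```

Keys are stored as strings so that numeric codes such as -60005 and alphanumeric codes such as A0202 share one table.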
Helper Script
~/.claude/skills/mineru/scripts/mineru-parse.sh — full-featured CLI wrapper.
# URL mode
mineru-parse.sh https://example.com/doc.pdf
# Local file with options
mineru-parse.sh /path/to/file.pdf --model vlm --ocr --output /tmp/result
# Extra formats
mineru-parse.sh doc.pdf --format docx --format latex
# Page ranges
mineru-parse.sh doc.pdf --pages "1-5,8" --output ./results
# Auto-extract markdown from zip
mineru-parse.sh doc.pdf --output ./results --extract
# Extract without printing book-sized markdown and write a local manifest
mineru-parse.sh book.pdf --output ./results --extract --no-print-md --manifest ./results/parse_manifest.json
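When the wrapper is driven from another program, building the argument list explicitly avoids shell quoting problems. The sketch below uses a hypothetical helper, mineru_parse_argv, and only the flags that appear in the examples above. The order in which it emits flags is a choice made here, and it is assumed that the script accepts them in any order.

```python
from typing import List, Optional, Sequence


def mineru_parse_argv(
    source: str,
    model: Optional[str] = None,
    ocr: bool = False,
    output: Optional[str] = None,
    formats: Sequence[str] = (),
    pages: Optional[str] = None,
    extract: bool = False,
    no_print_md: bool = False,
    manifest: Optional[str] = None,
    script: str = "mineru-parse.sh",
) -> List[str]:
    """Build an argv list for mineru-parse.sh from keyword options."""
    argv = [script, source]
    if model:
        argv += ["--model", model]
    if ocr:
        argv.append("--ocr")
    if output:
        argv += ["--output", output]
    for fmt in formats:
        # --format is repeated once per extra format, as in the examples.
        argv += ["--format", fmt]
    if pages:
        argv += ["--pages", pages]
    if extract:
        argv.append("--extract")
    if no_print_md:
        argv.append("--no-print-md")
    if manifest:
        argv += ["--manifest", manifest]
    return argv
```

The resulting list can be handed to subprocess.run without shell=True, so values such as "1-5,8" need no extra quoting.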
Source-to-Skill Workflow
Use this when the user uploads a book, course, paper, manual, article collection, or other long-form text and wants the agent to learn which reusable skills can be extracted from it.
Commands
# Local files are uploaded to the MinerU cloud API; --cloud-ok is required.
~/.claude/skills/mineru/scripts/mineru-source-to-skill.sh /path/to/book.pdf \
--title "Book Title" \
--type auto \
--output ./source-workspaces \
--cloud-ok
# If the source is already parsed to Markdown, stage a pack directly.
~/.claude/skills/mineru/scripts/source-skill-pack.sh ./mineru-extracted/book \
--title "Book Title" \
--type auto \
--output ./source-skill-packs
Comp