/doc2md
Local document-to-markdown converter. Zero LLM tokens spent on conversion.
Default behavior
No arguments:
- Processes .raw-docs/ → saves output to docs/
- Only converts new or changed files (MD5 cache)
- Audio files (.mp3, .wav) → tells user to use /whisper
With arguments: follows the explicit command.
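The "new or changed files" check can be sketched as a digest comparison. This is a minimal illustration, not the actual logic in convert.py: the helper names `md5_of` and `needs_conversion`, and the cache layout (a mapping from file path to hex digest), are assumptions made for the example.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_conversion(path: Path, cache: dict[str, str], force: bool = False) -> bool:
    """True if the file is new, differs from its cached digest, or force is set.

    `cache` maps str(path) -> hex digest (hypothetical layout).
    """
    return force or cache.get(str(path)) != md5_of(path)
```

Under these assumptions, `--force` corresponds to `force=True`: the digest comparison is bypassed and the file is converted regardless of the cache.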
Usage
/doc2md # .raw-docs/ → docs/ (default)
/doc2md <file.pdf> # single file
/doc2md <folder> # entire folder (recursive)
/doc2md --url <url> # YouTube, Wikipedia, HTML
/doc2md <folder> --watch # auto-convert on new files
/doc2md <folder> --out <dir> # custom output folder
/doc2md <file> --force # skip cache
/doc2md <folder> --list # show cache status
Supported formats
.pdf .docx .pptx .xlsx .xls .csv .epub .ipynb .msg .zip
URLs: YouTube (full transcript), Wikipedia, generic HTML.
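Routing a file by extension might look like the sketch below. The extension sets come from the lists above; the function name `classify`, the returned labels, the case-insensitive matching, and the "unsupported" fallback are assumptions for illustration, since the source does not say how other file types are handled.

```python
from pathlib import Path

# Extensions listed under "Supported formats".
SUPPORTED = {".pdf", ".docx", ".pptx", ".xlsx", ".xls", ".csv",
             ".epub", ".ipynb", ".msg", ".zip"}
# Audio extensions that are handed off to /whisper instead.
AUDIO = {".mp3", ".wav"}


def classify(name: str) -> str:
    """Return 'convert', 'whisper', or 'unsupported' for a file name.

    Matching is case-insensitive here; that is an assumption of this sketch.
    """
    suffix = Path(name).suffix.lower()
    if suffix in SUPPORTED:
        return "convert"
    if suffix in AUDIO:
        return "whisper"
    return "unsupported"
```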
Steps when invoked
1. Check dependency
python -c "from markitdown import MarkItDown; print('ok')"
If missing: pip install 'markitdown[all]' (quoted so that shells such as zsh do not treat the brackets as a glob pattern)
2. Run the converter
python ~/.claude/skills/doc2md/scripts/convert.py <input> [flags]
Returns JSON. Parse and display as a results table.
3. Report to user
✓ report.pdf → report.md (2,309 words, ~1,732 tokens saved)
⚠️ memo.pdf (cached, 587 words)
✗ audio.mp3 → use /whisper
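The three steps above can be sketched in Python as follows. The script path is the one given in step 2; everything about the JSON shape is hypothetical (a list of objects with `source`, `output`, `status`, and `words` keys, and the status values `converted`, `cached`, and `audio`), because the source only states that the converter returns JSON. The function names are likewise illustrative.

```python
import importlib.util
import json
import subprocess
import sys
from pathlib import Path

# Path taken from step 2 of the document.
CONVERTER = Path.home() / ".claude/skills/doc2md/scripts/convert.py"


def dependency_ok() -> bool:
    """Step 1: check that the markitdown package can be located."""
    return importlib.util.find_spec("markitdown") is not None


def run_converter(target: str, *flags: str) -> list:
    """Step 2: run convert.py and parse its JSON output from stdout."""
    proc = subprocess.run(
        [sys.executable, str(CONVERTER), target, *flags],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)


def format_row(item: dict) -> str:
    """Step 3: render one result line (keys and statuses are hypothetical)."""
    source, status = item["source"], item["status"]
    if status == "converted":
        return f"✓ {source} → {item['output']} ({item['words']:,} words)"
    if status == "cached":
        return f"⚠️ {source} (cached, {item['words']:,} words)"
    if status == "audio":
        return f"✗ {source} → use /whisper"
    return f"✗ {source} ({status})"
```

Note that `find_spec` only confirms the package is installed; the one-liner in step 1 goes further and actually imports `MarkItDown`, so it also catches a broken installation.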
graphify integration
/doc2md # 0 tokens: convert docs to .md locally
/graphify docs/ # reads .md instead of PDF: 60-80% fewer tokens