PDF Page Extract Skill
Purpose
This skill extracts all necessary data from PDF pages to enable accurate AI-driven HTML generation. It produces three critical artifacts:
- Rich extraction data - Text spans with font metadata (sizes, styles, positions)
- Rendered PNG image - Visual reference for AI to understand page layout
- Page mapping - Authoritative mapping of PDF indices to book pages
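As a rough illustration of how these three artifacts could fit together for one page, here is a minimal sketch using only the Python standard library. All names in it (`TextSpan`, `build_page_mapping`, `page_record`, the `page_0002.png` file name, and the offset-based mapping rule) are assumptions made for this example, not the skill's actual schema or API.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class TextSpan:
    """Hypothetical shape for one text span with font metadata."""
    text: str
    font: str
    size: float    # font size in points
    bold: bool
    italic: bool
    bbox: tuple    # (x0, y0, x1, y1) position on the page


def build_page_mapping(pdf_page_count, first_numbered_index, first_book_page=1):
    """Map 0-based PDF indices to book page numbers.

    Assumes a simple constant offset; indices before
    first_numbered_index (front matter) map to None.
    """
    mapping = {}
    for pdf_index in range(pdf_page_count):
        if pdf_index < first_numbered_index:
            mapping[pdf_index] = None
        else:
            mapping[pdf_index] = first_book_page + (pdf_index - first_numbered_index)
    return mapping


def page_record(pdf_index, spans, mapping, png_path):
    """Bundle the three artifacts for one page into a JSON-serializable dict."""
    return {
        "pdf_index": pdf_index,
        "book_page": mapping[pdf_index],
        "image": png_path,  # path to the rendered PNG (rendering not shown)
        "spans": [asdict(span) for span in spans],
    }


mapping = build_page_mapping(pdf_page_count=5, first_numbered_index=2)
spans = [TextSpan("Chapter 1", "Times-Bold", 18.0, True, False, (72.0, 90.0, 210.0, 112.0))]
record = page_record(2, spans, mapping, "page_0002.png")
print(json.dumps(record, indent=2))
```

Keeping the mapping as a plain dictionary keyed by PDF index means later stages can look up the book page without recomputing any offset, which is one way to treat the mapping as authoritative.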
This is the deterministic, Python-based foundation for the entire pipeline. All e
[Description truncated. See the full README on GitHub.]