A modern, native toolchain for DAISY 2.02 talking books and EPUB 3 accessible publications. Single static binary. No JVM. No XSLT runtime.
Status: M0.5–M6.5 shipped, plus the Tier 1 1.0-readiness items: ACE accessibility validation, automatic cover lookup, parallel batch conversion, and stable JSON output for CI/pipeline use. M7 (WASM) and M8 (1.0 release) are the remaining milestones.
dpub produces production-ready EPUB 3 talking books from DAISY 2.02 sources, with both spec compliance and accessibility validated by default. It bundles EPUBCheck (the W3C tool for EPUB conformance) and ACE (the DAISY Consortium's WCAG checker) so producers can ship books that pass both validators in one pipeline — relevant for EU accessibility production workflows including those affected by the European Accessibility Act.
The reference toolchain for this conversion is the DAISY Pipeline 2 — excellent, mature, Java-based, ~150 MB of runtime, built on XSLT/XProc, and its daisy202-to-epub3 script currently produces output with two known EPUBCheck errors plus 31 warnings on a real audiobook. dpub takes a different shape:
- Production-ready by default: output is EPUBCheck-clean (0 errors / 0 warnings) on the reference 11h45m audiobook where Pipeline 2's output emits 2 errors and 31 warnings on the same input. No downstream remediation work.
- Validators bundled: `dpub validate` and `dpub a11y` run EPUBCheck and ACE on any EPUB, including dpub's own output. Both surfaces emit stable JSON via `--json` for CI/pipeline use.
- Audio-aware: optional MP3 → Opus re-encoding (≈2.5× smaller files at speech-friendly settings), local Whisper transcription with prose-shaped paragraph cleanup, automatic cover lookup via Open Library, all within one binary.
- Catalogue-ready: `dpub batch` walks a directory and converts every DAISY book in parallel, with per-book error recovery and a JSON summary on stdout. Pipeline 2 has no batch mode.
- Tiny native binary: a single static binary, around 5 MB once 1.0 ships. No JVM startup. No XSLT runtime. Embeddable as a Rust library, with a planned WebAssembly build.
Five commands from a fresh clone to a fully-working dpub:
# macOS — install build + runtime prerequisites
brew install cmake epubcheck ffmpeg
npm install -g @daisy/ace # optional: enables `dpub a11y`
# Build with the right GPU acceleration for the host
git clone https://github.com/11ways/dpub && cd dpub
./scripts/build.sh
# Download a Whisper model (only needed if you'll use --transcribe)
./target/release/dpub setup --whisper-model medium
# Confirm everything's green
./target/release/dpub doctor

`./scripts/build.sh` auto-detects Apple Silicon (Metal) / Linux+nvcc (CUDA) / falls back to CPU-only. Power users who want different feature flags call `cargo build --release -p dpub-cli` directly.
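The detection order described above can be sketched roughly as follows. This is not the contents of `scripts/build.sh`, only an illustration of the idea: the `metal` feature name appears elsewhere in this README, while `cuda` as a feature name, and combining `-p dpub-cli` with `--features`, are assumptions. The snippet prints the command it would run rather than running it.

```shell
# Sketch only: NOT the real scripts/build.sh.
# "metal" is named in this README; "cuda" is an assumed feature name.
detect_accel() {
    os=$(uname -s)
    arch=$(uname -m)
    if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
        echo metal   # Apple Silicon
    elif [ "$os" = "Linux" ] && command -v nvcc >/dev/null 2>&1; then
        echo cuda    # Linux with the CUDA compiler on PATH
    else
        echo cpu     # CPU-only fallback
    fi
}

accel=$(detect_accel)
if [ "$accel" = "cpu" ]; then
    echo "cargo build --release -p dpub-cli"
else
    echo "cargo build --release -p dpub-cli --features $accel"
fi
```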
`dpub doctor` shows the status of every prerequisite with platform-specific install hints. `dpub setup --whisper-model <size>` downloads a Whisper model into `~/.cache/dpub/models/` with SHA256 verification. `--transcribe` then auto-discovers the most recent cached model, so you don't have to thread `--whisper-model <path>` through every invocation. Sizes: `tiny`, `base`, `small`, `medium` (recommended for Dutch), `large-v3`.
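For scripts that need to inspect the model cache themselves, the two ideas above (pick the most recent cached model, check a SHA-256 digest) might look like the sketch below. The `ggml-*.bin` naming inside the cache directory and the helper names `newest_model` and `sha256_matches` are assumptions for illustration, not dpub's actual lookup code.

```shell
# Sketch only: cache layout and ggml-*.bin naming are assumed from the
# paths shown in this README.
MODEL_DIR="${MODEL_DIR:-$HOME/.cache/dpub/models}"

# Print the most recently modified model file; fail if none is cached.
newest_model() {
    newest=$(ls -t "$1"/ggml-*.bin 2>/dev/null | head -n 1)
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Succeed only if the file's SHA-256 digest equals the expected hex string.
sha256_matches() {
    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$1" | awk '{print $1}')
    else
        actual=$(shasum -a 256 "$1" | awk '{print $1}')   # e.g. stock macOS
    fi
    [ "$actual" = "$2" ]
}
```

A script could then call `newest_model "$MODEL_DIR"` and pass the result to `--whisper-model` explicitly when it does not want to rely on auto-discovery.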
git clone https://github.com/11ways/dpub
cd dpub
cargo build --release

./target/release/dpub info /path/to/daisy/

`dpub info` accepts either an `ncc.html` file directly or a directory containing one. Example output for a real Flemish DAISY 2.02 audiobook:
Title: Ontmoetingen in het donker
Creator: Geertje De Ceuleneer
Date: 2008-05-13
Language: nl
Multimedia: audioNCC
Total time: 11:45:09
Navigation:
Headings: 30 (h1: 19, h2: 11)
Pages: 334
SMIL:
Sections: 30
Synch points: 364
Audio clips: 10532
Audio total: 11:45:09
Audio files: 30
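The summary is plain text, so it is easy to post-process. As a small illustration, the sketch below turns the `Total time` line into seconds and estimates the nominal audio payload at a constant bitrate. The helper names are hypothetical, and the estimate ignores container overhead and variable-bitrate effects, so real files will differ.

```shell
# Read `dpub info` output on stdin and print the "Total time" value in seconds.
total_seconds() {
    awk '/Total time:/ { split($NF, t, ":"); print t[1]*3600 + t[2]*60 + t[3] }'
}

# Nominal payload in bytes for <seconds> of audio at <kbit/s>.
# Rough estimate only: no container overhead, constant bitrate assumed.
estimated_bytes() {
    echo $(( $1 * $2 * 1000 / 8 ))
}
```

For the book above, `printf 'Total time: 11:45:09\n' | total_seconds` prints `42309`, and `estimated_bytes 42309 32` prints `169236000` (about 169 MB at 32 kbit/s).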
./target/release/dpub convert /path/to/daisy/ -o book.epub

Add validation, audio recompression, and a cover lookup, all in one pass:
./target/release/dpub convert /path/to/daisy/ \
-o book.epub \
--audio opus --bitrate 32 \
--auto-cover \
--validate --a11y

For audio-only DAISY books, transcribe to fill the text layer (requires a GGML Whisper model; enable Apple-Silicon Metal acceleration via `cargo build --release --features metal`):
./target/release/dpub convert /path/to/daisy/ \
-o book.epub \
--transcribe nl \
--whisper-model ~/models/ggml-medium.bin

./target/release/dpub batch /path/to/daisy-books/ \
-o /path/to/output/ \
--jobs 4 \
--audio opus --bitrate 32

Walks the input directory for every `ncc.html`, converts each book in parallel via rayon, emits a JSON summary on stdout. Per-book errors are recorded but never halt the queue. Exit code is non-zero when any book failed.
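To make the error-handling contract concrete, here is a sequential stand-in for the behaviour just described, written as a plain shell loop around `dpub convert`. It is a sketch, not how `dpub batch` is implemented: it runs one book at a time, prints text rather than JSON, and the `DPUB` variable, the `batch_convert` name, and the output file naming are all assumptions.

```shell
# Sequential illustration of "record per-book errors, never halt the queue".
# DPUB and the <book-dir-name>.epub output naming are assumptions.
DPUB="${DPUB:-./target/release/dpub}"

batch_convert() {
    in_dir=$1
    out_dir=$2
    failed=0
    mkdir -p "$out_dir"
    # One book per ncc.html found anywhere under the input directory.
    while IFS= read -r ncc; do
        [ -n "$ncc" ] || continue
        book_dir=$(dirname "$ncc")
        name=$(basename "$book_dir")
        if "$DPUB" convert "$book_dir" -o "$out_dir/$name.epub" </dev/null; then
            echo "ok     $name"
        else
            echo "failed $name"
            failed=$((failed + 1))   # record the error, keep going
        fi
    done <<EOF
$(find "$in_dir" -type f -name ncc.html | sort)
EOF
    [ "$failed" -eq 0 ]              # non-zero status if any book failed
}
```

A CI job can rely on the same contract with the real command: run `dpub batch`, keep the JSON summary from stdout as an artifact, and fail the job on a non-zero exit code.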
./target/release/dpub validate book.epub
./target/release/dpub a11y book.epub
./target/release/dpub validate book.epub --json | jq .epubcheck.summary

`dpub validate` requires `epubcheck` on PATH (`brew install epubcheck`, or download from w3c/epubcheck). `dpub a11y` requires `ace` (`npm install -g @daisy/ace`).
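`dpub doctor` is the supported way to check prerequisites. For a provisioning script that runs before dpub itself is built, a minimal PATH check along these lines may be enough; the `missing_tools` helper is hypothetical and only tests whether each name resolves on PATH, not whether the tool works.

```shell
# Print every named tool that does not resolve on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools epubcheck ace ffmpeg)
if [ -n "$missing" ]; then
    echo "not on PATH:" $missing
fi
```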
Tip: opening the converted EPUB. Most generic readers (including Apple Books) display EPUB 3 but won't play Media Overlays, so an audio DAISY converted by `dpub` will look like a silent shell. Use Thorium Reader (free, open-source, EDRLab), the reference Media Overlays implementation, to get synced text-and-audio playback. On macOS: `brew install --cask thorium`.
| Milestone | Scope |
|---|---|
| M0.5 | `dpub info`: read NCC metadata and nav summary. ✅ |
| M1 | Full DAISY parser: NCC + master.smil + per-section SMIL + audio metadata; structural round-trip. ✅ |
| M2 | Minimal EPUB 3 writer (audio-only with Media Overlays). EPUBCheck-clean. ✅ |
| M3 | End-to-end: `dpub convert <ncc.html> -o out.epub`. ✅ |
| M4 | Built-in validation (EPUBCheck + ACE): `dpub validate`, `dpub a11y`. ✅ |
| M5 | Audio recompression (MP3 → Opus): `dpub convert --audio opus --bitrate <kbps>`. ✅ |
| M6 | Whisper transcription for audio-only books: `dpub convert --transcribe <lang> --whisper-model <path>`. ✅ (segments are merged into prose-shaped paragraphs by default; pass `--no-text-cleanup` for raw output) |
| M6.5 | Word-level Media Overlay sync: karaoke-style highlight-along-with-audio in reading systems that honour Media Overlays. Default-on with `--transcribe`; pass `--no-word-sync` to fall back to per-paragraph sync. ✅ |
| Tier 1 polish | Whisper model caching, cover lookup (`--cover` and `--auto-cover`), parallel batch mode, JSON output for validators. ✅ |
| M7 | WASM build for browser-based conversion (planned scope: info + validate only — Whisper / ffmpeg are too heavy for a browser tab). |
| M8 | 1.0 release: signed binaries for macOS / Linux / Windows. |
dpub/
├── crates/
│ ├── dpub-core/ # DAISY 2.02 model + parser
│ ├── epub3-writer/ # EPUB 3 model + ZIP serialiser (Media Overlays, cover image)
│ ├── dpub-convert/ # DAISY 2.02 → EPUB 3 conversion driver
│ ├── dpub-validate/ # EPUBCheck + ACE wrappers, structured Report
│ ├── dpub-audio/ # ffmpeg-backed MP3 → Opus re-encoder
│ ├── dpub-whisper/ # local Whisper transcription (whisper.cpp via FFI)
│ ├── dpub-meta/ # external metadata lookup (Open Library covers)
│ ├── dpub-util/ # tiny shared utilities (XML escaping)
│ └── dpub-cli/ # `dpub` binary
└── ...
dpub-wasm lands with M7.
A few opt-in environment variables turn on integration tests that need external assets or are slow:
| Variable | Effect |
|---|---|
| `DPUB_TEST_BOOK=/path/to/ncc.html` | Enables the round-trip and conversion tests against a real DAISY book on disk. |
| `DPUB_TEST_OPUS=1` (with `DPUB_TEST_BOOK`) | Enables the full-book Opus re-encode test (slow: minutes). |
| `DPUB_TEST_WHISPER_MODEL=/path/ggml-*.bin` and `DPUB_TEST_AUDIO=/path/audio.mp3` | Enable the Whisper smoke test in `dpub-whisper`. Optional `DPUB_TEST_WHISPER_LANG=nl`. |
| `epubcheck` on PATH | The `dpub-validate` and `epub3-writer` integration tests will run EPUBCheck and assert zero errors. They skip silently if the binary is missing. |
| `ace` on PATH | (No CI-gated test yet.) Enables `dpub a11y` end-to-end. Install with `npm install -g @daisy/ace`. |
| `DPUB_TEST_OPENLIBRARY=1` | Enables the live Open Library cover-lookup smoke test in `dpub-meta`. Skipped on CI to avoid flakiness on third-party-service availability. |
| `ffmpeg` on PATH | The `dpub-audio` and Opus re-encoding tests run; they skip silently otherwise. |
| `cmake` on PATH | Required to build `dpub-whisper` (and therefore `dpub-cli` once it depends on it). The `whisper-rs-sys` crate compiles whisper.cpp from source. |
Without any of these, cargo test still runs the full unit test suite
on every platform.
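Before a long test run it can help to see which opt-in suites the current shell would switch on. The sketch below mirrors the environment-variable rows of the table above. The `optin_status` helper and its labels are informal, and treating any non-empty value as "set" is an assumption about how the tests read these variables.

```shell
# Sketch: list the opt-in suites the current environment would enable.
# Labels are informal descriptions, not real test names.
optin_status() {
    if [ -n "${DPUB_TEST_BOOK:-}" ]; then
        echo "enabled: round-trip and conversion tests"
        if [ "${DPUB_TEST_OPUS:-}" = "1" ]; then
            echo "enabled: full-book Opus re-encode test"
        fi
    fi
    if [ -n "${DPUB_TEST_WHISPER_MODEL:-}" ] && [ -n "${DPUB_TEST_AUDIO:-}" ]; then
        echo "enabled: Whisper smoke test"
    fi
    if [ "${DPUB_TEST_OPENLIBRARY:-}" = "1" ]; then
        echo "enabled: Open Library cover-lookup smoke test"
    fi
    return 0
}

optin_status
```

A typical opt-in run then sets the variables inline, for example `DPUB_TEST_BOOK=/path/to/ncc.html cargo test`.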
See CONTRIBUTING.md. All contributors are expected to follow the Code of Conduct.
Dual-licensed under either of:
- Apache License, Version 2.0 (`LICENSE-APACHE`)
- MIT license (`LICENSE-MIT`)

at your option. This is the standard Rust dual-license, allowing maximum compatibility with downstream projects.
The DAISY format, stewarded by the DAISY Consortium, has been the global standard for accessible publications for decades. dpub builds on that work by the consortium and the Pipeline 2 team; we use their reference output as a correctness baseline.
Maintained by Eleven Ways, a Belgian digital accessibility consultancy.