A privacy-first document image extractor that runs entirely in your browser. Extract images from documents and eBooks without uploading files to any server.
- Privacy by design — All processing happens in your browser.
- Wide format support — DOCX, PPTX, XLSX, Keynote, Pages, Numbers, EPUB, MOBI, AZW3.
- Lossless extraction — Images extracted directly from document structures.
- Smart filtering — Automatically removes icons and thumbnails under 10KB.
- Batch download — Export all images as a single ZIP file.
Limitations: Older Office formats (.doc, .ppt, .xls) and DRM-protected eBooks are not supported.
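As an illustration of the smart filtering feature, here is a minimal sketch of a size-based filter. The `ExtractedImage` type, the `filterSmallImages` function, and the reading of "10KB" as 10 × 1024 bytes are assumptions made for this example, not docex's actual code.

```typescript
// Hypothetical shape of an extracted image; docex's real type may differ.
interface ExtractedImage {
  name: string;
  bytes: Uint8Array;
}

// Assumption: "10KB" is interpreted as 10 * 1024 bytes.
const MIN_IMAGE_BYTES = 10 * 1024;

// Keep only images whose byte size is at least the threshold.
function filterSmallImages(
  images: ExtractedImage[],
  minBytes: number = MIN_IMAGE_BYTES,
): ExtractedImage[] {
  return images.filter((image) => image.bytes.byteLength >= minBytes);
}
```

A size threshold is a cheap heuristic: it needs no image decoding, but it can also drop small images that a user actually wants, so making `minBytes` configurable is one reasonable design choice.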
Prerequisites:
- Node.js 18 or higher
- npm, yarn, or pnpm
Clone the repository and install dependencies:
git clone https://github.com/Eyozy/docex.git
cd docex
npm install
Start the development server:
npm run dev
Open your browser and navigate to http://localhost:5173
Build for production:
npm run build
Preview the production build:
npm run preview
The build output will be in the dist directory, ready for deployment to any static hosting service.
docex/
├── src/
│ ├── components/ # UI components
│ ├── composables/ # Composition functions
│ ├── workers/ # Background processing
│ ├── utils/ # Helper functions
│ └── i18n/ # Translations
├── public/
└── vite.config.ts
To add support for a new format:
- Add file signature detection in src/workers/extractor.worker.ts
- Implement the extraction logic in the appropriate parser
- Update UI translations in src/i18n/en-US.ts and zh-CN.ts
- Add the format extension to accepted types in src/components/DropZone.vue
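The first step above, file signature detection, can be sketched roughly as follows. This is not the code in `extractor.worker.ts`; the `ContainerKind` type and the `detectContainer` function are hypothetical names for illustration. The sketch relies on well-known magic bytes: ZIP-based packages such as DOCX, PPTX, XLSX, and EPUB start with `PK\x03\x04`, MOBI and AZW3 files carry `BOOKMOBI` at byte offset 60 of the Palm database header, and legacy OLE files (.doc, .ppt, .xls) start with `D0 CF 11 E0 A1 B1 1A E1`.

```typescript
// Hypothetical result type; the real worker may classify files differently.
type ContainerKind = "zip" | "mobi" | "ole-legacy" | "unknown";

const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04]; // "PK\x03\x04"
const OLE_SIGNATURE = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];
const MOBI_SIGNATURE = Array.from("BOOKMOBI", (c) => c.charCodeAt(0));
const MOBI_OFFSET = 60; // type/creator fields of the Palm database header

// True if `bytes` contains `signature` starting at `offset`.
function matchesAt(bytes: Uint8Array, signature: number[], offset = 0): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((value, i) => bytes[offset + i] === value);
}

// Classify a file by its leading bytes rather than by its extension.
function detectContainer(bytes: Uint8Array): ContainerKind {
  if (matchesAt(bytes, ZIP_SIGNATURE)) return "zip";
  if (matchesAt(bytes, MOBI_SIGNATURE, MOBI_OFFSET)) return "mobi";
  if (matchesAt(bytes, OLE_SIGNATURE)) return "ole-legacy";
  return "unknown";
}
```

A ZIP signature only identifies the container. Telling DOCX from EPUB, for example, would still require inspecting entries inside the archive, which this sketch does not attempt.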
Contributions are welcome. Please feel free to submit a pull request.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes
- Push to the branch (git push origin feature/amazing-feature)
- Open a pull request
MIT License — see the LICENSE file for details.