This project is a demo that uses the SmolDocling-256M model to perform document understanding tasks. It lets you transform document images into structured formats such as Markdown and JSON.
- Intelligent Content Extraction: Extracts structures from documents, like:
  - Tables
  - Math formulas (converted to LaTeX)
  - Code blocks
- Structured Output: Converts document content into markdown and JSON
- Region-Specific Processing: Select a specific area of the document to process only the content you need.
- Fully Offline: All processing happens on your device in the browser. Your data never leaves your computer.
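
As an illustration of the region-specific processing feature, a selection drawn over a document image has to be normalized and kept inside the image before it can be cropped. The sketch below shows one way this could be done. The `Region` type and the `clampRegion` function are hypothetical names chosen for this example and are not taken from the project's source.

```typescript
// Hypothetical type for this example; not part of the project's code.
interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

/**
 * Normalize a user-drawn selection and clamp it to the image bounds.
 * Returns null when the selection does not overlap the image at all.
 */
function clampRegion(
  sel: Region,
  imageWidth: number,
  imageHeight: number,
): Region | null {
  // A drag that goes up or left yields a negative width or height,
  // so derive the edges with min/max instead of trusting the signs.
  const left = Math.min(sel.x, sel.x + sel.width);
  const right = Math.max(sel.x, sel.x + sel.width);
  const top = Math.min(sel.y, sel.y + sel.height);
  const bottom = Math.max(sel.y, sel.y + sel.height);

  // Snap to whole pixels and keep every edge inside the image.
  const x0 = Math.max(0, Math.floor(left));
  const y0 = Math.max(0, Math.floor(top));
  const x1 = Math.min(imageWidth, Math.ceil(right));
  const y1 = Math.min(imageHeight, Math.ceil(bottom));

  // Nothing left to crop if the clamped rectangle is empty.
  if (x1 <= x0 || y1 <= y0) {
    return null;
  }
  return { x: x0, y: y0, width: x1 - x0, height: y1 - y0 };
}
```

In a browser, the resulting rectangle could be passed as the source rectangle to `CanvasRenderingContext2D.drawImage` to copy only the selected area onto a canvas before handing it to the model.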
To run this project locally, follow these steps:
- Clone the repo:
  git clone https://github.com/callbacked/smoldocling256M-webgpu
- Navigate to the project directory:
  cd smoldocling256M-webgpu
- Install NPM packages:
  npm install
- Run:
  npm run dev
This will start the Vite development server, and you can view the application at http://localhost:5173 (or another port if 5173 is in use).
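
If you would rather have the dev server stop with an error than move to a different port, Vite's `server.strictPort` option can be used. The fragment below is a minimal sketch of a `vite.config.ts`, not the project's actual configuration. It assumes you merge the `server` block into whatever config the repository already ships, since that file may contain plugins and other settings.

```typescript
// vite.config.ts (sketch; merge into the existing config rather than replacing it)
import { defineConfig } from 'vite';

export default defineConfig({
  server: {
    port: 5173,       // Vite's default dev-server port
    strictPort: true, // exit with an error instead of trying the next free port
  },
});
```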
To create a production build:
npm run build