
Convert a scanned-image PDF file into a text-annotated PDF file.

Primary language: Go. License: MIT.

Jisui (自炊)

This tool is a PoC (proof of concept).

Jisui is a helper tool for creating e-books.
A scanned book ordinarily contains no text information, so you cannot search for text in the PDF.
Jisui extracts the text from a scanned book (PDF) and merges it into the PDF.

This tool depends on the Google Cloud Vision API to extract text,
so you need a GCP account and your own project.

Jisui (自炊) is Japanese slang for scanning a book to make an e-book.

Prerequisites

Install

$ go get github.com/sachaos/jisui

Usage

$ jisui -bucket [your GCS bucket] -font [Downloaded font] -output result.pdf [scanned PDF file]
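
If you want to drive the command above from a Go program (for example, to process several scans in a row), a minimal sketch might look like the following. It uses only the `-bucket`, `-font`, and `-output` flags shown in the usage line. The helper name `jisuiArgs` and the bucket, font, and file names are hypothetical placeholders, not part of jisui itself.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// jisuiArgs builds the argument list for the jisui command shown in Usage.
// It is a hypothetical helper for this sketch, not part of jisui.
func jisuiArgs(bucket, font, output, input string) []string {
	return []string{"-bucket", bucket, "-font", font, "-output", output, input}
}

func main() {
	// Placeholder values: replace them with your own bucket, font, and files.
	args := jisuiArgs("my-gcs-bucket", "font.ttf", "result.pdf", "scanned.pdf")

	// Check that jisui is installed before trying to run it.
	path, err := exec.LookPath("jisui")
	if err != nil {
		fmt.Fprintln(os.Stderr, "jisui not found in PATH:", err)
		return
	}

	// Run jisui and pass its output straight through to the terminal.
	cmd := exec.Command(path, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "jisui failed:", err)
		os.Exit(1)
	}
}
```

Keeping the argument construction in its own function makes it easy to call in a loop over several scanned files.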

Example

You can see an example PDF file.

Please download it and open it in a PDF viewer.

You can see the difference when you search for text.
