A fork of Yuki, which is in turn a fork of Mokuro. Automatically OCRs manga pages and outputs the results to TXT files.
Requires Python 3.10 or higher.
- Clone the repository by using git or downloading the zip
- Install the required packages with pip install -r requirements.txt
- Place pages in the images folder
- Run main.py with python main.py
The outputs will have the same names as the pages and will be exported to the text folder. There is also a file named combined_text.txt that combines all the outputs into one, with page headers.
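The combining step can be pictured with a short sketch. This is not the code in main.py. It is a minimal illustration that assumes the per-page files are UTF-8 TXT files in the text folder; the function name combine_outputs and the "=== page ===" header format are hypothetical.

```python
from pathlib import Path


def combine_outputs(text_dir: str = "text", combined_name: str = "combined_text.txt") -> Path:
    """Concatenate every per-page TXT file in text_dir into one file with page headers."""
    folder = Path(text_dir)
    combined_path = folder / combined_name
    # Sort by file name for a stable page order; skip the combined file itself.
    pages = sorted(p for p in folder.glob("*.txt") if p.name != combined_name)
    sections = []
    for page in pages:
        body = page.read_text(encoding="utf-8").strip()
        # Header format is an assumption, not necessarily what main.py writes.
        sections.append(f"=== {page.stem} ===\n{body}\n")
    # A blank line separates consecutive page sections.
    combined_path.write_text("\n".join(sections), encoding="utf-8")
    return combined_path
```

The sort is lexical, so zero-padded page names (001, 002, ...) keep their order, while unpadded names such as 2 and 10 would not.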
Note: Remember that the OCR is not 100% accurate and you will still need to manually correct mistakes. This simply exists to speed up the process.
- Combine all the outputs into a single file, with headers signifying which page each section is from
- Have the OCR follow the text in "reading order" (This is probably complicated as hell to do, so it's just here as an "idea" of sorts.)
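As a rough starting point for the reading-order idea, here is a naive heuristic sketch. It assumes the OCR step can provide a bounding box for each piece of text, which this README does not confirm. The TextBox class, the reading_order function, and the row_tolerance value are all hypothetical. It groups boxes into rows from top to bottom, then reads each row right to left, as is usual for manga; real pages with irregular panel layouts would need something smarter.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """Hypothetical OCR result: bounding box in pixels plus the recognized text."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height
    text: str


def reading_order(boxes: list[TextBox], row_tolerance: float = 50.0) -> list[TextBox]:
    """Order boxes top-to-bottom in rows, right-to-left within each row."""
    rows: list[list[TextBox]] = []
    for box in sorted(boxes, key=lambda b: b.y):
        # Same row if this box starts within row_tolerance of the row's first box.
        if rows and box.y - rows[-1][0].y <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    ordered: list[TextBox] = []
    for row in rows:
        # Rightmost box first, compared by each box's right edge.
        ordered.extend(sorted(row, key=lambda b: b.x + b.w, reverse=True))
    return ordered
```

The row_tolerance value is in pixels and would have to be tuned to the page resolution; the default here is arbitrary.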