A fast comic cleaner/typesetter/translator utility.
- Open and edit `.jpg`, `.png`, `.tiff`, `.zip`, `.rar`, `.cbz`, and `.cbr` images/archives.
- Recognize text using optical character recognition.
- Content-aware cleaning using advanced inpainting techniques.
- Mask-based cleaning using text boxes.
- Translate text using Google Translate, straight from the software.
- Add text bubbles and change font size and family.
- Easily export and share text boxes and text bubbles with others.
- Immediately export cleaned and typeset pages.
Sounds too good to be true? Check out this demo: to be added.
Simply run the installer for your platform, and you can use the core features.
If you want to use the OCR (optical character recognition) features, you will also need to install Tesseract-OCR. If Tesseract is not installed, the app warns you at each launch that OCR features are unavailable and shows platform-specific installation instructions. The same instructions are repeated here:
On macOS, install Homebrew or MacPorts and run the matching command:
# brew
brew install tesseract
# ports
sudo port install tesseract
On Windows, use the Tesseract installer from UB Mannheim. In particular, install `tesseract-ocr-w64-setup-v4.1.0.20190314.exe` (simply go through the installation without checking any additional options). Locate the folder where Tesseract-OCR is installed (usually `C:\Program Files\Tesseract-OCR` or `C:\Program Files (x86)\Tesseract-OCR`), and add it to your PATH variable as follows:
- Press the Windows button and search for 'Edit the system environment variables'.
- Click the 'Environment Variables' button near the bottom of the dialog.
- Select 'Path' in the 'System variables' list, and press 'Edit'.
- In the window that just opened, press 'New', and paste the path to Tesseract-OCR.
That's it, you should be good to go now.
On Linux, install Tesseract-OCR following the instructions here. Be sure to install Tesseract v4, since it is directly compatible with the bundled language files.
If you are on Ubuntu, simply run `sudo apt install tesseract-ocr`.
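Whichever platform you are on, you can confirm that Tesseract is reachable through your PATH before launching the app. The following is a minimal sketch, not part of the app itself: the function names are hypothetical, and it assumes the first line of `tesseract --version` starts with `tesseract` followed by the version number, optionally prefixed with `v`.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_major_version(version_output: str) -> Optional[int]:
    """Extract the major version from the first line of `tesseract --version`."""
    lines = version_output.splitlines()
    first_line = lines[0].strip() if lines else ""
    # Accept both "tesseract 4.1.0" and "tesseract v4.1.0.20190314".
    match = re.match(r"tesseract\s+v?(\d+)", first_line)
    return int(match.group(1)) if match else None


def installed_tesseract_major() -> Optional[int]:
    """Return the major version of the tesseract found on PATH, or None."""
    executable = shutil.which("tesseract")
    if executable is None:
        return None  # Not on PATH: OCR features would be unavailable.
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True
    )
    # Some releases print the version to stderr, so fall back to it.
    return parse_major_version(result.stdout or result.stderr)


if __name__ == "__main__":
    major = installed_tesseract_major()
    if major is None:
        print("Tesseract was not found on PATH (or its version is unreadable).")
    elif major != 4:
        print(f"Found Tesseract v{major}; the bundled language files target v4.")
    else:
        print("Found Tesseract v4.")
```

If the script reports that Tesseract was not found right after installing it, open a new terminal first, since PATH changes usually only apply to newly started processes.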
More detailed instructions will be added later. For now, check out this set of shortcuts:
Keystroke | Action |
---|---|
I / J / K / L | Move selected; hold shift to nudge |
A | Add box |
shift+A | Select all items |
D / backspace / delete | Remove selected items |
shift+{D / backspace / delete} | Remove flagged items |
E | Edit box |
F | Flag selected |
shift+F | Unflag selected |
G | Group selected |
shift+G | Ungroup selected |
H | Hide selected |
shift+H | Unhide all |
S | OCR rubberband selection |
shift+S | OCR rubberband tight selection |
R | Restore selected |
W | Inpaint black |
shift+W | Inpaint white |
[ | Scale down |
shift+[ | Fine scale down |
] | Scale up |
shift+] | Fine scale up |
Note: Replace ctrl with command on macOS.
Keystroke | Action |
---|---|
ctrl+E | Export LSTMBox |
ctrl+shift+E | Export TXTEll |
ctrl+I | Page information |
ctrl+L | Load LSTMBox |
ctrl+shift+L | Load TXTEll |
ctrl+O | Open image/archive |
ctrl+P | Prescan page |
ctrl+S | Save cleaned image |
ctrl+shift+S | Save current scene as image |
ctrl+T | Translate page |
You can access the `config.json` file in your installation folder (the exact location depends on your operating system). The settings should be mostly self-explanatory, but detailed documentation will follow soon.
One thing to look out for is the `language` tag. The default values are for vertical Japanese text, but you can change this by setting `"isVertical" : false` and changing the `language` tag to match one of the files in the `tessdata` folder. You can download more `.traineddata` files here; simply add the files to the `tessdata` folder in the installation location.
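To see which values are available for the language tag, you can list the trained-data files in the tessdata folder. This is a small sketch with a hypothetical helper name; it assumes the tag is simply the file name without the `.traineddata` suffix, and you have to supply the folder path yourself since it depends on your installation.

```python
from pathlib import Path
from typing import List


def available_languages(tessdata_dir: Path) -> List[str]:
    """Return the names of all .traineddata files in tessdata_dir, without suffix."""
    # Path.stem drops only the final suffix, so "jpn_vert.traineddata"
    # becomes "jpn_vert".
    return sorted(path.stem for path in tessdata_dir.glob("*.traineddata"))


# Example (replace the path with your own installation's tessdata folder):
# print(available_languages(Path("tessdata")))
```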
To change the translation language, set the `translationLanguage` tag to a two-letter language code. These language codes can be found in `translator_constants.py`. The source language is currently detected automatically, although we will probably add a feature to set it manually.
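If you prefer to script these changes rather than edit the file by hand, something along the following lines should work. It is a sketch under assumptions: `update_config` is a hypothetical helper, and it assumes the `language`, `isVertical`, and `translationLanguage` tags are top-level keys of the JSON object. Check your own config.json to confirm the layout before relying on it.

```python
import json
from pathlib import Path


def update_config(config_path: Path, **changes) -> dict:
    """Load a JSON config, overwrite the given top-level keys, and save it."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config.update(changes)
    # ensure_ascii=False keeps any non-ASCII values readable in the file.
    config_path.write_text(
        json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return config


# Example: switch to horizontal English text, translated into French.
# The values "eng" and "fr" are illustrative; use a file from your tessdata
# folder and a code listed in translator_constants.py.
# update_config(Path("config.json"), language="eng", isVertical=False,
#               translationLanguage="fr")
```

Keys that are not passed in are left untouched, so the rest of your settings survive the rewrite.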
Note: This section will be greatly expanded upon in the future.
We use a modified version of Tesseract's LSTM `.box` format (see here for more info), encoding the box to which each glyph belongs, as well as what group it belongs to, if any. This format has the following form:
<symbol> <left> <bottom> <right> <top> <group>
and uses the `.box` extension.
We use a special character, known as the unit separator ('␟', U+001F), to prevent a textbox from being resized based on its content. This allows you to create masking boxes that only serve to cover text, but are ignored during collation/translation.
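As an illustration of the line format above, here is a sketch of a parser for a single entry. It is not the app's own reader: `BoxEntry` and `parse_box_line` are hypothetical names, and it assumes the fields are separated by single spaces, that the coordinates are integers, and that a masking box carries the unit separator in the symbol field. The group field is kept as raw text because the encoding of "no group" is not described here.

```python
from dataclasses import dataclass

UNIT_SEPARATOR = "\x1f"  # U+001F, assumed to mark a masking-only box


@dataclass
class BoxEntry:
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    group: str  # raw text; how "no group" is encoded is not specified

    @property
    def is_mask(self) -> bool:
        """True if this entry only covers text and carries no content."""
        return self.symbol == UNIT_SEPARATOR


def parse_box_line(line: str) -> BoxEntry:
    """Parse one '<symbol> <left> <bottom> <right> <top> <group>' line."""
    # Strip only line endings: a bare strip() would also remove U+001F.
    # Split on literal spaces from the right for the same reason, since
    # str.split() without arguments treats U+001F as whitespace.
    symbol, left, bottom, right, top, group = line.rstrip("\r\n").rsplit(" ", 5)
    return BoxEntry(symbol, int(left), int(bottom), int(right), int(top), group)
```

Splitting from the right also keeps the parser working if the symbol field itself ever contains a space.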
We call text bubbles 'ellipsoids' in RetCom to avoid confusing them with textboxes. Each ellipsoid stores its content, font size and family, color, position, and width and height.
We use the following convention:
<content> <size> <family> <color> <x> <y> <w> <h>
and we use the `.ell` extension. For details on serialization, check the source code.
You can use HTML formatting in the text bubbles, so bold text would be `<b>bold</b>`, and italics would be `<i>italics</i>`. Try out other HTML tags and see what works for yourself!
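If you generate bubble text programmatically, keep in mind that content interpreted as HTML will presumably treat literal `<`, `>`, and `&` characters as markup. The sketch below shows one way to wrap text in a tag while escaping those characters. The helper name `styled` is hypothetical, and whether the app needs this escaping is an assumption worth testing on a sample bubble.

```python
import html


def styled(text: str, tag: str) -> str:
    """Wrap text in an HTML tag, escaping &, <, and > inside the text."""
    # quote=False leaves quotation marks alone; only markup characters change.
    return f"<{tag}>{html.escape(text, quote=False)}</{tag}>"


print(styled("bold", "b"))       # <b>bold</b>
print(styled("1 < 2 & 3", "i"))  # <i>1 &lt; 2 &amp; 3</i>
```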
Sure! Just open an issue and we can talk.
Same as above.
Maybe. Open an issue and describe how you would go about it and why it's useful.
Besides Tesseract-OCR, we use the following amazing open-source dependencies:
Package | Use |
---|---|
fontTools | Character size determination |
NumPy | Numerical support |
OpenCV | Inpainting |
Pillow | Image cropping and I/O |
rarfile | RAR file support |
requests | Communication with Google Translate API |