TPPOCR

Tesseract OCR of Pokemon dialog text on streaming video.

This project contains scripts and training data needed for running OCR on live streams such as TwitchPlaysPokemon.

It has two language data files:

  • pkmngb_en: English training data for Game Boy Pokemon games such as Red, Blue, Gold, Silver, and Crystal.
  • pkmngba_en: English training data for Game Boy Advance / DS Pokemon games such as Ruby, Sapphire, Emerald, FireRed, Diamond, Pearl, and Platinum.

You may be interested in PokeTEXT too.

If you just want to use the training data, please skip to the bottom of this document.

Note: This project was designed for Tesseract version 3. For version 4, please see tppocr2 or tppocr3.

Install

Requirements:

To run TPPOCR, it's recommended to use a Linux OS.

On Debian/Ubuntu, run the following to install the stable versions packaged by the distribution:

    apt-get install build-essential tesseract-ocr libtesseract-dev libleptonica-dev cython3 python3 redis-server python3-pip python3-pil

On Debian/Ubuntu, run the following to install the latest Python library versions in your home directory:

    pip3 install tesserocr redis livestreamer --user
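
To confirm that the libraries installed correctly, a small check like the following may help. It is only a sketch: it assumes the import names match the package names given to pip3 (tesserocr, redis, livestreamer), and the helper name missing_modules is made up for this example.

```python
import importlib.util

# Assumed import names for the packages installed with pip3 above.
REQUIRED_MODULES = ["tesserocr", "redis", "livestreamer"]


def missing_modules(names):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules were found.")
```

Run it with the same python3 that will run TPPOCR, since packages installed with --user are specific to that interpreter.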

Download a recent static build of FFmpeg:

    wget http://example.com/PUT_URL_HERE_TO/ffmpeg-release-64bit-static.tar.xz
    tar -xJvf ffmpeg-release-64bit-static.tar.xz

TPPOCR runs Livestreamer and FFmpeg as separate programs. Ensure these executables can be found through the PATH environment variable:

  • ~/.local/bin/livestreamer
  • ffprobe
  • ffmpeg

You can do this by editing your shell profile or by prefixing PATH=$PATH:~/.local/bin:~/bin/ to commands.
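
Before starting TPPOCR, it can be useful to verify that the three programs are actually reachable. The sketch below uses Python's standard shutil.which for that; the function name missing_tools is hypothetical and not part of TPPOCR.

```python
import shutil

# The executables listed above that must be reachable through PATH.
REQUIRED_TOOLS = ["livestreamer", "ffprobe", "ffmpeg"]


def missing_tools(names, path=None):
    """Return the tool names that shutil.which() cannot locate.

    path is an optional search path string; None means the current
    PATH environment variable is used.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("livestreamer, ffprobe and ffmpeg were all found.")
```

If you prefix PATH to the command instead of editing your shell profile, prefix it to this check in the same way so that both see the same search path.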

For Twitch streams, Livestreamer requires a Client-ID or an OAuth token. The OAuth token can be specified in the config file. You can generate one using livestreamer --twitch-oauth-authenticate. (Keep your token secret!)

Ensure Redis is not exposed to the internet by checking its configuration file, /etc/redis/6379.conf (the Debian/Ubuntu package installs it as /etc/redis/redis.conf). By default on Debian/Ubuntu, it uses bind 127.0.0.1.
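
The bind check can also be scripted. The following is a minimal sketch, not a security audit: it only looks at bind directives in the config text and reports any address that is not a loopback address. The function name non_loopback_binds and the default config path are assumptions made for this example; adjust the path to wherever your Redis config lives.

```python
import ipaddress
import sys


def non_loopback_binds(conf_text):
    """Return addresses from 'bind' directives that are not loopback."""
    exposed = []
    for line in conf_text.splitlines():
        parts = line.split()
        # Skip blank lines, comments and every directive other than bind.
        if not parts or parts[0].lower() != "bind":
            continue
        for addr in parts[1:]:
            # Newer Redis configs may prefix an address with "-".
            addr = addr.lstrip("-")
            try:
                if not ipaddress.ip_address(addr).is_loopback:
                    exposed.append(addr)
            except ValueError:
                # Not a literal IP address (for example "*"): report it.
                exposed.append(addr)
    return exposed


if __name__ == "__main__":
    # Assumed default path; pass a different one as the first argument.
    conf_path = sys.argv[1] if len(sys.argv) > 1 else "/etc/redis/6379.conf"
    with open(conf_path) as conf_file:
        exposed = non_loopback_binds(conf_file.read())
    print("Non-loopback bind addresses:", exposed or "none")
```

An empty result only means that no bind line names a public address. If the file has no bind line at all, check the Redis documentation for the default behaviour of your version.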

Finally, grab TPPOCR from git:

    git clone https://URL_TO_GITHUB_HERE/USERNAME/tppocr

Since TPPOCR is meant to run as a bunch of scripts, it does not currently have an install file.

Usage

The basic structure of the command to start the OCR process is:

    python3 -m tppocr config.ini

In addition, the command may need extra environment variables. For example, if tppocr is the current directory:

    PYTHONPATH=./ TESSDATA_PREFIX=./ python3 -m tppocr config.ini

  • PYTHONPATH is the path to the tppocr project directory. It should contain the tppocr package directory.
  • TESSDATA_PREFIX is the directory containing the tessdata directory. tessdata contains the TPPOCR training data files.
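
If you would rather start TPPOCR from a Python wrapper than type the variables each time, the command above can be sketched as below. This is an illustration only: build_env is a hypothetical helper, and it assumes the project directory holds both the tppocr package and the tessdata directory, as in the example command.

```python
import os
import subprocess
import sys


def build_env(project_dir, base_env=None):
    """Return a copy of the environment with the two variables set."""
    env = dict(os.environ if base_env is None else base_env)
    project_dir = os.path.abspath(project_dir)
    # Like the shell example, this replaces any existing PYTHONPATH.
    env["PYTHONPATH"] = project_dir
    # Keep a trailing separator, mirroring the "./" form used above.
    env["TESSDATA_PREFIX"] = os.path.join(project_dir, "")
    return env


if __name__ == "__main__":
    # Assumes the current directory is the tppocr project directory.
    subprocess.run(
        [sys.executable, "-m", "tppocr", "config.ini"],
        env=build_env("."),
        check=True,
    )
```

Using absolute paths means the wrapper keeps working even if the process later changes its working directory.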

See the example configuration files for details on the settings in config.ini.

To run the web interface, install Tornado and run:

    pip3 install tornado --user
    python3 -m tppocr.web

Add --help to see available settings. If you want to expose this to the Internet, run it behind a web server with WebSocket support. The Tornado documentation has suggestions for deployment, and the Nginx documentation describes the configuration needed to enable WebSocket proxying.

To save the OCR output to text files under a log directory, you can use the following:

    python3 -m tppocr.pub.textfile log_dir/

Standalone

If you simply want to use the training data with Tesseract, copy the traineddata file into the Tesseract data directory.

Alternatively, you can leave the file in place and point Tesseract at the project's tessdata directory. For example, to read a cropped image of a timestamp:

    tesseract --tessdata-dir ~/Documents/tppocr/tessdata/ -l pkmngba_en screenshot_cropped.jpg stdout /usr/share/tesseract-ocr/tessdata/configs/digits
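
The same command can be driven from a script. The sketch below simply assembles the arguments shown above and calls the tesseract executable through subprocess; it assumes tesseract is on PATH, and the function names tesseract_command and read_text are invented for this example.

```python
import os
import subprocess

# Path to the digits config used in the example command above.
DIGITS_CONFIG = "/usr/share/tesseract-ocr/tessdata/configs/digits"


def tesseract_command(image_path, tessdata_dir, lang="pkmngba_en", config=None):
    """Build the tesseract argument list for one image."""
    cmd = [
        "tesseract",
        "--tessdata-dir", os.path.expanduser(tessdata_dir),
        "-l", lang,
        image_path,
        "stdout",  # write the recognized text to standard output
    ]
    if config:
        cmd.append(config)
    return cmd


def read_text(image_path, tessdata_dir, lang="pkmngba_en", config=None):
    """Run tesseract and return the recognized text without outer whitespace."""
    result = subprocess.run(
        tesseract_command(image_path, tessdata_dir, lang, config),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(read_text("screenshot_cropped.jpg",
                    "~/Documents/tppocr/tessdata/",
                    config=DIGITS_CONFIG))
```

Pass lang="pkmngb_en" for screenshots from the Game Boy titles, and leave config unset when the image contains dialog text rather than digits.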