classify-handwritten-characters

Classify handwritten Chinese characters

Primary language: PureBasic | License: MIT

Project Status

  • The model can be trained, but reaches only ~20% accuracy so far.
  • Works with the old-style "-c.gnt" files but fails with "-f.gnt" files; the cause is not yet known.
  • Predictions can be made with the Python code.
  • The Android app uses binaries that no longer exist, so it will need to be rewritten.
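When debugging the "-f.gnt" failure, it may help to walk a file record by record and see where the layout stops matching. The following is a minimal sketch, not this project's parser. It assumes the .gnt files follow the commonly described CASIA-HWDB record layout (4-byte little-endian record size, 2-byte GBK label, 2-byte width, 2-byte height, then width × height grayscale bytes); the name read_gnt is hypothetical.

```python
import io
import struct
from typing import Iterator, Tuple

# Assumed record header: uint32 record size, 2 raw label bytes,
# uint16 width, uint16 height (little-endian, 10 bytes in total).
HEADER = struct.Struct("<I2sHH")


def read_gnt(stream: io.BufferedIOBase) -> Iterator[Tuple[str, int, int, bytes]]:
    """Yield (label, width, height, bitmap) for each record in a .gnt stream."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of file
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        record_size, tag, width, height = HEADER.unpack(header)
        expected = HEADER.size + width * height
        if record_size != expected:
            # A mismatch here suggests the file uses a different layout.
            raise ValueError(
                f"record size {record_size} does not match expected {expected}"
            )
        bitmap = stream.read(width * height)
        if len(bitmap) != width * height:
            raise ValueError("truncated bitmap")
        # Single-byte labels are assumed to be padded with a NUL byte.
        label = tag.rstrip(b"\x00").decode("gbk", errors="replace")
        yield label, width, height, bitmap
```

Opening a file with open(path, "rb") and iterating over read_gnt(f) will raise at the first record that does not fit the assumed layout, which gives a byte offset to start investigating from.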

Prerequisites for training the model

Setting up the python environment

  1. brew install openblas -- required for scikit-image to build correctly
  2. brew install hdf5 -- required for installing tensorflow-macos
  3. python3 -m venv env
  4. source env/bin/activate
  5. python3 -m pip install -r requirements.txt
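After the steps above, a quick check can confirm that the key packages are importable from the virtual environment. This is a small sketch using only the standard library; the module names passed in (tensorflow, skimage) are assumptions based on the packages mentioned above, since requirements.txt is not reproduced here, and missing_modules is a hypothetical helper.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: scikit-image is imported as "skimage".
    missing = missing_modules(["tensorflow", "skimage"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("Environment looks complete.")
```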

Predicting with a trained model

  1. Run python predict.py trained_model.tf png_image_sample
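The internals of predict.py are not shown here, but the final step of any such prediction is mapping the model's output scores back to characters. The sketch below illustrates that step in plain Python. It assumes one score per class and a list of characters in the same class order (for example, loaded from characters.index, whose exact format is not documented here); top_predictions is a hypothetical helper, not a function from this project.

```python
from typing import List, Sequence, Tuple


def top_predictions(
    scores: Sequence[float], characters: Sequence[str], k: int = 5
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (character, score) pairs, best first."""
    if len(scores) != len(characters):
        raise ValueError("scores and characters must have the same length")
    # Pair each character with its score, then sort by score, highest first.
    ranked = sorted(zip(characters, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

Showing several candidates rather than only the best one is useful while accuracy is low, because the correct character may still appear among the top few.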

Recreating characters.index

Note -- you shouldn't need to do this unless something has gone horribly wrong.

  1. python create_character_index.py training_dir1 dir2 dir3...
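Conceptually, a character index assigns each distinct label a stable integer class ID, so that the same character maps to the same model output across runs. The sketch below shows one way to do that. It is an illustration only: the actual format of characters.index and the logic in create_character_index.py may differ, and build_index and class_ids are hypothetical names.

```python
from typing import Dict, Iterable, List


def build_index(labels: Iterable[str]) -> List[str]:
    """Return the distinct labels in a deterministic (code point) order."""
    return sorted(set(labels))


def class_ids(index: List[str]) -> Dict[str, int]:
    """Map each character to its position in the index."""
    return {character: i for i, character in enumerate(index)}
```

Sorting makes the result independent of the order in which training directories are scanned, which matters because a trained model is only meaningful together with the index it was trained against.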

Creating the model

  1. Unzip the training and test sets
  2. python create_records.py hwdb.train.tfrecords training_dir1 training_dir2 training_dir3...
  3. python create_records.py hwdb.test.tfrecords test_dir1 test_dir2 test_dir3...
  4. python train_model.py

Note that the model is saved (and overwritten) after every epoch.
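Steps 2 to 4 above can be chained in a small driver script so that a failure in one step stops the run. This is a sketch under assumptions: the script names and argument order are taken from the steps above, the directory names are placeholders, and build_commands and run_pipeline are hypothetical helpers.

```python
import subprocess
import sys
from typing import List, Sequence


def build_commands(
    train_dirs: Sequence[str], test_dirs: Sequence[str]
) -> List[List[str]]:
    """Build the argument lists for record creation and training."""
    python = sys.executable  # use the interpreter from the active virtualenv
    return [
        [python, "create_records.py", "hwdb.train.tfrecords", *train_dirs],
        [python, "create_records.py", "hwdb.test.tfrecords", *test_dirs],
        [python, "train_model.py"],
    ]


def run_pipeline(train_dirs: Sequence[str], test_dirs: Sequence[str]) -> None:
    """Run each step in order; check=True raises if a step exits non-zero."""
    for command in build_commands(train_dirs, test_dirs):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder directory names; replace with the unzipped dataset folders.
    run_pipeline(["training_dir1", "training_dir2"], ["test_dir1"])
```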

Copying the new model into the android app

  1. ./update_android_app_model.sh

TODO

  • Revive and update the model for TensorFlow 2.7 (Dec 2021)
  • Make the new-style "-f.gnt" files work
  • Remove the old git lfs pre-trained models
  • Revive and update Android app
  • Comment and document the Python code better
  • Send drawings to server for better training data
  • Add second model for processing characters that are composed of strokes