Chinese-Character-Recognition

A repo for machine-learning powered image classification of handwritten Chinese characters

Website

Full release date: July 4, 2021

November 10, 2021 Update: Enabled writing the characters on mobile devices.

November 30, 2021 Update: Implemented a visitor map for the website.

Convert Keras Model to TensorFlow.js for Web App

!pip install tensorflowjs

!tensorflowjs_converter --input_format keras "model.h5" "docs/model"

Packages Used

TensorFlow, NumPy, Pandas, Matplotlib, OpenCV (`cv2`), `os`

Data

Found here

For this data, 100 people each wrote every one of the 15 characters 10 times, giving 15,000 images in total.

`suite_id`

`suite_id` identifies the volunteer (100 total).

`sample_id`

`sample_id` identifies the sample from a given volunteer (10 total), since each volunteer wrote each character 10 times.

`code`

`code` identifies each character by its position in the sequence, i.e. the ith character in order has `code` i.

For example, 零 is 1, 一 is 2, 二 is 3, ... 九 is 10.

`value`

`value` is the numerical value of the character, e.g. 5.

`character`

`character` is the actual symbol, e.g. 五.
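
To make the relationship between `code`, `value`, and `character` concrete, here is a minimal sketch that rebuilds the lookup in plain Python. It assumes that `code` keeps counting past 九, so that 十 through 亿 take codes 11 to 15. The helper name `describe_code` is hypothetical and is not part of this repo.

```python
# Characters in dataset order; index i corresponds to code i + 1.
CHARACTERS = ('零', '一', '二', '三', '四', '五', '六', '七', '八', '九',
              '十', '百', '千', '万', '亿')

# Numerical value of each character, in the same order.
VALUES = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
          10, 100, 1_000, 10_000, 100_000_000)


def describe_code(code):
    """Return (character, value) for a 1-based `code` (hypothetical helper)."""
    if not 1 <= code <= len(CHARACTERS):
        raise ValueError(f"code must be between 1 and {len(CHARACTERS)}")
    return CHARACTERS[code - 1], VALUES[code - 1]


print(describe_code(1))  # ('零', 0)
print(describe_code(6))  # ('五', 5)
```

Note that `code` is 1-based, while the model's class indices described later start at 0.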

Model Performance

Optimizer: Adam

Loss function: Sparse Categorical Cross Entropy

Training Data (80%)

Epochs: 15

Loss: 0.0844

Accuracy: 0.9890 (98.90%)

Testing Data (20%)

Loss: 0.8035

Accuracy: 0.7753 (77.53%)
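
For readers unfamiliar with the loss function: sparse categorical cross-entropy for a single sample is the negative log of the probability the model assigns to the true class, where the label is an integer index rather than a one-hot vector. The sketch below computes it by hand in plain Python. The probabilities are made up for illustration and are not taken from this model, and the loss reported by Keras is this quantity averaged over samples.

```python
import math


def sparse_categorical_cross_entropy(probabilities, label):
    """Loss for one sample: -log(p[label]), with `label` an integer class index."""
    return -math.log(probabilities[label])


# Made-up 3-class examples where the true class is index 1.
confident_right = [0.05, 0.90, 0.05]
confident_wrong = [0.90, 0.05, 0.05]

print(sparse_categorical_cross_entropy(confident_right, 1))  # small loss
print(sparse_categorical_cross_entropy(confident_wrong, 1))  # much larger loss
```

Because confidently wrong predictions are penalised heavily, a small number of them can raise the average loss noticeably even when most predictions are correct.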

A Closer Look

Note: The model outputs one probability for each of the 15 classes, indexed from 0 to 14. To convert a class index to a character, you can use this convenient list.

characters = ('零', '一', '二', '三', '四', '五', '六', '七', '八', '九', '十', '百', '千', '万', '亿')

50th Test Image

Prediction data:

In [1]: PredictionData[50]

Out [1]: array([3.6610535e-07, 1.8792988e-28, 2.2422669e-12, 1.7063133e-10,
       2.6500999e-09, 5.2112355e-06, 3.2691182e-07, 8.3664847e-05,
       3.6446830e-13, 9.9990106e-01, 1.1595256e-14, 5.7882625e-07,
       9.6461701e-09, 3.3319463e-07, 8.4753528e-06], dtype=float32)

500th Test Image

Prediction data:

In [2]: PredictionData[500]

Out [2]: array([3.83269071e-04, 4.26350415e-21, 5.74980230e-10, 5.73274974e-07,
      4.58842493e-04, 8.47115181e-03, 6.79080131e-07, 1.44536525e-05,
      8.73805250e-10, 1.13052677e-03, 2.78494355e-07, 9.84949350e-01,
      2.79740023e-04, 4.30813758e-03, 2.93952530e-06], dtype=float32)
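
As a sketch of how the two outputs above can be read, the snippet below picks the index with the highest probability and looks it up in the `characters` list. It uses plain Python lists holding the same numbers as the arrays shown above; the helper name `decode` is hypothetical.

```python
characters = ('零', '一', '二', '三', '四', '五', '六', '七', '八', '九',
              '十', '百', '千', '万', '亿')


def decode(probabilities):
    """Return (character, probability) for the most likely class."""
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return characters[index], float(probabilities[index])


prediction_50 = [3.6610535e-07, 1.8792988e-28, 2.2422669e-12, 1.7063133e-10,
                 2.6500999e-09, 5.2112355e-06, 3.2691182e-07, 8.3664847e-05,
                 3.6446830e-13, 9.9990106e-01, 1.1595256e-14, 5.7882625e-07,
                 9.6461701e-09, 3.3319463e-07, 8.4753528e-06]

prediction_500 = [3.83269071e-04, 4.26350415e-21, 5.74980230e-10, 5.73274974e-07,
                  4.58842493e-04, 8.47115181e-03, 6.79080131e-07, 1.44536525e-05,
                  8.73805250e-10, 1.13052677e-03, 2.78494355e-07, 9.84949350e-01,
                  2.79740023e-04, 4.30813758e-03, 2.93952530e-06]

print(decode(prediction_50)[0])   # 九
print(decode(prediction_500)[0])  # 百
```

With NumPy arrays such as `PredictionData[50]`, `np.argmax` returns the same index.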

To Use

Option A

1) Load the model into your file

In your Python code, add the following to load the model.

import tensorflow as tf

model = tf.keras.models.load_model("model", compile=True)

The model outputs one probability for each of the 15 classes, indexed from 0 to 14. To convert a class index to a character, you can use this convenient list.

characters = ('零', '一', '二', '三', '四', '五', '六', '七', '八', '九', '十', '百', '千', '万', '亿')

You can then use this model for your own purposes. Enjoy!

Option B

1) Download `archive.zip` and `ChinCharRecog.py`

2) Unpack `archive.zip` and ensure that its contents are in the same directory as `ChinCharRecog.py`

The contents should include the `data` folder, `chinese_mnist.csv`, and `chinese_mnist.tfrecords`. For future reference, let's call this directory `folder1`.

3) Inside of `ChinCharRecog.py` change the variable `directory` on line 81

Change `directory` to the `/folder1/data/data/` directory. On macOS, this might look something like `/Users/user_name_here/Desktop/folder1/data/data/`.

4) Run the file `ChinCharRecog.py` and enjoy!
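
For step 3, one way to avoid typing the path by hand is to build it from the location where `archive.zip` was unpacked. This is only a sketch: the `Desktop/folder1` location is a placeholder, the helper name `data_directory` is hypothetical, and it assumes the script expects `directory` to end with a path separator, as in the example path above.

```python
import os
from pathlib import Path


def data_directory(base):
    """Return <base>/data/data/ as a string with a trailing separator."""
    # Joining with an empty final component adds the trailing separator.
    return os.path.join(str(base), "data", "data", "")


# Placeholder location; replace with wherever archive.zip was unpacked.
folder1 = Path.home() / "Desktop" / "folder1"
directory = data_directory(folder1)
print(directory)
```

The resulting string can then be assigned to the `directory` variable in `ChinCharRecog.py`.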
