Japanese Hiragana Recogniser

Handwritten Japanese Hiragana recognition using a deep convolutional neural network. My model follows the framework of VGGNet and it performs over 98% accuracy on the test set.

demo

This project is inspired by this thesis. My model outperforms the score presented in this thesis by around 2-3% in test accuracy thanks to data augmentation and other factors.

Model Configuration

Type	Size	Activation
Convolution	64 x 64	ReLU
Max Pooling	32 x 32	ReLU
Convolution	32 x 32	ReLU
Convolution	32 x 32	ReLU
Max Pooling	16 x 16	ReLU
Convolution	16 x 16	ReLU
Convolution	16 x 16	ReLU
Max Pooling	8 x 8	ReLU
Fully Connected	256	ReLU
Fully Connected	128	ReLU
Fully Connected (output)	70	softmax

Dropout layer is also applied to reduce overfitting.
Used Adam optimizer with default learning rate and beta values.
Applied early stopping for when test validation score doesn't improve for 3 epochs in a row.
Train data are augmented for better generalisation. (applied rotation and zooming)

Feature	Custom Sequential	VGGNet	VGGNet
Test Accuracy	<= 90%	<= 98%	<= 98.88%
Dataset	Kuzushiji MNIST	ELT-8	ELT-8
Data Augmentation	No	No	Yes

Although test accuracy doesn't really differ between model with augmented images and normal images, the performance on predicting user input's character seems to drastically improve. This is partially because the model is more flexible to how the character is written.

Future work

The character を is missing from the dataset.
Further fine-tune the model.
Feature engieering for stroke order (書き順) and number of strokes (画数).

Datasets

Primary

Dataset: ELT-8: ELTDB

Description: Classification of handwritten Japanese character, 72 classes (五十音順).

Training & Testing: 11.5k 128x127 instances.

Secondary

Dataset: Kuzushiji MNIST

Description: Classification of handwritten Japanese character, 49 classes (五十音順).

Training: 232k 28x28 images

Testing: 38k 28x28 images

This dataset did not work well as each instance was only 28 x 28 pixels image and this app takes 400 x 400 pixels image of handwritten Hiragana from the user. Resizing from 400 x 400 to 28 x 28 seems to lose significant amount of information.
Hence, my model performed reasonably well on the dataset (achieving over 90% accuracy on test set) but performance on the app wasn't great.

References

Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).

Tsai, Charlie. "Recognizing handwritten Japanese characters using deep convolutional neural networks." University of Stanford in Stanford, California (2016): 405-410.

森俊二、山本和彦、山田博三、斉藤泰一: “手書教育漢字のデータベースについて”, 「電総研彙報」, Vol.43, Nos.11&12, pp.752–773 (1979-11&12).

"KMNIST Dataset" (created by CODH), adapted from "Kuzushiji Dataset" (created by NIJL and others), doi:10.20676/00000341

FukiBabasaki/hiragana-recogniser