This repository is used to experiment with and train machine learning models for predictions. It also includes methods for processing InkML files into normalized traces and images.
- Python 3.5
- pip (for Python 3.5)
To run the code in this repository, you will need either InkML files or an already generated dataset. A complete folder containing both the xml and the preprocessed data is also available. If you choose this option, just extract all of its files into the /online_recog/data folder.
First clone the repository and run pip install -r requirements.txt.
If you do not have access to the preprocessed dataset, you must first run the preprocessing on InkML files.
- Download the InkML folder
- Create folders such that the folder structure in online_recog is: /online_recog/data/xml/
- Unzip the downloaded InkML files and place them in the xml folder.
- Open /online_recog/keras_lstm.py in a text editor.
- Uncomment the line generate_and_save_dataset()
- Run python keras_lstm.py
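To give a rough idea of what turning InkML into normalized traces can look like, here is a minimal sketch. It is not the repository's implementation: the function names parse_traces and normalize_traces are hypothetical, the input is assumed to use plain absolute "x y" point values separated by commas, and the normalization (shift to the origin, then scale the longer side to 1) is just one reasonable choice.

```python
import xml.etree.ElementTree as ET

# Namespace used by InkML documents.
INKML_NS = "{http://www.w3.org/2003/InkML}"


def parse_traces(inkml_text):
    """Return a list of traces, each a list of (x, y) float tuples.

    Assumption: every point is written as plain numbers ("x y" or "x y t"),
    with points separated by commas. Extra channels are ignored.
    """
    root = ET.fromstring(inkml_text)
    traces = []
    for trace in root.iter(INKML_NS + "trace"):
        points = []
        for point in (trace.text or "").strip().split(","):
            coords = point.split()
            if len(coords) >= 2:
                points.append((float(coords[0]), float(coords[1])))
        traces.append(points)
    return traces


def normalize_traces(traces):
    """Shift all traces to the origin and scale the longer side to 1.

    The same offset and scale are applied to every trace, so the relative
    positions and the aspect ratio of the strokes are preserved.
    """
    xs = [x for trace in traces for x, _ in trace]
    ys = [y for trace in traces for _, y in trace]
    if not xs:
        return traces
    min_x, min_y = min(xs), min(ys)
    # Fall back to 1.0 when all points coincide, to avoid dividing by zero.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [
        [((x - min_x) / scale, (y - min_y) / scale) for x, y in trace]
        for trace in traces
    ]
```

Using one shared scale for both axes keeps a symbol's proportions intact, which usually matters for handwriting; scaling each axis independently would stretch narrow symbols.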
If you have downloaded the preprocessed data, you only need to put it in the correct folder.
- Download the folder data.zip
- Create the folder /online_recog/data.
- Extract the zipped files into the newly created /data folder.
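The folder creation and extraction steps above can also be scripted. The following is a small sketch using only the standard library; the helper name extract_dataset is hypothetical, and the paths in the usage comment assume data.zip sits next to the online_recog folder. If the archive already contains a top-level data folder, the target path would need adjusting.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, data_dir):
    """Create data_dir if needed, extract the archive into it,
    and return the sorted names of the entries now in data_dir."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)
    return sorted(entry.name for entry in data_dir.iterdir())


# Example call (paths are assumptions about where the files live):
# extract_dataset("data.zip", "online_recog/data")
```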
After placing the datasets in the correct folder, the models can be run as follows.
- Uncomment load_dataset_and_run_model() in /online_recog/keras_lstm.py.
- Choose the model you wish to train by changing run_RNN_model() to the method for the model you wish to train.
- Run python keras_lstm.py.
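As an alternative to editing the file each time, the choice of model could be made on the command line. The sketch below is only an illustration of that idea and is not part of the repository: the two functions are stand-ins (run_RNN_model mirrors the name used in keras_lstm.py, run_other_model is hypothetical), and the --model flag is an assumption.

```python
import argparse


def run_RNN_model():
    """Stand-in for the function of the same name in keras_lstm.py."""
    return "RNN"


def run_other_model():
    """Hypothetical second model function, named only for illustration."""
    return "other"


# Map command-line names to the functions that train each model.
MODELS = {"rnn": run_RNN_model, "other": run_other_model}


def main(argv=None):
    """Parse the model name and call the matching training function."""
    parser = argparse.ArgumentParser(description="Train a selected model.")
    parser.add_argument("--model", choices=sorted(MODELS), default="rnn")
    args = parser.parse_args(argv)
    return MODELS[args.model]()
```

With something like this in place, a call such as main(["--model", "other"]) selects the model without any source changes.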
To include real data for validation during training, you will need to either create data from symbol-predictor-server or download the already processed files (the real data is included).
In /online_recog/keras_lstm.py, there is a callback class which can be used to run validation and store logs from the training. This class, as well as a couple of other places in keras_lstm.py, has commented-out lines that can be uncommented if real data is available in /online_recog/data. The relevant lines are shown in the keras_lstm.py file.
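For readers unfamiliar with the pattern, the sketch below shows roughly what a validate-and-log hook can look like. It is not the class from keras_lstm.py: the name RealDataValidationLogger, the evaluate_fn argument, and the CSV layout are all hypothetical. It is deliberately kept free of the Keras import so it runs on its own; to be passed to model.fit it would need to subclass keras.callbacks.Callback, whose on_epoch_end(epoch, logs) hook has the same signature.

```python
import csv
from pathlib import Path


class RealDataValidationLogger:
    """Scores the model on real data after each epoch and logs the result.

    Hypothetical sketch: evaluate_fn is assumed to be a zero-argument
    callable that returns a single float score on the real-data set.
    """

    def __init__(self, evaluate_fn, log_path):
        self.evaluate_fn = evaluate_fn
        self.log_path = Path(log_path)
        # Start a fresh log file with a header row.
        with self.log_path.open("w", newline="") as f:
            csv.writer(f).writerow(["epoch", "train_loss", "real_data_score"])

    def on_epoch_end(self, epoch, logs=None):
        """Append one row per epoch: epoch index, training loss, real-data score."""
        logs = logs or {}
        score = self.evaluate_fn()
        with self.log_path.open("a", newline="") as f:
            csv.writer(f).writerow([epoch, logs.get("loss"), score])
        return score
```

Writing one row per epoch keeps the log usable even if training is interrupted partway through.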