This project aims to show how to use PyTorch to recognize handwritten characters.
- We first clone this repo
$ git clone https://github.com/character_recognizer_pytorch
$ cd character_recognizer_pytorch
- Create a new environment with conda and install requirements
$ conda create --name character_recognizer_pytorch python=3.8
$ conda activate character_recognizer_pytorch
$ pip install -r requirements.txt
We can run an experiment, for now, using two public datasets:
- MNIST: MNIST stands for Mini-NIST, where NIST is the National Institute of Standards and Technology, which compiled a dataset of handwritten digits and letters in the 1980s. MNIST is Mini because it only included digits.
- EMNIST: EMNIST is a repackaging of the original dataset, which also includes letters, but presented in the popularized MNIST format. You can see a publication about it here.
After installing the libraries, use python to run run_experiment.py
$ export PYTHONPATH=. # Defines root as python workdir
$ python training/run_experiment.py --max_epochs=5 --gpus=0 --num_workers=20 --model_class=MLP --data_class=MNIST
Where:
- --max_epochs: Controls the number of complete passes through the training dataset.
- --gpus: ID of GPU used to train the model. 0 stands for
CPU
and 1 forGPU
or -1 touse all GPUs available
. - --num_workers: How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process.
- --model_class: Class for modeling available in
model_core.models
. Contains the Neural Network Architecture. - --data_class: Class for downloading and preparing data available in
model_core.data
. Contains all steps to download and preprocess data. It can beMNIST
orEMNIST
.
P.S.: Project based on labs from Spring 2021 Full Stack Deep Learning Course