This repository contains a full-stack deep learning project focused on Chinese MNIST handwritten digit recognition. The project utilizes the Chinese MNIST dataset, performs data processing, and implements a Convolutional Neural Network (CNN) model for training. The trained model is then integrated into a frontend application and a backend system deployed in Docker.
- requirements/: dependencies
- train.ipynb: notebook for model training
- deliver.ipynb: notebook for model deployment
- data/: raw and processed data
- frontend/: front-end code
- backend/: backend code
- text_recognizer/: model deployment code
- training/: training code
The workflow of the project is broken down into two main parts: training the model and delivering the model.
Training is handled in train.ipynb and consists of the following steps:
- Downloading the Chinese MNIST dataset: The dataset is downloaded from Kaggle into the project for further processing.
- Data processing: The downloaded data is processed and prepared for training.
- Model training: A CNN model is trained on the processed data. Training happens both offline and online; the online run saves the trained model as an artifact in Weights & Biases (wandb).
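As a rough illustration of the data processing step, the sketch below turns image file names into class indices and holds out whole writer suites for validation. It is not taken from train.ipynb. It assumes the Kaggle files are named input_<suite_id>_<sample_id>_<code>.jpg and that codes 1 to 15 follow the character order shown; check both against the dataset's CSV before relying on them. The helper names parse_filename and split_by_suite are hypothetical.

```python
import re

# Assumption: codes 1..15 in the Kaggle CSV map to these characters, in order.
CHARACTERS = "零一二三四五六七八九十百千万亿"

# Assumption: files are named input_<suite_id>_<sample_id>_<code>.jpg
_FILENAME = re.compile(r"input_(\d+)_(\d+)_(\d+)\.jpg$")


def parse_filename(name):
    """Return (suite_id, sample_id, class_index) for one image file name."""
    match = _FILENAME.search(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    suite_id, sample_id, code = (int(group) for group in match.groups())
    if not 1 <= code <= len(CHARACTERS):
        raise ValueError(f"code out of range in {name!r}")
    # Class indices are zero-based, so code 1 becomes index 0.
    return suite_id, sample_id, code - 1


def split_by_suite(names, val_suites):
    """Split file names into (train, val), holding out whole suites (writers)."""
    train, val = [], []
    for name in names:
        suite_id, _, _ = parse_filename(name)
        (val if suite_id in val_suites else train).append(name)
    return train, val
```

Holding out entire suites rather than random images keeps the same writer's handwriting from appearing in both splits, which is one common way to get a more honest validation score.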
The delivery of the model to the front-end and backend systems is covered in deliver.ipynb and involves:
- Checkpoint conversion: The model saved in wandb is converted to TorchScript for better compatibility and efficiency.
- Frontend build: The deep learning model is integrated into the frontend using Gradio. The model can then be accessed locally or from another machine through a user-friendly interface.
- Backend build: The backend runs on AWS Lambda, which serves predictions only when a request arrives. The model and its application are containerized with Docker, which simplifies migration and deployment.
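To make the backend step more concrete, here is a minimal sketch of what a Lambda entry point for this kind of service could look like. It is not the code in backend/. The request shape (a JSON body with a base64-encoded "image" field), the response shape, and the load_model stub are all assumptions made for illustration; the real handler would load the TorchScript model instead of the placeholder.

```python
import base64
import json

# Cached across invocations so only a cold start pays the loading cost.
_model = None


def load_model():
    """Hypothetical stand-in for loading the converted TorchScript model."""
    # Placeholder predictor: always returns the same character.
    return lambda image_bytes: "九"


def handler(event, context):
    """Lambda entry point: decode the image, run the model, return JSON."""
    global _model
    if _model is None:
        _model = load_model()

    try:
        # Assumption: the body is a JSON string such as {"image": "<base64>"}.
        payload = json.loads(event.get("body") or "{}")
        image_bytes = base64.b64decode(payload["image"], validate=True)
    except (KeyError, TypeError, ValueError):
        error = {"error": "expected a JSON body with a base64 'image' field"}
        return {"statusCode": 400, "body": json.dumps(error)}

    prediction = _model(image_bytes)
    body = json.dumps({"pred": prediction}, ensure_ascii=False)
    return {"statusCode": 200, "body": body}
```

Keeping the model in a module-level variable means it is loaded once per container rather than once per request, which fits a service that only does work when a request arrives.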
The project was developed based on this repository.
The Chinese MNIST data used in this project can be found on Kaggle.
Before running the notebooks, please install the required Python packages by running:
pip install -r requirements/prod.txt
To train the model, run the train.ipynb notebook.
For model deployment, follow the steps provided in deliver.ipynb.