This project uses TensorFlow and the TensorFlow Model Optimization Toolkit to train, optimize, and evaluate a deep learning model on the CIFAR-10 dataset. The focus is on implementing and comparing two model optimization techniques, quantization and pruning, for efficient deployment in resource-constrained environments such as edge devices.
To run this project, you need access to Google Colab and a Google Drive account for model storage. The primary dependencies include:
- TensorFlow
- TensorFlow Datasets
- TensorFlow Model Optimization Toolkit
- NumPy
Mounting Google Drive in the Colab notebook is required for saving and loading the models.
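A minimal sketch of the mounting step, assuming the standard Colab mount point `/content/drive`. The folder name `cifar10_models` and the helper `get_model_dir` are hypothetical names chosen for illustration. Outside Colab the helper falls back to the current working directory, so the rest of the notebook can still be run locally.

```python
import os


def get_model_dir(subdir="cifar10_models"):
    """Return a directory for saved models, on Google Drive when in Colab.

    `subdir` is a hypothetical folder name, not one taken from the project.
    """
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive

        drive.mount("/content/drive")
        base = "/content/drive/MyDrive"
    except ImportError:
        # Not running in Colab: fall back to the current working directory.
        base = os.getcwd()

    model_dir = os.path.join(base, subdir)
    os.makedirs(model_dir, exist_ok=True)
    return model_dir


MODEL_DIR = get_model_dir()
print("Models will be stored in:", MODEL_DIR)
```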
The CIFAR-10 dataset, a collection of 60,000 32x32 color images in 10 classes, is loaded and preprocessed. The preprocessing steps involve resizing the images to 160x160 and normalizing pixel values.
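The preprocessing described above could look roughly like the sketch below. The target range of [-1, 1] is an assumption (it matches what `tf.keras.applications.mobilenet_v2.preprocess_input` produces); the project may normalize to a different range. The helper names `preprocess` and `make_dataset` and the batch size are also illustrative.

```python
import tensorflow as tf

IMG_SIZE = 160  # target resolution stated in the description


def preprocess(image, label):
    """Resize one CIFAR-10 image to 160x160 and scale its pixels."""
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # Assumption: scale from [0, 255] to [-1, 1].
    image = image / 127.5 - 1.0
    return image, label


def make_dataset(split, batch_size=32):
    """Build a batched tf.data pipeline for a CIFAR-10 split ("train" or "test")."""
    # Imported here so that preprocess() can be used without TFDS installed.
    import tensorflow_datasets as tfds

    ds = tfds.load("cifar10", split=split, as_supervised=True)
    ds = ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Calling `make_dataset("train")` downloads CIFAR-10 on first use and yields batches of shape `(batch_size, 160, 160, 3)`.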
The MobileNetV2 architecture is used as the base model. Its layers are frozen to leverage transfer learning, and a global average pooling layer followed by a dense layer with softmax activation is added on top. The model is compiled and trained on the preprocessed CIFAR-10 dataset.
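As a sketch of this architecture, the function below builds a frozen MobileNetV2 base with a pooling layer and a 10-way softmax head. The ImageNet weights, the Adam optimizer, and the sparse categorical cross-entropy loss are assumptions; the description above does not state which were used.

```python
import tensorflow as tf


def build_model(weights="imagenet", num_classes=10):
    """Frozen MobileNetV2 base with a small classification head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(160, 160, 3),
        include_top=False,   # drop the original ImageNet classifier
        weights=weights,     # assumption: ImageNet-pretrained weights
    )
    base.trainable = False   # freeze the base for transfer learning

    inputs = tf.keras.Input(shape=(160, 160, 3))
    x = base(inputs, training=False)  # keep batch norm in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)

    # Assumption: optimizer and loss are illustrative choices.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training is then a call such as `model.fit(train_ds, validation_data=test_ds, epochs=...)`, where the dataset names and the number of epochs are placeholders.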
The trained model is converted to the TensorFlow Lite format with quantization applied. Quantization stores the weights at reduced numerical precision, which shrinks the model file and makes it better suited to deployment on devices with limited memory and compute.
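A minimal sketch of the conversion step, assuming post-training dynamic range quantization (the default behavior of `tf.lite.Optimize.DEFAULT` when no representative dataset is supplied). The project may use a different quantization mode; the helper name is hypothetical.

```python
import tensorflow as tf


def convert_to_quantized_tflite(model):
    """Convert a Keras model to a quantized TensorFlow Lite flatbuffer (bytes)."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Assumption: default optimizations, i.e. dynamic range quantization.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()
```

The returned bytes can be written to a `.tflite` file with `open(path, "wb").write(tflite_model)`, for example in the mounted Drive folder.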
The model is also optimized using pruning, a technique that removes weights from the model by setting them to zero. A pruning schedule controls how much sparsity is introduced and when, in order to balance model size against accuracy. Pruning is applied to the model, and the pruned model is then retrained.
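To make the idea concrete, here is a NumPy-only sketch of what a pruning schedule and magnitude pruning do. It is an illustration of the concept, not the project's code: in the toolkit this corresponds to wrapping the model with `tfmot.sparsity.keras.prune_low_magnitude` and a schedule such as `tfmot.sparsity.keras.PolynomialDecay`. The sparsity targets and step counts below are hypothetical values.

```python
import numpy as np


def polynomial_sparsity(step, initial, final, begin_step, end_step, power=3):
    """Target sparsity at a training step, ramping from `initial` to `final`."""
    progress = (step - begin_step) / (end_step - begin_step)
    progress = min(max(progress, 0.0), 1.0)  # clamp outside the schedule
    return final + (initial - final) * (1.0 - progress) ** power


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value is the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask


# Hypothetical schedule: ramp from 0% to 80% sparsity over 1000 steps.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
for step in (0, 500, 1000):
    s = polynomial_sparsity(step, 0.0, 0.8, 0, 1000)
    pruned = magnitude_prune(w, s)
    print(f"step {step}: target sparsity {s:.2f}, zeros {np.sum(pruned == 0)}")
```

A final sparsity that is too high, or a ramp that is too short for the retraining budget, is one way pruning can remove weights the model still needs.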
A TensorFlow Lite interpreter is set up to evaluate the quantized model's performance. The test dataset is processed and used to measure the model's accuracy.
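The evaluation loop could be sketched as follows, assuming the converted model keeps float32 inputs and outputs (as dynamic range quantization does) and that images arrive one at a time as already preprocessed NumPy arrays. The function name and its arguments are illustrative.

```python
import numpy as np
import tensorflow as tf


def evaluate_tflite(tflite_model, images, labels):
    """Return top-1 accuracy of a TFLite model on preprocessed images."""
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]

    correct = 0
    for image, label in zip(images, labels):
        # The interpreter expects a batch dimension of size 1.
        batch = np.expand_dims(image, axis=0).astype(np.float32)
        interpreter.set_tensor(input_index, batch)
        interpreter.invoke()
        probs = interpreter.get_tensor(output_index)[0]
        correct += int(np.argmax(probs) == label)
    return correct / len(labels)
```

Running one image per `invoke()` call is simple but slow; it mirrors how a single-input edge deployment would use the model.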
After optimizing and evaluating the models, the following results were obtained:
- Original Model Accuracy: 81.47%
- Quantized Model Accuracy: 79.44%
- Pruned Model Accuracy: 53.81%
These results illustrate the trade-offs between model complexity and performance:
- The original model shows the highest accuracy, which is expected as it retains all its parameters and complexity.
- The quantized model loses about 2 percentage points of accuracy (81.47% to 79.44%). This is a favorable outcome given the smaller model size and faster inference that quantization typically provides, and it makes the model suitable for edge devices.
- The pruned model loses almost 28 percentage points of accuracy (81.47% to 53.81%). This suggests that the pruning was too aggressive and removed weights the model needs to make accurate predictions. The pruning process needs further tuning to balance model size reduction against accuracy retention.
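The size of each accuracy drop can be checked directly from the reported figures. The short sketch below only restates the numbers listed above; the variable names are illustrative.

```python
# Accuracies reported above, in percent.
accuracy = {
    "Original": 81.47,
    "Quantized": 79.44,
    "Pruned": 53.81,
}

baseline = accuracy["Original"]

# Absolute drop in percentage points relative to the original model.
drops = {
    name: baseline - value
    for name, value in accuracy.items()
    if name != "Original"
}

for name, drop in drops.items():
    print(f"{name}: {drop:.2f} percentage points below the original model")
```

For these figures the drops are 2.03 percentage points for the quantized model and 27.66 percentage points for the pruned model.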
To execute the project, follow these steps:
- Mount Google Drive in Google Colab.
- Run the cells in sequence, starting from data loading and preprocessing, followed by model training, optimization, and evaluation.
- Observe the output at each stage, especially the accuracy metrics for each model.
This project shows how quantization and pruning can be used to prepare a model for edge deployment, and how much accuracy each technique costs in this setup. Future work could explore combining pruning and quantization, experimenting with different architectures, and deploying the optimized models on actual edge devices.