This repository provides the code to deploy Post Quantized mobilnetv2 model on Google Kubernetes Engine.
- flask
- Dockerfile [Dockerfile the dependencies to run gunicorn+flask based application server for predicting the POST request image]
- deploy.py [Python code for running guincorn]
- FlaskApp.py [Python code for preprocessing the input image and predicting it with post quantized mobilenetv2 model]
- MobileNetV2Quantized.pth [Is the torchscript based serialized post quantized mobilenetv2 model]
- deployment.yaml [YAML file to create pods using replicaset]
- service.yaml [YAML file to create a service (type loadbalancer) for external word to reach this application service]
gcloud container clusters create cluster-1 --zone us-central1-a
gcloud container clusters get-credentials cluster-1 --zone us-central1-a
kubectl get nodes
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
Note
- Selector for service should be the label app given for pods
- name in deployment, service are basically to identify it.