k8s

This repository provides the code to deploy Post Quantized mobilnetv2 model on Google Kubernetes Engine.

Code Architecture

flask
1. Dockerfile [Dockerfile the dependencies to run gunicorn+flask based application server for predicting the POST request image]
2. deploy.py [Python code for running guincorn]
3. FlaskApp.py [Python code for preprocessing the input image and predicting it with post quantized mobilenetv2 model]
4. MobileNetV2Quantized.pth [Is the torchscript based serialized post quantized mobilenetv2 model]
deployment.yaml [YAML file to create pods using replicaset]
service.yaml [YAML file to create a service (type loadbalancer) for external word to reach this application service]

gcloud container clusters create cluster-1 --zone us-central1-a

gcloud container clusters get-credentials cluster-1 --zone us-central1-a

kubectl get nodes

kubectl apply -f deployment.yaml

kubectl apply -f service.yaml

Note

Selector for service should be the label app given for pods
name in deployment, service are basically to identify it.

bigvisionai/k8s-deployment

k8s

Code Architecture