This repository is designed to provide a comprehensive ML infrastructure for CTR (Click-Through Rate) prediction.
With a focus on AWS services, this repository offers a practical learning experience for MLOps.
We guide you through setting up a Python development environment that ensures code quality and maintainability.
This environment is carefully configured to enable efficient development practices and facilitate collaboration.
This repository includes an implementation of a training pipeline.
The pipeline covers three stages: data preprocessing, model training, and evaluation.
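The three stages can be illustrated with a minimal, dependency-free sketch. It is not the repository's implementation: the hashing-trick preprocessing, the logistic-regression model trained with plain SGD, the log-loss metric, and every name here (`preprocess`, `train`, `evaluate`, `run_pipeline`, `N_BUCKETS`) are assumptions chosen only to show how the stages could fit together.

```python
import math
import random
import zlib
from typing import Dict, List, Tuple

N_BUCKETS = 2 ** 10  # hypothetical size of the hashed feature space

Row = Dict[str, str]
Example = Tuple[List[int], int]


def preprocess(row: Row) -> List[int]:
    """Stage 1: hash each 'field=value' pair into a bucket index."""
    return sorted({zlib.crc32(f"{k}={v}".encode()) % N_BUCKETS for k, v in row.items()})


def predict_proba(weights: List[float], features: List[int]) -> float:
    """Click probability from a logistic model over the active buckets."""
    z = sum(weights[i] for i in features)
    z = max(-35.0, min(35.0, z))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-z))


def train(data: List[Example], epochs: int = 20, lr: float = 0.1, seed: int = 0) -> List[float]:
    """Stage 2: fit the weights with stochastic gradient descent on log loss."""
    rng = random.Random(seed)
    weights = [0.0] * N_BUCKETS
    data = list(data)  # copy so shuffling does not mutate the caller's list
    for _ in range(epochs):
        rng.shuffle(data)
        for features, label in data:
            error = predict_proba(weights, features) - label
            for i in features:
                weights[i] -= lr * error
    return weights


def evaluate(weights: List[float], data: List[Example]) -> float:
    """Stage 3: mean log loss (lower is better)."""
    eps = 1e-15
    total = 0.0
    for features, label in data:
        p = min(max(predict_proba(weights, features), eps), 1 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(data)


def run_pipeline(rows: List[Row], labels: List[int]) -> Tuple[List[float], float]:
    """Chain preprocessing, training, and evaluation."""
    examples = [(preprocess(r), y) for r, y in zip(rows, labels)]
    weights = train(examples)
    return weights, evaluate(weights, examples)
```

For brevity the sketch evaluates on the training examples; a real pipeline would normally hold out a separate evaluation split.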
This repository provides an implementation of a prediction server that serves predictions based on your trained CTR prediction model.
To showcase industry-standard practices, this repository guides you in deploying the training pipeline and inference server on AWS.
AWS infrastructure architecture created by this repository.
Software | Install (Mac) |
---|---|
pyenv | brew install pyenv |
Poetry | curl -sSL https://install.python-poetry.org \| python3 - |
direnv | brew install direnv |
Terraform | brew install terraform |
Docker | install via dmg |
awscli | curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg" && sudo installer -pkg AWSCLIV2.pkg -target / |
Use pyenv to install the Python 3.9.0 environment
$ pyenv install 3.9.0
$ pyenv local 3.9.0
Use poetry to install library dependencies
$ poetry install
Use direnv to configure environment variables
$ cp .env.example .env
$ direnv allow .
Set your environment variables in .env
AWS_REGION=
AWS_ACCOUNT_ID=
AWS_PROFILE=
AWS_BUCKET=
AWS_ALB_DNS=
USER_NAME=
VERSION=2023-05-11
MODEL=sgd_classifier_ctr_model
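Because several of these values ship empty in the template, a small check before running Terraform or the Makefile targets can catch a forgotten entry early. The following is a minimal sketch, not part of the repository; the helper name `missing_vars` is hypothetical, and the variable list simply mirrors the keys shown above.

```python
import os
from typing import List, Mapping

# Keys taken from the .env template above.
REQUIRED_VARS = [
    "AWS_REGION",
    "AWS_ACCOUNT_ID",
    "AWS_PROFILE",
    "AWS_BUCKET",
    "AWS_ALB_DNS",
    "USER_NAME",
    "VERSION",
    "MODEL",
]


def missing_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

After `direnv allow .` has loaded the file, calling `missing_vars(os.environ)` should return an empty list; any names it returns still need a value in `.env`.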
Move to the infra directory
$ cd infra
Use Terraform to create the AWS resources.
Initialize and apply Terraform
$ terraform init
$ terraform apply
Unzip the training data
$ unzip train_data.zip
Upload the training data to S3
$ aws s3 mv train_data s3://$AWS_BUCKET
Tool | Usage |
---|---|
isort | sort and check import statements |
black | format code style |
flake8 | code quality check |
mypy | static type checking |
pysen | manage the static analysis tools |
Build ML Pipeline
$ make build-ml
Run ML Pipeline
$ make run-ml
Build Predict API
$ make build-predictor
Run Predict API locally
$ make up
Shut down Predict API locally
$ make down
Run formatter
$ make format
Run linter
$ make lint
Run pytest
$ make test