aws-automlops-serverless-deployment

Serverless framework deployment for various lambda functions to achieve MLOps level 2 within AWS.

AWS AutoMLOps Serverless Deployment


This repository contains the Serverless Framework deployment YAML for the Lambda functions used to achieve MLOps Level 2 within AWS. Not all resources are created by this deployment; some are provisioned via Terraform in the terraform-aws-machine-learning-pipeline repository.

The deployed Lambda functions automate the MLOps workflow: preparing data and transforming features, training and tuning, deploying models, and running inference.

Architecture

(Architecture diagram: proposed AutoMLOps Level 2)

  1. The user receives new data.
  2. The data is uploaded to a GitHub repository.
  3. A GitHub Action is triggered, uploading the data to an S3 bucket.
  4. An event trigger on the bucket runs the data-preprocessing Lambda function.
  5. Upon completion, the transformed data is uploaded to another bucket.
  6. A Lambda function triggers the SageMaker training job with various hyperparameters.
  7. The training job starts, using the data split for training and validation.
  8. The completed model is uploaded to an S3 bucket.
  9. A Lambda function deploys the new model for inference using a serverless endpoint.
  10. A message containing the endpoint name and test data location is sent to a queue.
  11. A Lambda function invokes the serverless endpoint with the test data.
  12. The prediction results are stored in an S3 bucket for the data scientist to examine.
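As an illustration of steps 4–7, the preprocessing-to-training handoff can be sketched as a Lambda handler that reads the S3 event and starts a SageMaker training job via boto3. This is a sketch, not the repository's actual code: the bucket layout, role ARN, image URI, instance type, and hyperparameters are all placeholder assumptions, and the request-building is factored out so it can be inspected without calling AWS.

```python
import os
import time


def build_training_job_request(bucket: str, key: str) -> dict:
    """Build a SageMaker CreateTrainingJob request from the S3 object
    written by the preprocessing step. All ARNs/URIs are placeholders."""
    job_name = f"automlops-train-{int(time.time())}"
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            # Hypothetical training image stored in ECR.
            "TrainingImage": os.environ.get(
                "TRAINING_IMAGE",
                "123456789012.dkr.ecr.eu-west-1.amazonaws.com/train:latest",
            ),
            "TrainingInputMode": "File",
        },
        "RoleArn": os.environ.get(
            "SAGEMAKER_ROLE_ARN", "arn:aws:iam::123456789012:role/sagemaker"
        ),
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": f"s3://{bucket}/{key}",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": f"s3://{bucket}/models/"},
        "ResourceConfig": {
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        "HyperParameters": {"max_depth": "5", "eta": "0.2"},
    }


def handler(event, context):
    """Triggered by the transformed-data bucket; starts the training job."""
    import boto3  # deferred so the module can be tested without AWS deps

    record = event["Records"][0]["s3"]
    request = build_training_job_request(
        record["bucket"]["name"], record["object"]["key"]
    )
    boto3.client("sagemaker").create_training_job(**request)
    return {"started": request["TrainingJobName"]}
```

Keeping the request construction separate from the boto3 call makes the handler easy to unit-test without mocking AWS.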
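Similarly, steps 10–12 can be sketched as an SQS-driven Lambda that invokes the serverless endpoint with the test data and writes the predictions back to S3. The message shape (`endpoint_name`, `bucket`, `test_data_key`) and the output prefix are assumptions for illustration, not taken from the actual repositories:

```python
import json


def parse_inference_message(body: str) -> tuple[str, str, str]:
    """Extract the endpoint name and test-data location from an SQS
    message body. The JSON field names here are hypothetical."""
    msg = json.loads(body)
    return msg["endpoint_name"], msg["bucket"], msg["test_data_key"]


def handler(event, context):
    """Triggered by the SQS queue; runs inference and stores the results."""
    import boto3  # deferred so the module can be tested without AWS deps

    s3 = boto3.client("s3")
    runtime = boto3.client("sagemaker-runtime")

    for record in event["Records"]:
        endpoint, bucket, key = parse_inference_message(record["body"])
        payload = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        response = runtime.invoke_endpoint(
            EndpointName=endpoint,
            ContentType="text/csv",
            Body=payload,
        )
        # Store predictions next to the test data for the data scientist.
        s3.put_object(
            Bucket=bucket,
            Key=f"predictions/{key}",
            Body=response["Body"].read(),
        )
```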

Lambda repositories

The source code for all Lambda functions is stored in GitHub:

GitHub Action (CI/CD)

The GitHub Action deploys the Lambda functions using the serverless action. The Docker images used for deployment are stored in an AWS ECR repository.

Note

This repository serves as an example; the architecture provided may not apply to all use cases due to the limitations of Lambda. Consider other AWS services, for instance Elastic Container Service (ECS), for better performance and longer-running tasks. Because the MLOps workflow has been split into separate components, it should be easy to identify areas that would benefit from being moved to a different service with better compute.