KTH/devops-course

AIOps / MLOps / Infrastructure and software engineering for ML

monperrus opened this issue · 35 comments

https://github.com/machine-learning-apps/actions-ml-cicd
A Collection of GitHub Actions That Facilitate MLOps

Machine learning operations with GitHub Actions and Kubernetes - GitHub Universe 2019
https://www.youtube.com/watch?v=Ll50l3fsoYs

TinyMLOps: Operational Challenges for Widespread Edge AI Adoption https://arxiv.org/abs/2203.10923

Apache Beam: an open-source, unified programming model for defining and executing data processing pipelines, including ETL, batch, and stream processing.
https://beam.apache.org/
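
To make the "unified model" idea concrete, here is a minimal sketch in plain Python. It deliberately does not use the Apache Beam API: `parse`, `keep_large`, `run_pipeline`, and the sample records are all hypothetical names invented for illustration. It only shows the underlying idea that one pipeline definition can be applied unchanged to a bounded (batch) source and to an incrementally produced (stream-like) source.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# NOTE: plain-Python illustration only; none of these names come from Apache Beam.

def parse(lines: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Turn 'user, amount' text records into (user, amount) pairs."""
    for line in lines:
        user, amount = line.split(",")
        yield user.strip(), float(amount)

def keep_large(records: Iterable[Tuple[str, float]],
               threshold: float = 10.0) -> Iterator[Tuple[str, float]]:
    """Keep only records whose amount is at least `threshold`."""
    for user, amount in records:
        if amount >= threshold:
            yield user, amount

def run_pipeline(source: Iterable, transforms: List[Callable]) -> Iterable:
    """Chain the transforms lazily over any iterable source."""
    data = source
    for transform in transforms:
        data = transform(data)
    return data

# The pipeline is defined once...
PIPELINE = [parse, keep_large]

# ...and applied to a bounded, in-memory batch:
batch = ["alice, 12.5", "bob, 3.0", "carol, 40"]
batch_result = list(run_pipeline(batch, PIPELINE))

# ...or to a generator that yields records one at a time (a stand-in for a stream):
def event_stream() -> Iterator[str]:
    yield "dave, 9.99"
    yield "erin, 10"

stream_result = list(run_pipeline(event_stream(), PIPELINE))
```

A real Beam pipeline adds what this sketch leaves out, such as distributed execution on a runner and windowing for unbounded data; see the Beam documentation for the actual API.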

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.
https://www.kubeflow.org/

TensorBoard: A suite of visualization tools to understand, debug, and optimize TensorFlow programs for ML experimentation
https://www.tensorflow.org/tensorboard

"In the coming decade, all software development will be assisted by AI. Either the code is going to be generated with the help of AI, or it is going to be reviewed by AI, tested by AI, or even deployed by AI."
https://www.tabnine.com/blog/from-ci-to-ai-the-ai-layer-in-your-organization/
https://youtu.be/6YQX0LGaNy8

Quality Assurance in MLOps Setting: An Industrial Perspective.
http://arxiv.org/abs/2211.12706

Edge Impulse: An MLOps Platform for Tiny Machine Learning
http://arxiv.org/abs/2212.03332

A Data Source Dependency Analysis Framework for Large Scale Data Science Projects.
http://arxiv.org/abs/2212.07951

The Pipeline for the Continuous Development of Artificial Intelligence Models -- Current State of Research and Practice.
http://arxiv.org/abs/2301.09001

Open Source Feature Store for Production ML
https://feast.dev/

seldon-core: An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
https://github.com/SeldonIO/seldon-core

MLOps in Google Cloud with Vertex AI: Orchestrate machine learning (ML) workflows using Vertex AI Pipelines.
https://cloud.google.com/vertex-ai/docs/pipelines

LLMOps: Research and technology for building AI products with foundation models.
General technology for enabling AI capabilities with (M)LLMs: MiniLLM (LLM Distillation), LLM Accelerator, Structured Prompting, Extensible Prompts, and Promptist.
Effective and efficient approaches to deploying large AI models in practice: MiniLM(-2), xTune, EdgeFormer, and Aggressive Decoding.
https://thegenerality.com/agi/about.html

KServe: Standardized serverless ML inference platform on Kubernetes
https://github.com/kserve/kserve

Neptune: Track, compare, and share your models in one place
https://neptune.ai/

DVC: ML Experiments Management with Git

Amazon SageMaker: Build, train, and deploy machine learning (ML) models with Amazon infrastructure, tools, and workflows.
https://aws.amazon.com/sagemaker/

Runhouse: Iterate and deploy AI workloads on your own infra. Unobtrusive, debuggable, PyTorch-like APIs
https://github.com/run-house/runhouse/

Langfuse - LLM engineering platform for model tracing, prompt management, and application evaluation. Langfuse helps teams collaboratively debug, analyze, and iterate on their LLM applications such as chatbots or AI agents.
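
For readers new to the term, "tracing" an LLM application roughly means recording the input, output, errors, and latency of each model call so they can be inspected later. The sketch below illustrates that idea with the standard library only. It is not the Langfuse SDK: `traced`, `TRACES`, and `fake_llm` are hypothetical names, and the in-memory list stands in for whatever backend a real platform would use.

```python
import functools
import time

# Hypothetical in-memory trace store; a real platform would ship records to a backend.
TRACES = []

def traced(name):
    """Decorator that records input, output or error, and latency of each call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            record = {"name": name, "input": {"args": args, "kwargs": kwargs}}
            try:
                result = fn(*args, **kwargs)
                record["output"] = result
                return result
            except Exception as exc:
                # Keep the failure in the trace, then let it propagate.
                record["error"] = repr(exc)
                raise
            finally:
                record["latency_s"] = time.perf_counter() - start
                TRACES.append(record)
        return wrapper
    return decorator

@traced("fake-llm-call")
def fake_llm(prompt):
    """Stand-in for a model call; returns a canned reply."""
    return f"echo: {prompt}"

reply = fake_llm("hello")
```

After the call, `TRACES` holds one record with the prompt, the reply, and the measured latency, which is the kind of data a tracing UI would display per request.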