🎯 The GitOps Platform for Data Analytics uses Kubernetes (K8s) and HashiCorp Terraform Infrastructure as Code (IaC) on the AWS Cloud 🌥️, and is designed for speed, scalability, agility, and cost efficiency. ⚡
The diagram below shows the open-source data tools, Kubernetes operators, and frameworks that DoK8s supports. It also shows how Data Analytics managed services integrate with the DoK8s open-source tools, which are reusable, composable, and configurable.
The Data on K8s (DoK8s) solution is organized into the following focus areas:
- 🎯 Data Analytics on K8s
- 🎯 AI/ML on K8s
- 🎯 Streaming Platforms on K8s
- 🎯 Scheduler Workflow Platforms on K8s
- 🎯 Distributed Databases & Query Engine on K8s
- 🚀 Reproducible Local Development with Dev Containers: VS Code, K8s, Terraform (TF), Python/R
- data-engineering-python: Docker + VS Code + Python = ❤️
- 🚀 JupyterHub on EKS 👈 This blueprint deploys a self-managed JupyterHub on EKS with Amazon Cognito authentication.
- 🚀 Spark Operator with Apache YuniKorn on EKS 👈 This blueprint deploys an EKS cluster and uses Spark Operator and Apache YuniKorn to run self-managed Spark jobs.
- 🚀 Self-managed Airflow on EKS 👈 This blueprint sets up a self-managed Apache Airflow on an Amazon EKS cluster, following best practices.
- 🚀 Argo Workflows on EKS 👈 This blueprint sets up self-managed Argo Workflows on an Amazon EKS cluster, following best practices.
- 🚀 Kafka on EKS 👈 This blueprint deploys self-managed Kafka on EKS using the Strimzi Kafka operator.
Built with ❤️ at AWS 🌥️ K8s 🌟 Terraform 🚀.