Pinned Repositories
amazon-kinesis-analytics-taxi-consumer
Sample Apache Flink application that can be deployed to Kinesis Analytics for Java. It reads taxi events from a Kinesis data stream, processes and aggregates them, and writes the results to an Amazon Elasticsearch Service cluster for visualization with Kibana.
apache-kyuubi-for-emr-on-eks
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
arc
Arc is an opinionated framework for defining data pipelines which are predictable, repeatable and manageable.
aws-emr-best-practices
A best practices guide for using Amazon EMR. The guide covers cost, performance, security, operational excellence, reliability, and application-specific best practices across Spark, Hive, Hudi, HBase and more.
aws-emr-containers-best-practices
Best practices and recommendations for getting started with Amazon EMR on EKS.
aws-emr-utilities
aws-service-catalog-reference-architectures
Sample CloudFormation templates and architecture for AWS Service Catalog
emr-stream-demo
karpenter-emr-on-eks
sql-based-etl
melodyyangaws's Repositories
melodyyangaws/karpenter-emr-on-eks
melodyyangaws/emr-stream-demo
melodyyangaws/sql-based-etl
melodyyangaws/apache-kyuubi-for-emr-on-eks
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
melodyyangaws/arc
Arc is an opinionated framework for defining data pipelines which are predictable, repeatable and manageable.
melodyyangaws/aws-emr-best-practices
A best practices guide for using Amazon EMR. The guide covers cost, performance, security, operational excellence, reliability, and application-specific best practices across Spark, Hive, Hudi, HBase and more.
melodyyangaws/aws-emr-containers-best-practices
Best practices and recommendations for getting started with Amazon EMR on EKS.
melodyyangaws/aws-emr-utilities
melodyyangaws/data-engineering-for-aws-immersion-day
Lab instructions for the Data Engineering Immersion Day
melodyyangaws/deploy
Example deploy scripts to run Arc jobs on ephemeral cloud compute.
melodyyangaws/eks-spark-benchmark
Performance optimization for Spark running on Kubernetes
melodyyangaws/eks-workshop
AWS Workshop for Learning EKS
melodyyangaws/eksctl
The official CLI for Amazon EKS
melodyyangaws/emr-on-eks-benchmark
melodyyangaws/emr-on-eks-cost-tracking-solution
melodyyangaws/emr-on-eks-hdfs
melodyyangaws/emr-remote-shuffle-service
melodyyangaws/emr-serverless-samples
Example code for running Spark and Hive jobs on EMR Serverless.
melodyyangaws/emr-spark-benchmark
melodyyangaws/flink
Apache Flink
melodyyangaws/guidance-for-sql-based-etl-solution
A solution that provides declarative data processing and automated workflow orchestration, helping business users (such as analysts and data scientists) access their data and create meaningful insights without manual IT processes.
melodyyangaws/helm-chart-ranger-ldap
Apache Ranger and LDAP
melodyyangaws/hive-emr-on-eks
melodyyangaws/hive-metastore-chart
melodyyangaws/jam-challenge-spark-on-eks
melodyyangaws/kinesis-sql
Kinesis Connector for Structured Streaming
melodyyangaws/kubernetes-HDFS
Repository holding configuration files for running an HDFS cluster in Kubernetes
melodyyangaws/RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
melodyyangaws/spark-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
melodyyangaws/spark-sql-perf