Shreyansh228's Stars
databricks/scala-style-guide
Databricks Scala Coding Style Guide
holdenk/spark-testing-base
Base classes to use when writing tests with Spark
goto/compass
Metadata storage service
moyano83/High-Performance-Spark
danielbeach/data-engineering-practice
Data Engineering Practice Problems
caraml-dev/merlin
Kubernetes-friendly ML model management, deployment, and serving.
brianfrankcooper/YCSB
Yahoo! Cloud Serving Benchmark
ibis-project/ibis
The portable Python dataframe library
DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
inovatrend/mtc-demo
Demo for multithreaded usage of KafkaConsumer
felipegutierrez/explore-flink
This project uses Apache Flink as a stream engine that consumes data from the file system or Kafka brokers and exposes metrics using Prometheus and Grafana, with everything deployed on Kubernetes (minikube).
NashTech-Labs/spark-assignment-2
Solution of Apache Spark assignment 2 given during KIP 2018 sessions.
uber-common/jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
VolodymyrGavrysh/My_RoadMap_Data_Science
A personal roadmap for studying data science, machine learning, and AI (Python)
dttung2905/flink-at-scale
📚 Tech blogs & talks by companies that run Apache Flink in production
ververica/flink-training
Apache Flink Training Exercises
ververica/lab-flink-latency
Lab for testing different Flink job latency optimization techniques covered in a Flink Forward 2021 talk
ByteByteGoHq/system-design-101
Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.
vinclv/data-engineering-minds-kafka
This repository contains the components that I use for my YouTube Kafka videos
raystack/dagger
Dagger is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.