Babadook007's Stars
guipsamora/pandas_exercises
Practice your pandas skills!
codebasics/py
Repository of sample Python programs for learning Python
Jeevan-kumar-Raj/Grokking-System-Design
Systems design is the process of defining the architecture, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems theory to product development.
ajcr/100-pandas-puzzles
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)
AlexIoannides/pyspark-example-project
Implementing best practices for PySpark ETL jobs and applications.
holdenk/spark-testing-base
Base classes to use when writing tests with Spark
fbaptiste/python-deepdive
Python Deep Dive Course - Accompanying Materials
Puyodead1/udemy-downloader
A Udemy downloader that can download courses, with DRM support.
shawlu95/Beyond-LeetCode-SQL
Analysis of SQL Leetcode and classic interview questions, common pitfalls, anti-patterns and handy tricks. Sample databases.
ankurchavda/SparkLearning
A comprehensive Spark guide collated from multiple sources, usable for learning more about Spark or as an interview refresher.
YotpoLtd/metorikku
A simplified, lightweight ETL Framework based on Apache Spark
databricks/delta-live-tables-notebooks
fd4s/fs2-kafka
Functional Kafka Streams for Scala
zzhutianyu/educative.io_courses
Downloads of all educative.io courses from the free student subscription in the GitHub Student Pack, as PDFs
microsoft/nutter
Testing framework for Databricks notebooks
vaquarkhan/Apache-Kafka-poc-and-notes
SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
LearningJournal/Kafka-Streams-Real-time-Stream-Processing
This is the central repository for all materials related to the book Kafka Streams: Real-time Stream Processing! by Prashant Pandey.
databricks/spark-integration-tests
Integration tests for Spark
vim89/datapipelines-essentials-python
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
mayur2810/sope
Apache Spark ETL Utilities
Azure/Hadoop-Migrations
Hadoop Migrations to Azure
JoseRFJuniorLLMs/PySpark-ETL
PySpark-ETL
vanigupta20024/Programming-Challenges
Puyodead1/udemy-dl-drm
A cross-platform Python-based utility to download courses from Udemy for personal offline use.
superaghu/LeetCodeLocally
Puyodead1/udemy-dl-go
A WIP Udemy downloader written in Go
jamesbak/recursive_acl
Recursive ACL assignment for Azure Data Lake Storage Gen2
Erickramirez/Sparkify-Data-Lake-with-Apache-Spark
This project produces a data lake solution. It builds an ETL pipeline that extracts data from S3, processes it using Spark, and loads it back into S3 as a set of dimensional tables. This allows the analytics team to continue finding insights into what songs their users are listening to.
hitachisolutionsamerica/empower-hiring