skr-learn's Stars
adidas/lakehouse-engine
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
airscholar/modern-data-eng-dbt-databricks-azure
In this project, we set up an end-to-end data engineering pipeline using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as our cloud provider.
nordquant/complete-dbt-bootcamp-zero-to-hero
Supplementary Materials for The Complete dbt (Data Build Tool) Bootcamp Udemy course
abdkumar/spotify-stream-analytics
Generates a synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consumes and processes Kafka data, saving it to the data lake. Airflow orchestrates the pipeline. dbt moves data to Snowflake, transforms it, and creates dashboards.
Pavanpawar2705/Ingesting-Real-time-Logistics-Data-in-MongoDB-with-Kafka-and-Python
In this project, I am crafting an innovative solution that involves a Kafka producer and consumer, employing data serialization/deserialization with Avro, and orchestrating the smooth ingestion of data into MongoDB. 🌐⚙️ The goal? Empowering data-driven decisions and ensuring real-time insights into logistics operations. 🌐📈
Lal4Tech/Data-Engineering-With-AWS
Resources and projects from the Udacity Data Engineering with AWS Nanodegree programme
gajerabhavik915/DSA
JagadeeshwaranM/Data_Engineering_Simplified
Satvik26/spotahome_ETL
ABZ-Aaron/Reddit-API-Pipeline
DataTalksClub/data-engineering-zoomcamp
Free Data Engineering course!
TheAlgorithms/Python
All Algorithms implemented in Python
Rishav273/kafkaPysparkAnalytics
Real-time ETL pipeline for financial data (Kafka, PySpark).
AkashSingh3031/The-Complete-FAANG-Preparation
This repository contains all the DSA (Data-Structures, Algorithms, 450 DSA by Love Babbar Bhaiya, FAANG Questions), Technical Subjects (OS + DBMS + SQL + CN + OOPs) Theory+Questions, FAANG Interview questions, and Miscellaneous Stuff (Programming MCQs, Puzzles, Aptitude, Reasoning). The Programming languages used for demonstration are C++, Python, and Java.
cM2908/leetcode-spark
Contains Spark DataFrame solutions to LeetCode questions
siddheshkankal/MySql
cM2908/leetcode-sql
LeetCode SQL Solutions
siddheshkankal/Hive_Challenge
siddheshkankal/Hive_Hbase_connection
siddheshkankal/Confluent_Kafka
desaikun1996/New-York-City-Arrests-Data-Modelling-Analysis-and-Visualization
Analysis of the New York State Police Department Arrests dataset. Created a dimensional model for the provided dataset. Using Alteryx and Talend, built ETL pipelines to process and clean the data and to create dimensions and facts in the destination database. Further, visualized the necessary details of the database using Tableau and Power BI.
NagarajuNakka/Hive-Class
NagarajuNakka/ineuron_kafka_assignment
NagarajuNakka/hive_assignment
NagarajuNakka/Python-Assignment
Ajay026/SQL-Project-for-Data-Analysis-part-1-7
Complete SQL Project for data analysis with source code.
bigdatabysumitm/NotesOfYouTubeSQLSeries
martandsingh/ApacheSpark
This repository will help you learn Databricks concepts through examples. It will include all the important topics we need in real-life work as data engineers. We will be using PySpark and Spark SQL for development. At the end of the course we also cover a few case studies.
itversity/data-engineering-spark
commit-live-students/Data_Science_Masters_Program_2021