Pinned Repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
aws-etl-orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
aws-lambda-cheatsheet
AWS Lambda cheatsheet.
bonobo-sqlalchemy
PREVIEW - SQL databases in Bonobo, using SQLAlchemy
butterfree
A tool for building feature stores.
content-aws-database-specialty
For AWS Database Specialty Course
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
etl
simple ETL example
Nishu9999's Repositories
Nishu9999/snowflake-connector-python
Snowflake Connector for Python
Nishu9999/stellar-etl-airflow
Airflow DAGs for the Stellar ETL project
Nishu9999/content-aws-database-specialty
For AWS Database Specialty Course
Nishu9999/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Nishu9999/butterfree
A tool for building feature stores.
Nishu9999/pyetl
Python ETL framework
Nishu9999/spark
Apache Spark - A unified analytics engine for large-scale data processing
Nishu9999/snowplow
Cloud-native web, mobile and event analytics, running on AWS and GCP
Nishu9999/sqlbucket
Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.
Nishu9999/Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
Nishu9999/stetl
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Nishu9999/pyspark-examples
Code examples on Apache Spark using Python
Nishu9999/datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Nishu9999/jupyter-python-nbk
Nishu9999/etl
simple ETL example
Nishu9999/Zero-to-Snowflake
Get started scripts with Snowflake - Build for the Cloud Data Warehouse
Nishu9999/pyetllib
Common tools for ETL scripting
Nishu9999/pyspark-tutorial-2
Jupyter notebooks for PySpark tutorials given at the university
Nishu9999/bonobo-sqlalchemy
PREVIEW - SQL databases in Bonobo, using SQLAlchemy
Nishu9999/goodreads_etl_pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Nishu9999/Udacity-Data-Engineering-Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing, and Data Lake development.
Nishu9999/aws-etl-orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Nishu9999/PySpark-ETL
PySpark-ETL
Nishu9999/pyspark-tutorial
PySpark Code for Hands-on Learners
Nishu9999/spark-select
A library for Spark DataFrame using MinIO Select API
Nishu9999/pyspark-etl-analytics
This repo contains code examples of processing and analysing data with Apache Spark and Python
Nishu9999/Spark_Packaged_project
This project contains PySpark jobs to create data pipelines and shows how to distribute the project package on a cluster.
Nishu9999/etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of code.
Nishu9999/aws-lambda-cheatsheet
AWS Lambda cheatsheet.
Nishu9999/snowflake-data-consortium
A test project for managing a peer-to-peer data sharing network in Snowflake