akg003's Stars
karanpratapsingh/system-design
Learn how to design systems at scale and prepare for system design interviews
donnemartin/interactive-coding-challenges
120+ interactive Python coding interview challenges (algorithms and data structures). Includes Anki flashcards.
DataTalksClub/data-engineering-zoomcamp
Free Data Engineering course!
nsqio/nsq
A realtime distributed messaging platform
hyperledger/fabric
Hyperledger Fabric is an enterprise-grade permissioned distributed ledger framework for developing solutions and applications. Its modular and versatile design satisfies a broad range of industry use cases. It offers a unique approach to consensus that enables performance at scale while preserving privacy.
donnemartin/awesome-aws
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
theanalyst/awesome-distributed-systems
A curated list to learn about distributed systems
priyankavergadia/google-cloud-4-words
The Google Cloud Developer's Cheat Sheet
getmoto/moto
A library that allows you to easily mock out tests based on AWS infrastructure.
aws/aws-sdk-pandas
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
awslabs/amazon-redshift-utils
Amazon Redshift Utils contains utilities, scripts, and views that are useful in a Redshift environment
datawithdanny/sql-masterclass
aws-samples/aws-glue-samples
AWS Glue code samples
ajbosco/dag-factory
Dynamically generate Apache Airflow DAGs from YAML configuration files
shawlu95/Beyond-LeetCode-SQL
Analysis of SQL Leetcode and classic interview questions, common pitfalls, anti-patterns and handy tricks. Sample databases.
benjaminp/six
Python 2 and 3 compatibility library
cdapio/cdap
An open source framework for building data analytics applications.
awsdocs/aws-glue-developer-guide
The open source version of the AWS Glue docs. You can submit feedback & requests for changes by submitting issues in this repo or by making proposed changes & submitting a pull request.
spirom/spark-streaming-with-kafka
Self-contained examples of Apache Spark streaming integrated with Apache Kafka.
aws/aws-emr-best-practices
A best practices guide for using AWS EMR. The guide covers cost, performance, security, operational excellence, reliability, and application-specific best practices across Spark, Hive, Hudi, HBase, and more.
NIteshx2/AdvancedSQL_Interview
Analysis of SQL Leetcode and classic interview questions. Common pitfalls, anti-patterns and handy tricks are discussed. Sample databases are provided.
akashmehta10/profiling_pyspark
GigahexHQ/console
Open source data infrastructure platform. Designed for developers, built for speed.
akashmehta10/cdc_pyspark_hive
NeerajBhadani/spark-streaming
This repository contains code for Spark Streaming
avensolutions/cdc-at-scale-using-spark
Scalable CDC Pattern Implemented using PySpark
deesebc/PostExamples
A repository that includes examples from Spanish posts
MatdeB-SL/Spark-Performance---Cycle-Hire-Data
ShafiqaIqbal/SFTP-S3-Glue-Ingestion-Python
Glue Batch ingestion job to move files from file server to S3
ghoshm21/spark_book
Book PDF and code