gchatzip's Stars
sindresorhus/awesome
😎 Awesome lists about all kinds of interesting topics
open-guides/og-aws
📙 Amazon Web Services — a practical guide
afshinea/stanford-cs-229-machine-learning
VIP cheatsheets for Stanford's CS 229 Machine Learning
mahmoud/awesome-python-applications
💿 Free software that works great, and also happens to be open-source Python.
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
zziz/pwc
This repository is no longer maintained.
Miserlou/Zappa
Serverless Python
miguelgrinberg/flasky
Companion code to my O'Reilly book "Flask Web Development", second edition.
yzhao062/pyod
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
jupyter/docker-stacks
Ready-to-run Docker images containing Jupyter applications
delta-io/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
ept/ddia-references
Literature references for “Designing Data-Intensive Applications”
puckel/docker-airflow
Docker Apache Airflow
databricks/koalas
Koalas: pandas API on Apache Spark
awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
kubeflow/spark-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
pynamodb/PynamoDB
A pythonic interface to Amazon's DynamoDB
tpolecat/doobie
Functional JDBC layer for Scala.
apenwarr/redo
Smaller, easier, more powerful, and more reliable than make. An implementation of djb's redo.
linkedin/dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
pyjanitor-devs/pyjanitor
Clean APIs for data cleaning. Python implementation of the R package janitor
qubole/sparklens
Qubole Sparklens tool for performance tuning Apache Spark
markdrago/pgsanity
Check syntax of PostgreSQL SQL files
shawnbrown/datatest
Tools for test driven data-wrangling and data validation.
awslabs/amazon-redshift-monitoring
Amazon Redshift Advanced Monitoring
uwescience/myria
Myria is a scalable Analytics-as-a-Service platform based on relational algebra.
dbt-labs/redshift
Redshift package for dbt (getdbt.com)
BigDataAnalyticsGroup/bigdataengineering
Educational material for big data engineering courses
TomMalkin/SimQLe
The simplest way to use SQL in Python
uwescience/myria-python
Python utilities for exercising Myria's REST API