mmarich1's Stars
public-apis/public-apis
A collective list of free APIs
kamranahmedse/developer-roadmap
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
DataTalksClub/data-engineering-zoomcamp
Free Data Engineering course!
DataTalksClub/mlops-zoomcamp
Free MLOps course from DataTalks.Club
great-expectations/great_expectations
Always know what to expect from your data.
googleapis/google-api-python-client
🐍 The official Python client library for Google's discovery based APIs.
git-ecosystem/git-credential-manager
Secure, cross-platform Git credential storage with authentication to GitHub, Azure Repos, and other popular Git hosting services.
shobrook/rebound
Get Stack Overflow results in your terminal whenever an error is thrown
toddwschneider/nyc-taxi-data
Import public NYC taxi and for-hire vehicle (Uber, Lyft) trip data into a PostgreSQL or ClickHouse database
danielbeach/data-engineering-practice
Data Engineering Practice Problems
graphql-python/gql
A GraphQL client in Python
ubahnverleih/WoBike
Documentation of Bike Sharing APIs 🚴🛴🛵
samapriya/awesome-gee-community-datasets
Community Datasets added by users and made available for use at large
facebookresearch/bitsandbytes
Library for 8-bit optimizers and quantization routines.
bytewax/awesome-public-real-time-datasets
A list of publicly available datasets with real-time data maintained by the team at bytewax.io
astronomer/astro-cli
CLI that makes it easy to create, test and deploy Airflow DAGs to Astronomer
eakmanrq/sqlframe
Turning PySpark Into a Universal DataFrame API
autometrics-dev/autometrics-py
Easily add metrics to your code that actually help you spot and debug issues in production. Built on Prometheus and OpenTelemetry.
aws/amazon-redshift-python-driver
Redshift Python Connector. It supports Python Database API Specification v2.0.
canimus/cuallee
Possibly the fastest DataFrame-agnostic quality check library in town.
josephmachado/data_engineering_best_practices
Sample project to demonstrate data engineering best practices
astronomer/airflow-dbt-demo
A repository of sample code to accompany our blog post on Airflow and dbt.
axel-sirota/productionalizing-data-pipelines-airflow
Productionalizing Data Pipelines with Apache Airflow
borjavb/dbt-iceberg-poc
darioradecic/python-1-billion-row-challenge
benniehaelen/delta-lake-up-and-running
Companion repository for the book 'Delta Lake Up and Running'
danb-neo4j/patient_journey
GDS Patient Journey Demo
kafkanetes/minikan
Small-scale Kafka cluster setup on Kubernetes, perfect for functionality testing and development
Williamdst/Bike-Share-USA
The project aims to build a model that predicts how many bike share stations should be created in the zip codes surrounding a company’s current station network. The company could then use the model as a guide for an expansion effort.
lucapug/nyc-bike-analytics
Capstone project for DTC DE Zoomcamp 2024