YanicM's Stars
erezsh/reladiff
High-performance diffing of large datasets across databases
capitalone/datacompy
Pandas, Polars, and Spark DataFrame comparison for humans and more!
JoshData/python-email-validator
A robust email syntax and deliverability validation library for Python.
turntable-so/turntable
The open-source analytics development platform
PrefectHQ/actions-prefect-deploy
A GitHub Action for deploying a Prefect flow to Prefect Cloud
Hiflylabs/awesome-dbt
A curated list of awesome dbt resources
quarylabs/sqruff
Fast SQL formatter/linter
cube-js/cube
📊 Cube — The Semantic Layer for Building Data Applications
DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
PicnicSupermarket/dbt-score
Linter for dbt metadata
BasedHardware/OpenGlass
Turn any glasses into AI-powered smart glasses
DataRecce/recce
The dbt data-validation toolkit for teams that care about building better data
astral-sh/uv
An extremely fast Python package and project manager, written in Rust.
charlax/professional-programming
A collection of learning resources for curious software engineers
charlax/python-education
Reading list for ramping up with professional Python
priyankavergadia/google-cloud-4-words
The Google Cloud Developer's Cheat Sheet
hack4impact/flask-base
A simple Flask boilerplate app with SQLAlchemy, Redis, User Authentication, and more.
oleg-agapov/data-engineering-book
Accumulated knowledge and experience in the field of Data Engineering
papers-we-love/papers-we-love
Papers from the computer science community to read and discuss.
DataTalksClub/awesome-data-podcasts
A list of awesome data podcasts
szagoruyko/pytorchviz
A small package to create visualizations of PyTorch execution graphs
dpgaspar/Flask-AppBuilder
Simple and rapid application development framework, built on top of Flask. Includes detailed security, auto CRUD generation for your models, Google Charts, and much more. Demo (login with guest/welcome) - http://flaskappbuilder.pythonanywhere.com/
quantifiedcode/python-anti-patterns
An open collection of Python anti-patterns and worst practices.
PacktPublishing/Transformers-for-Natural-Language-Processing
Transformers for Natural Language Processing, published by Packt
thunlp/PLMpapers
Must-read Papers on pre-trained language models.
Lightning-AI/pytorch-lightning
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
uberspot/OpenTriviaQA
A Creative Commons dataset of trivia questions and answers
taivop/joke-dataset
A dataset of 200k English plaintext jokes.
amoudgl/short-jokes-dataset
Python scripts for building the 'Short Jokes' dataset, featured on Kaggle