xinnyuann's Stars
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
mhagiwara/100-nlp-papers
100 Must-Read NLP Papers
textvec/textvec
Text vectorization tool to outperform TFIDF for classification tasks
drabastomek/learningPySpark
Code base for the Learning PySpark book (in preparation)
ogozuacik/one-class-drift-detection
Unsupervised concept drift detection with one-class classifiers
DataTalksClub/machine-learning-zoomcamp
Learn ML engineering for free in 4 months!
apache/pinot
Apache Pinot - A realtime distributed OLAP datastore
chiphuyen/machine-learning-systems-design
A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems"
GokuMohandas/mlops-course
Learn how to design, develop, deploy and iterate on production-grade ML applications.
logancyang/my-cs-degree
A CS degree with a focus on full-stack ML engineering, 2020
SeldonIO/alibi-detect
Algorithms for outlier, adversarial and drift detection
PacktPublishing/Data-Engineering-with-Apache-Spark-Delta-Lake-and-Lakehouse
Data Engineering with Spark and Delta Lake
great-expectations/great_expectations
Always know what to expect from your data.
awesomedata/awesome-public-datasets
A topic-centric list of HQ open datasets.
fivethirtyeight/data
Data and code behind the articles and graphics at FiveThirtyEight
CamDavidsonPilon/lifetimes
Lifetime value in Python
JWarmenhoven/ISLR-python
An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani, 2013): Python code
fastai/fastbook
The fastai book, published as Jupyter Notebooks
Greenstand/treetracker-machine-learning
Greenstand's data analysis repository
lazyprogrammer/machine_learning_examples
A collection of machine learning examples and tutorials.
giampaolo/psutil
Cross-platform lib for process and system monitoring in Python
datawhalechina/pumpkin-book
Detailed explanations of the formulas in "Machine Learning" (the Watermelon Book)
datapane/datapane
Build and share data reports in 100% Python
visgl/deck.gl
WebGL2 powered visualization framework
mrpowers-io/spark-daria
Essential Spark extensions and helper methods ✨😲
FilippoBovo/production-data-science
Production Data Science: a workflow for collaborative data science aimed at production
facebook/prophet
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
jphall663/awesome-machine-learning-interpretability
A curated list of awesome responsible machine learning resources.
sindresorhus/awesome
😎 Awesome lists about all kinds of interesting topics
StanfordHCI/termite
(development moved to new repos)