BhuiyanMH's Stars
devlace/pytest-adf
Pytest plugin for writing Azure Data Factory Integration Tests
nteract/papermill
📚 Parameterize, execute, and analyze notebooks
jupyter/nbdime
Tools for diffing and merging of Jupyter notebooks.
unionai-oss/pandera
A light-weight, flexible, and expressive statistical data testing library
getmoto/moto
A library that allows you to easily mock out tests based on AWS infrastructure.
quiltdata/quilt
Quilt is a data mesh for connecting people with actionable data
iterative/dvc
🦉 ML Experiments and Data Management with Git
kedro-org/kedro-viz
Visualise your Kedro data and machine-learning pipelines and track your experiments.
kedro-org/kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.
dask/dask
Parallel computing with task scheduling
databricks/koalas
Koalas: pandas API on Apache Spark
great-expectations/great_expectations
Always know what to expect from your data.
PrefectHQ/prefect
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines
mlrun/mlrun
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.
zenml-io/zenml
ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.
NannyML/nannyml
nannyml: post-deployment data science in Python
DataTalksClub/mlops-zoomcamp
Free MLOps course from DataTalks.Club
databrickslabs/tempo
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, downsampling, and interpolation
oleksandrabovkun/spark-ui-simulator-experiments
drivendataorg/cloudpathlib
Python pathlib-style classes for cloud storage services such as Amazon S3, Azure Blob Storage, and Google Cloud Storage.
faridrashidi/kaggle-solutions
🏅 Collection of Kaggle Solutions and Ideas 🏅
databricks/Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
joho/awesome-code-review
An "Awesome" list of code review resources - articles, papers, tools, etc
oxnr/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
Victoriapm/awesome-analytics-engineering
Awesome list of resources for analytics engineers
DataTalksClub/data-engineering-zoomcamp
Free Data Engineering course!
abhishek-ch/around-dataengineering
A Data Engineering & Machine Learning Knowledge Hub
eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠