/awesome-kedro

Plugins, extensions, case studies, articles, and video tutorials for Kedro

Awesome Kedro

An opinionated Python framework for creating reproducible, maintainable and modular data science code.

This open-source repository collects anything related to Kedro, such as blog posts, example projects, plugins, videos, and more.

Got something to include? Add your own work to the relevant section with a PR.

Contents

Awards and highlights

Blog posts

In no particular order:

For more:

Companies using Kedro

There are Kedro users across the world, who work at start-ups, major enterprises and academic institutions like Absa, Acensi, Advanced Programming Solutions SL, AI Singapore, AMAI GmbH, Anacision GmbH, Augment Partners, AXA UK, Belfius, Beamery, Caterpillar, CRIM, Dendra Systems, Element AI, GetInData, GMO, Indicium, Imperial College London, ING, Jungle Scout, Helvetas, Leapfrog, McKinsey & Company, Mercado Libre Argentina, Modec, Mosaic Data Science, NaranjaX, NASA, NHS AI Lab, Open Data Science LatAm, Prediqt, Prospect, QuantumBlack, ReSpo.Vision, Retrieva, Roche, Sber, Société Générale, Telkomsel, Universidad Rey Juan Carlos, UrbanLogiq, Wildlife Studios, WovenLight and XP.

Example projects

  • find-kedro - Automatically construct pipelines using pytest-style pattern matching.
  • kedro-accelerator - Speeds up pipelines by parallelizing I/O in the background.
  • kedro-airflow - Makes it easy to deploy Kedro projects to Airflow.
  • kedro-airflow-k8s - Enables running a Kedro pipeline with Airflow on a Kubernetes cluster.
  • kedro-argo - Converts Kedro pipelines to Argo pipelines.
  • kedro-auto-catalog - A configurable replacement for kedro catalog create that allows you to create default dataset types other than MemoryDataset.
  • kedro-azureml - Enables running a Kedro pipeline with Azure ML Pipelines service.
  • kedro-dataframe-dropin - Lets you swap out pandas datasets for Modin or RAPIDS equivalents to speed up specialised workflows (e.g. on GPUs).
  • kedro-datasets - A collection of Kedro data connectors.
  • kedro-docker - Makes it easy to package Kedro projects with Docker.
  • kedro-dolt - Allows you to expand the data versioning abilities of data scientists and engineers.
  • kedro-fast-api - Makes it easy to create a FastAPI service for deploying the models in a Kedro project.
  • kedro-great - The easiest way to integrate Kedro and Great Expectations.
  • kedro-grpc-server - Creates a gRPC server for your kedro pipelines.
  • kedro-kubeflow - Lets you run and schedule pipelines on Kubernetes clusters using Kubeflow Pipelines.
  • kedro-mlflow - Allows usage of MLflow in Kedro projects.
  • kedro-neptune - Integration of Kedro with Neptune.ai.
  • kedro-pandas-profiling - "Profiles" data in the catalog. (⚠️public archive)
  • kedro-pandera - Integration of Kedro with Pandera to provide catalog-level data validation.
  • kedro-partitioned - Extends the functionality on processing partitioned data.
  • kedro-sagemaker - Enables running a Kedro pipeline with Amazon SageMaker service.
  • kedro-snowflake - Enables running full Kedro pipelines in Snowflake.
  • kedro-softfail-runner - Custom Kedro runner that enables soft-failing pipelines.
  • kedro-static-viz - Generates a static Kedro-Viz site (HTML, CSS, JS).
  • kedro-viz - Helps visualise Kedro data and analytics pipelines.
  • kedro-vertexai - Enables running a Kedro pipeline with Vertex AI Pipelines service.
  • kedro-wings - Automatically creates catalog entries to simplify Kedro pipeline writing.
  • more-kedro - (Hook) library for on the fly typing and validation of parameter dictionaries and default value backed data loading.
  • steel-toes - Prevent changing downstream catalog data on your teammates while developing in parallel.
  • vineyard-kedro - Custom DataSet and Runner which enable sharing intermediate data between tasks in Kedro pipelines using Vineyard, a cloud-native in-memory object manager.
  • kedro-tf-image - Kedro pipelines for preprocessing images using TensorFlow.
  • kedro-graphql - A Kedro plugin for serving Kedro projects as GraphQL APIs.
  • kedro-boot - Integrate your Kedro project with any application.
  • kedro-popmon - A Kedro plugin for integration of popmon capabilities.
  • kedro-expectations - Adds data validation to Kedro pipelines using an up-to-date version of Great Expectations.

For more:

Videos

Intros

Howtos