
A tool for building feature stores.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Butterfree

Made with ❤️ by the MLOps team from QuintoAndar

This library supports Python 3.6+ and is meant to provide tools for building ETL pipelines for Feature Stores using Apache Spark.

The library is centered on the following concepts:

  • ETL: central framework to create data pipelines. Spark-based Extract, Transform and Load modules ready to use.
  • Declarative Feature Engineering: care about what you want to compute and not how to code it.
  • Feature Store Modeling: the library easily provides everything you need to process and load data to your Feature Store.
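The declarative idea in the list above can be illustrated with a minimal, pure-Python sketch. This is not Butterfree's API and does not use Spark: `FeatureSpec`, `build_features`, and the sample columns `price` and `area` are hypothetical names invented for illustration, and `dataclasses` requires Python 3.7+. The point is only that features are declared as data (a name, a description, a computation) while a separate engine decides how to run them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical declaration: states what to compute, not how to run it."""

    name: str
    description: str
    compute: Callable[[Row], float]


def build_features(rows: List[Row], specs: List[FeatureSpec]) -> List[Row]:
    """Tiny stand-in for a pipeline engine: applies every declared feature to every row."""
    return [{spec.name: spec.compute(row) for spec in specs} for row in rows]


# Declarations: the user only lists the features they want.
specs = [
    FeatureSpec("price_per_m2", "Listing price divided by area", lambda r: r["price"] / r["area"]),
    FeatureSpec("is_large", "1.0 when area exceeds 100 m2", lambda r: float(r["area"] > 100)),
]

# Illustrative input rows (made-up values).
rows = [{"price": 300000.0, "area": 60.0}, {"price": 900000.0, "area": 150.0}]
features = build_features(rows, specs)
print(features)
```

In the real library the engine role is played by Spark-based modules; see the Wiki and notebook examples for the actual classes.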

To understand the main concepts of Feature Store modeling and the library's main features, check Butterfree's Wiki.

To learn how to use Butterfree in practice, see Butterfree's notebook examples.

Requirements and Installation

Butterfree depends on Python 3.6+ and it is Spark 3.0 ready ✔️

The Python Package Index hosts a reference to a pip-installable module of this library. Using it is as straightforward as including it in your project's requirements.

pip install quintoandar-butterfree --extra-index-url https://quintoandar.github.io/python-package-server/

Or after listing quintoandar-butterfree in your requirements.txt file:

pip install -r requirements.txt --extra-index-url https://quintoandar.github.io/python-package-server/

You can also access our preview build (unstable) by installing from the staging branch:

pip install git+https://github.com/quintoandar/butterfree.git@staging
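After installing by any of the methods above, one way to confirm that the package is visible to your interpreter is to query the installed distribution metadata. The sketch below is an assumption-laden helper, not part of Butterfree: `installed_version` is a hypothetical function name, and `importlib.metadata` is only available in the standard library from Python 3.8 (older interpreters would need the `importlib-metadata` backport).

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# "quintoandar-butterfree" is the distribution name used in the pip commands above.
version = installed_version("quintoandar-butterfree")
if version is None:
    print("quintoandar-butterfree is not installed in this environment")
else:
    print(f"quintoandar-butterfree {version} is installed")
```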

Documentation

The official documentation is hosted on Read the Docs.

License

TBD

Contributing

All contributions are welcome! Feel free to open Pull Requests. Check the development and contributing guidelines described here.