A recommendation framework for distributed model building and evaluation with PySpark.

Primary language: Python. License: Apache License 2.0.

RePlay

RePlay is a library providing tools for all stages of creating a recommendation system, from data preprocessing to model evaluation and comparison.

RePlay uses PySpark to handle big data.

You can:

  • Filter and split data
  • Train models
  • Optimize hyperparameters
  • Evaluate predictions with metrics
  • Combine predictions from different models
  • Create a two-level model

Docs

Documentation

Installation

Use a Linux machine with Python 3.7+, Java 8+, and a C++ compiler.
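Before installing, it can help to confirm that these prerequisites are visible on the machine. The following is a minimal sketch using only the Python standard library; it is not part of RePlay. The helper name `check_prerequisites` and the compiler names it looks for (`c++`, `g++`, `clang++`) are assumptions chosen for illustration. It only checks that a `java` executable is on the `PATH` and does not verify the Java version.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 7)):
    """Return a dict mapping each prerequisite name to True or False.

    Hypothetical helper for illustration; not part of RePlay.
    """
    return {
        # The running interpreter must be at least min_python.
        "python": sys.version_info >= min_python,
        # A `java` executable must be on PATH (the version is NOT checked).
        "java": shutil.which("java") is not None,
        # Any common C++ compiler name on PATH counts (assumed names).
        "cxx": any(shutil.which(c) is not None for c in ("c++", "g++", "clang++")),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If any entry is reported as missing, install it with your distribution's package manager before running the command below.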

pip install replay-rec

To get the latest development version of RePlay, install it from the GitHub repository. A virtual environment is recommended for the installation.
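One way to create a virtual environment is the standard-library `venv` module. The sketch below wraps `venv.create`; the helper name `create_env` and the directory name `.venv` used in the note are assumptions for illustration, not something RePlay prescribes.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path.

    Hypothetical helper for illustration; not part of RePlay.
    """
    env_path = Path(env_dir)
    # with_pip=True bootstraps pip inside the new environment,
    # so packages can be installed into it afterwards.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

For example, calling `create_env(".venv")` creates the environment in the current directory. On Linux it can then be activated with `source .venv/bin/activate` before running `pip install replay-rec`.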

If you encounter an error during RePlay installation, check the troubleshooting guide.