TensorFlow Ranking

TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform. It contains the following components:

- Commonly used loss functions including pointwise, pairwise, and listwise losses.
- Commonly used ranking metrics like Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG).
- Multi-item (also known as groupwise) scoring functions.
- LambdaLoss implementation for direct ranking metric optimization.
- Unbiased Learning-to-Rank from biased feedback data.
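
To make the two metrics named above concrete, here is a minimal pure-Python sketch of how MRR and NDCG are commonly defined. It is not the library's implementation: the function names are hypothetical, and the gain `2^rel - 1` with a `log2` discount is one common convention assumed here for illustration.

```python
import math
from typing import Optional, Sequence


def reciprocal_rank(relevances: Sequence[float]) -> float:
    """Return 1/rank of the first relevant item (relevance > 0), else 0.0.

    `relevances` lists the true relevance labels in the order the ranker
    placed the items (best-scored item first).
    """
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries: Sequence[Sequence[float]]) -> float:
    """Average the reciprocal rank over all queries."""
    return sum(reciprocal_rank(q) for q in queries) / len(queries)


def dcg(relevances: Sequence[float], k: Optional[int] = None) -> float:
    """Discounted cumulative gain over the top-k items (all items if k is None)."""
    return sum(
        (2.0 ** rel - 1.0) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg(relevances: Sequence[float], k: Optional[int] = None) -> float:
    """DCG normalized by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Two hypothetical queries: labels are shown in ranked order.
    print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))
    print(ndcg([3, 2, 1]))
```

Both metrics depend only on the order of the labels, which is why they cannot be optimized directly by gradient descent and why ranking-specific losses are needed.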

We envision this library as a convenient open platform for hosting and advancing state-of-the-art ranking models based on deep learning techniques, thereby facilitating both academic research and industrial applications.

A quick demo of a ranker on a dummy dataset (no setup required):

For more details on this code and data, look at the section on Example Code.

Linux Installation

Stable Builds

To install the latest version from PyPI, run the following:

# Installing with the `--upgrade` flag ensures you'll get the latest version.
pip install --user --upgrade tensorflow_ranking

To force a Python 3-specific install, replace pip with pip3 in the command above. For additional installation help, guidance on installing prerequisites, and (optionally) setting up virtual environments, see the TensorFlow installation guide.

Note: Since TensorFlow is not included as a dependency of the TensorFlow Ranking package (in setup.py), you must explicitly install the TensorFlow package (tensorflow or tensorflow-gpu). This allows us to maintain one package instead of separate packages for CPU and GPU-enabled TensorFlow.
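
Because TensorFlow has to be installed separately, it can help to check that both packages are importable before running anything else. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical and not part of TensorFlow Ranking.

```python
import importlib.util
from typing import List, Sequence


def missing_packages(names: Sequence[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["tensorflow", "tensorflow_ranking"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("tensorflow and tensorflow_ranking are both importable.")
```

`find_spec` only locates a module without importing it, so the check is fast and does not initialize TensorFlow.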

Installing from Source

To build TensorFlow Ranking locally, you will need to install:
- Bazel, an open source build tool.
```
$ sudo apt-get update && sudo apt-get install bazel
```
- Pip, a Python package manager.
```
$ sudo apt-get install python-pip
```
- VirtualEnv, a tool to create isolated Python environments.
```
$ pip install --user virtualenv
```

Clone the TensorFlow Ranking repository.

$ git clone https://github.com/tensorflow/ranking.git

Build the TensorFlow Ranking wheel file and store it in the /tmp/ranking_pip folder.

$ cd ranking  # The folder which was cloned in the previous step.
$ bazel build //tensorflow_ranking/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow_ranking/tools/pip_package/build_pip_package /tmp/ranking_pip

Install the wheel package using pip. Test in a virtualenv to avoid clashes with any system dependencies.

$ ~/.local/bin/virtualenv -p python3 /tmp/tfr
$ source /tmp/tfr/bin/activate
(tfr) $ pip install tensorflow  #  or tensorflow-gpu, if GPU support is needed.
(tfr) $ pip install /tmp/ranking_pip/tensorflow_ranking*.whl

Run all TensorFlow Ranking tests.

(tfr) $ bazel test //tensorflow_ranking/...

Invoke the TensorFlow Ranking package in Python (within the virtualenv).
```
(tfr) $ python -c "import tensorflow_ranking"
```

Example Code

The repository includes an example script that runs on a dummy dataset in the LIBSVM format.
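
For readers unfamiliar with LIBSVM-style ranking data, the sketch below parses one line into a label, a query id, and sparse features. It assumes the common ranking variant of the format, `<label> qid:<id> <feature>:<value> ...` with an optional trailing `#` comment. The function name and the sample line are hypothetical, and this is not the parser used by the example script.

```python
from typing import Dict, Tuple


def parse_libsvm_line(line: str) -> Tuple[float, str, Dict[int, float]]:
    """Parse '<label> qid:<id> <feature>:<value> ...' into its parts."""
    # Drop an optional trailing comment, then split on whitespace.
    tokens = line.split("#", 1)[0].split()
    if not tokens:
        raise ValueError("empty line")

    label = float(tokens[0])
    qid = None
    features: Dict[int, float] = {}
    for token in tokens[1:]:
        key, value = token.split(":", 1)
        if key == "qid":
            qid = value
        else:
            features[int(key)] = float(value)

    if qid is None:
        raise ValueError("missing qid field")
    return label, qid, features


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the repository's data files.
    print(parse_libsvm_line("2 qid:1 1:0.5 3:0.25 # doc-a"))
```

Lines that share a `qid` belong to the same query and form one ranked list. Features absent from a line are usually treated as zero.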

Running Script

Set up the data and directory.

OUTPUT_DIR=/tmp/output && \
TRAIN=tensorflow_ranking/examples/data/train.txt && \
VALI=tensorflow_ranking/examples/data/vali.txt && \
TEST=tensorflow_ranking/examples/data/test.txt

Build and run.

rm -rf $OUTPUT_DIR && \
bazel build -c opt \
tensorflow_ranking/examples/tf_ranking_libsvm_py_binary && \
./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary \
--train_path=$TRAIN \
--vali_path=$VALI \
--test_path=$TEST \
--output_dir=$OUTPUT_DIR \
--num_features=136 \
--num_train_steps=100

TensorBoard

The training results, such as loss and metrics, can be visualized using TensorBoard.

(Optional) If you are working on a remote server, set up port forwarding with this command.
```
$ ssh <remote-server> -L 8888:127.0.0.1:8888
```

Install TensorBoard and invoke it with the following commands.

(tfr) $ pip install tensorboard
(tfr) $ tensorboard --logdir $OUTPUT_DIR

Jupyter Notebook

An example Jupyter notebook using the LIBSVM format is available in tensorflow_ranking/examples/tf_ranking_libsvm.ipynb.

To run this notebook, first follow the installation steps above to set up a virtualenv environment with the tensorflow_ranking package installed.
Install Jupyter within the virtualenv.
```
(tfr) $ pip install jupyter
```

Start a Jupyter notebook instance on the remote server.

(tfr) $ jupyter notebook tensorflow_ranking/examples/tf_ranking_libsvm.ipynb \
        --NotebookApp.allow_origin='https://colab.research.google.com' \
        --port=8888

(Optional) If you are working on a remote server, set up port forwarding with this command.
```
$ ssh <remote-server> -L 8888:127.0.0.1:8888
```
Running the notebook.
- Open the Jupyter notebook on your local machine at http://localhost:8888/ and browse to the IPython notebook.
- Alternatively, use a Colaboratory notebook via colab.research.google.com and open the notebook in the browser. Choose the local runtime and connect to port 8888.
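
If the browser or Colaboratory cannot reach the notebook, one quick check is whether anything is listening on the forwarded port. The following is a minimal sketch using only the standard library; the helper name `port_is_open` is hypothetical, and the host and port match the values used in the steps above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if port_is_open("127.0.0.1", 8888):
        print("Something is listening on port 8888.")
    else:
        print("Nothing reachable on port 8888; check the Jupyter server and the SSH tunnel.")
```

A successful connection only shows that the port is reachable. It does not confirm that the listener is the Jupyter server.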

References

Rama Kumar Pasumarthi, Xuanhui Wang, Cheng Li, Sebastian Bruch, Michael Bendersky, Marc Najork, Jan Pfeifer, Nadav Golbandi, Rohan Anil, Stephan Wolf. TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank. CoRR abs/1812.00073 (2018).
Qingyao Ai, Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Marc Najork. Learning Groupwise Scoring Functions Using Deep Neural Networks. CoRR abs/1811.04415 (2018).
Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. Learning to Rank with Selection Bias in Personal Search. SIGIR 2016.
Xuanhui Wang, Cheng Li, Nadav Golbandi, Mike Bendersky, Marc Najork. The LambdaLoss Framework for Ranking Metric Optimization. CIKM 2018.

Citation

If you use TensorFlow Ranking in your research and would like to cite it, we suggest you use the following citation:

   @misc{TensorflowRanking2018,
   author = {Rama Kumar Pasumarthi and Xuanhui Wang and Cheng Li and Sebastian Bruch and Michael Bendersky and Marc Najork and Jan Pfeifer and Nadav Golbandi and Rohan Anil and Stephan Wolf},
   title = {TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank},
   year = {2018},
   eprint = {arXiv:1812.00073},
   }
