/gorse

An offline recommender system backend based on collaborative filtering written in Go

Primary LanguageGoApache License 2.0Apache-2.0

Language: English | 中文

gorse: Go Recommender System Engine

Build Coverage Report GoDoc RTD Demo
build codecov Go Report Card GoDoc Documentation Status Website

gorse is an offline recommender system backend based on collaborative filtering written in Go.

This project is aim to provide a high performance, easy-to-use, programming language irrelevant recommender micro-service based on collaborative filtering. We could build a simple recommender system on it, or set up a more sophisticated recommender system using candidates generated by it. It features:

  • Implements 7 rating based recommenders and 4 ranking based recommenders.
  • Supports data loading, data splitting, model training, model evaluation and model selection.
  • Provides the data import/export tool, model evaluation tool and RESTful recomender server.
  • Accelerates computations by SIMD instructions and multi-threading.

For more information:

  • Visit GoDoc for detailed documentation of codes.
  • Visit ReadTheDocs for tutorials, examples and usages.
  • Visit SteamLens for a Steam games recommender system based on gorse.

Install

  • Download from release.
  • Build from source:

Install Golang and run go get:

$ go get github.com/zhenghaoz/gorse/...

It will download all packages and build the gorse command line into your $GOBIN path.

If your CPU supports AVX2 and FMA3 instructions, use the avx2 build tag to enable AVX2 and FMA3 instructions.

$ go get -tags='avx2' github.com/zhenghaoz/gorse/...

Usage

gorse is an offline recommender system backend based on collaborative filtering written in Go.

Usage:
  gorse [flags]
  gorse [command]

Available Commands:
  export-feedback Export feedback to CSV
  export-items    Export items to CSV
  help            Help about any command
  import-feedback Import feedback from CSV
  import-items    Import items from CSV
  serve           Start a recommender sever
  test            Test a model by cross validation
  version         Check the version

Flags:
  -h, --help   help for gorse

Use "gorse [command] --help" for more information about a command.

Evaluate a Recommendation Model

gorse provides the tool to evaluate models. We can run gorse test -h or check online documents to learn its usage. For example:

$ gorse test bpr --load-csv u.data --csv-sep $'\t' --eval-precision --eval-recall --eval-ndcg --eval-map --eval-mrr
...
+--------------+----------+----------+----------+----------+----------+----------------------+
|              |  FOLD 1  |  FOLD 2  |  FOLD 3  |  FOLD 4  |  FOLD 5  |         MEAN         |
+--------------+----------+----------+----------+----------+----------+----------------------+
| Precision@10 | 0.321041 | 0.327128 | 0.321951 | 0.318664 | 0.317197 | 0.321196(±0.005931)  |
| Recall@10    | 0.212509 | 0.213825 | 0.213336 | 0.206255 | 0.210764 | 0.211338(±0.005083)  |
| NDCG@10      | 0.380665 | 0.385125 | 0.380003 | 0.369115 | 0.375538 | 0.378089(±0.008974)  |
| MAP@10       | 0.122098 | 0.123345 | 0.119723 | 0.116305 | 0.119468 | 0.120188(±0.003883)  |
| MRR@10       | 0.605354 | 0.601110 | 0.600359 | 0.577333 | 0.599930 | 0.596817(±0.019484)  |
+--------------+----------+----------+----------+----------+----------+----------------------+

u.data is the CSV file of ratings in MovieLens 100K dataset and u.item is the CSV file of items in MovieLens 100K dataset. All CLI tools are listed in the CLI-Tools section of Wiki.

Setup a Recommender Server

It's easy to setup a recomendation service with gorse.

  • Step 1: Import feedback and items.
$ gorse import-feedback ~/.gorse/gorse.db u.data --sep $'\t' --timestamp 2
$ gorse import-items ~/.gorse/gorse.db u.item --sep '|'

It imports feedback and items from CSV files into the database file ~/.gorse/gorse.db. The low level storage engine is implemented by BoltDB.

  • Step 2: Start a server.
$ gorse serve -c config.toml

It loads configurations from config.toml and start a recommendation server. It may take a while to generate all recommendations. Detailed information about configuration is in the Configuration section of Wiki. Before set hyper-parameters for the model, it is useful to test the performance of chosen hyper-parameters by the model evaluation tool.

  • Step 3: Get recommendations.
$ curl 127.0.0.1:8080/recommends/1?number=5

It requests 5 recommended items for the 1-th user. The response might be:

[
    {
        "ItemId": "919",
        "Popularity": 96,
        "Timestamp": "1995-01-01T00:00:00Z",
        "Score": 1
    },
    {
        "ItemId": "474",
        "Popularity": 194,
        "Timestamp": "1963-01-01T00:00:00Z",
        "Score": 0.9486470268850127
    },
    ...
]

"ItemId" is the ID of the item and "Score" is the score generated by the recommendation model used to rank. See RESTful APIs in Wiki for more information about RESTful APIs.

Use gorse in Go

Also, gorse could be imported and used in Go application. There is an example that fits a recommender and generate recommended items:

package main

import (
	"fmt"
	"github.com/zhenghaoz/gorse/base"
	"github.com/zhenghaoz/gorse/core"
	"github.com/zhenghaoz/gorse/model"
)

func main() {
	// Load dataset
	data := core.LoadDataFromBuiltIn("ml-100k")
	// Split dataset
	train, test := core.Split(data, 0.2)
	// Create model
	bpr := model.NewBPR(base.Params{
		base.NFactors:   10,
		base.Reg:        0.01,
		base.Lr:         0.05,
		base.NEpochs:    100,
		base.InitMean:   0,
		base.InitStdDev: 0.001,
	})
	// Fit model
	bpr.Fit(train, nil)
	// Evaluate model
	scores := core.EvaluateRank(bpr, test, train, 10, core.Precision, core.Recall, core.NDCG)
	fmt.Printf("Precision@10 = %.5f\n", scores[0])
	fmt.Printf("Recall@10 = %.5f\n", scores[1])
	fmt.Printf("NDCG@10 = %.5f\n", scores[1])
	// Generate recommendations for user(4):
	// Get all items in the full dataset
	items := core.Items(data)
	// Get user(4)'s ratings in the training dataset
	excludeItems := train.User("4")
	// Get top 10 recommended items (excluding rated items) for user(4) using BPR
	recommendItems, _ := core.Top(items, "4", 10, excludeItems, bpr)
	fmt.Printf("Recommend for user(4) = %v\n", recommendItems)
}

The output should be:

2019/11/14 08:07:45 Fit BPR with hyper-parameters: n_factors = 10, n_epochs = 100, lr = 0.05, reg = 0.01, init_mean = 0, init_stddev = 0.001
2019/11/14 08:07:45 epoch = 1/100, loss = 55451.70899118173
...
2019/11/14 08:07:49 epoch = 100/100, loss = 10093.29427682404
Precision@10 = 0.31699
Recall@10 = 0.20516
NDCG@10 = 0.20516
Recommend for 4-th user = [288 313 245 307 328 332 327 682 346 879]

Recommenders

There are 11 recommendation models implemented by gorse.

Model Data Task Multi-threading Fit
explicit implicit weight rating ranking
BaseLine ✔️ ✔️ ✔️
NMF ✔️ ✔️ ✔️
SVD ✔️ ✔️ ✔️
SVD++ ✔️ ✔️ ✔️ ✔️
KNN ✔️ ✔️ ✔️ ✔️
CoClustering ✔️ ✔️ ✔️ ✔️
SlopeOne ✔️ ✔️ ✔️ ✔️
ItemPop ✔️ ✔️ ✔️
KNN (Implicit) ✔️ ✔️ ✔️ ✔️ ✔️
WRMF ✔️ ✔️ ✔️ ✔️
BPR ✔️ ✔️ ✔️
  • Cross-validation of rating models on MovieLens 1M [Source].
Model RMSE MAE Time (AVX2)
SlopeOne 0.90683 0.71541 0:00:26
CoClustering 0.90701 0.71212 0:00:08
KNN 0.86462 0.67663 0:02:07
SVD 0.84252 0.66189 0:02:21 0:01:48
SVD++ 0.84194 0.66156 0:03:39 0:02:47
  • Cross-validation of ranking models on MovieLens 100K [Source].
Model Precision@10 Recall@10 MAP@10 NDCG@10 MRR@10 Time
ItemPop 0.19081 0.11584 0.05364 0.21785 0.40991 0:00:03
KNN 0.28584 0.19328 0.11358 0.34746 0.57766 0:00:41
BPR 0.32083 0.20906 0.11848 0.37643 0.59818 0:00:13
WRMF 0.34727 0.23665 0.14550 0.41614 0.65439 0:00:14

Performance

gorse is much faster than Surprise, and comparable to librec while using less memory space than both of them. The memory efficiency is achieved by sophisticated data structures.

  • Cross-validation of SVD on MovieLens 100K [Source]:

  • Cross-validation of SVD on MovieLens 1M [Source]:

Contributors

Any kind of contribution is expected: report a bug, give a advice or even create a pull request.

Acknowledgments

gorse is inspired by following projects:

Limitations

gorse has limitations and might not be applicable to some scenarios:

  • No Scalability: gorse is a recommendation service on a single host, so it's unable to handle large data.
  • No Features: gorse exploits interactions between items and users while features of items and users are ignored.