gorse is an offline recommender system backend based on collaborative filtering, written in Go.
This project aims to provide a high-performance, easy-to-use, language-agnostic recommender micro-service based on collaborative filtering. You can build a simple recommender system on top of it, or use the candidates it generates to set up a more sophisticated recommender system. It features:
- Implements 7 rating-based recommenders and 4 ranking-based recommenders.
- Supports data loading, data splitting, model training, model evaluation and model selection.
- Provides data import/export tools, a model evaluation tool and a RESTful recommender server.
- Accelerates computations by SIMD instructions and multi-threading.
For more information:
- Visit GoDoc for detailed code documentation.
- Visit ReadTheDocs for tutorials, examples and usage.
- Visit SteamLens for a Steam games recommender system based on gorse.
- Download from release.
- Build from source:
Install Go and run go get:
$ go get github.com/zhenghaoz/gorse/...
It will download all packages and build the gorse command-line program into your $GOBIN path.
If your CPU supports AVX2 and FMA3 instructions, use the avx2 build tag to enable them.
$ go get -tags='avx2' github.com/zhenghaoz/gorse/...
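After installing, gorse -h prints the usage shown below, and the version subcommand reports the installed version:
$ gorse version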
gorse is an offline recommender system backend based on collaborative filtering written in Go.
Usage:
gorse [flags]
gorse [command]
Available Commands:
export-feedback Export feedback to CSV
export-items Export items to CSV
help Help about any command
import-feedback Import feedback from CSV
import-items Import items from CSV
serve Start a recommender server
test Test a model by cross validation
version Check the version
Flags:
-h, --help help for gorse
Use "gorse [command] --help" for more information about a command.
gorse provides a tool to evaluate models. Run gorse test -h or check the online documentation to learn its usage. For example:
$ gorse test bpr --load-csv u.data --csv-sep $'\t' --eval-precision --eval-recall --eval-ndcg --eval-map --eval-mrr
...
+--------------+----------+----------+----------+----------+----------+----------------------+
| | FOLD 1 | FOLD 2 | FOLD 3 | FOLD 4 | FOLD 5 | MEAN |
+--------------+----------+----------+----------+----------+----------+----------------------+
| Precision@10 | 0.321041 | 0.327128 | 0.321951 | 0.318664 | 0.317197 | 0.321196(±0.005931) |
| Recall@10 | 0.212509 | 0.213825 | 0.213336 | 0.206255 | 0.210764 | 0.211338(±0.005083) |
| NDCG@10 | 0.380665 | 0.385125 | 0.380003 | 0.369115 | 0.375538 | 0.378089(±0.008974) |
| MAP@10 | 0.122098 | 0.123345 | 0.119723 | 0.116305 | 0.119468 | 0.120188(±0.003883) |
| MRR@10 | 0.605354 | 0.601110 | 0.600359 | 0.577333 | 0.599930 | 0.596817(±0.019484) |
+--------------+----------+----------+----------+----------+----------+----------------------+
u.data is the CSV file of ratings in the MovieLens 100K dataset and u.item is the CSV file of items in the MovieLens 100K dataset. All CLI tools are listed in the CLI-Tools section of the Wiki.
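In the evaluation output above, the MEAN column is the average over the five folds, and the value in parentheses appears to be the largest deviation of any single fold from that mean: for Precision@10, (0.321041 + 0.327128 + 0.321951 + 0.318664 + 0.317197) / 5 ≈ 0.321196, and the largest deviation is 0.327128 - 0.321196 ≈ 0.005931.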
It's easy to set up a recommendation service with gorse.
- Step 1: Import feedback and items.
$ gorse import-feedback ~/.gorse/gorse.db u.data --sep $'\t' --timestamp 2
$ gorse import-items ~/.gorse/gorse.db u.item --sep '|'
It imports feedback and items from CSV files into the database file ~/.gorse/gorse.db. The low-level storage engine is implemented by BoltDB.
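For reference, u.data is the tab-separated MovieLens 100K ratings file with one user ID, item ID, rating and timestamp per line, while u.item is the pipe-separated item file whose first fields are the item ID and title. A line of u.data looks like:
196	242	3	881250949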
- Step 2: Start a server.
$ gorse serve -c config.toml
It loads configurations from config.toml and starts a recommendation server. It may take a while to generate all recommendations. Detailed information about configuration is in the Configuration section of the Wiki. Before setting hyper-parameters for the model, it is useful to test the performance of the chosen hyper-parameters with the model evaluation tool.
- Step 3: Get recommendations.
$ curl 127.0.0.1:8080/recommends/1?number=5
It requests 5 recommended items for user 1. The response might be:
[
{
"ItemId": "919",
"Popularity": 96,
"Timestamp": "1995-01-01T00:00:00Z",
"Score": 1
},
{
"ItemId": "474",
"Popularity": 194,
"Timestamp": "1963-01-01T00:00:00Z",
"Score": 0.9486470268850127
},
...
]
"ItemId"
is the ID of the item and "Score"
is the score generated by the recommendation model used to rank. See RESTful APIs in Wiki for more information about RESTful APIs.
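Because the server exposes a plain HTTP/JSON interface, it can be consumed from any language. Below is a minimal Go client sketch for the endpoint shown above; the RecommendedItem struct merely mirrors the fields of the sample response and is not a type exported by gorse:
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// RecommendedItem mirrors the fields of the sample response above;
// it is an illustrative struct, not part of gorse's API.
type RecommendedItem struct {
	ItemId     string
	Popularity float64
	Timestamp  string
	Score      float64
}

func main() {
	// Ask the local gorse server for 5 recommendations for user 1.
	resp, err := http.Get("http://127.0.0.1:8080/recommends/1?number=5")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	// Decode the JSON array returned by the server.
	var items []RecommendedItem
	if err := json.NewDecoder(resp.Body).Decode(&items); err != nil {
		panic(err)
	}
	for _, item := range items {
		fmt.Printf("%s (score %.4f)\n", item.ItemId, item.Score)
	}
}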
Also, gorse can be imported and used in a Go application. Here is an example that fits a recommender and generates recommended items:
package main

import (
	"fmt"

	"github.com/zhenghaoz/gorse/base"
	"github.com/zhenghaoz/gorse/core"
	"github.com/zhenghaoz/gorse/model"
)

func main() {
	// Load dataset
	data := core.LoadDataFromBuiltIn("ml-100k")
	// Split dataset
	train, test := core.Split(data, 0.2)
	// Create model
	bpr := model.NewBPR(base.Params{
		base.NFactors:   10,
		base.Reg:        0.01,
		base.Lr:         0.05,
		base.NEpochs:    100,
		base.InitMean:   0,
		base.InitStdDev: 0.001,
	})
	// Fit model
	bpr.Fit(train, nil)
	// Evaluate model
	scores := core.EvaluateRank(bpr, test, train, 10, core.Precision, core.Recall, core.NDCG)
	fmt.Printf("Precision@10 = %.5f\n", scores[0])
	fmt.Printf("Recall@10 = %.5f\n", scores[1])
	fmt.Printf("NDCG@10 = %.5f\n", scores[2])
	// Generate recommendations for user(4):
	// Get all items in the full dataset
	items := core.Items(data)
	// Get user(4)'s ratings in the training dataset
	excludeItems := train.User("4")
	// Get top 10 recommended items (excluding rated items) for user(4) using BPR
	recommendItems, _ := core.Top(items, "4", 10, excludeItems, bpr)
	fmt.Printf("Recommend for user(4) = %v\n", recommendItems)
}
The output should be:
2019/11/14 08:07:45 Fit BPR with hyper-parameters: n_factors = 10, n_epochs = 100, lr = 0.05, reg = 0.01, init_mean = 0, init_stddev = 0.001
2019/11/14 08:07:45 epoch = 1/100, loss = 55451.70899118173
...
2019/11/14 08:07:49 epoch = 100/100, loss = 10093.29427682404
Precision@10 = 0.31699
Recall@10 = 0.20516
NDCG@10 = 0.20516
Recommend for user(4) = [288 313 245 307 328 332 327 682 346 879]
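To run the example, save it as main.go and execute it with the Go tool; core.LoadDataFromBuiltIn should download the built-in ml-100k dataset to a local cache on the first run:
$ go run main.go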
There are 11 recommendation models implemented by gorse.
Model | Data | | | Task | | Multi-threading Fit
---|---|---|---|---|---|---
 | explicit | implicit | weight | rating | ranking | 
BaseLine | ✔️ | ✔️ | ✔️ | |||
NMF | ✔️ | ✔️ | ✔️ | |||
SVD | ✔️ | ✔️ | ✔️ | |||
SVD++ | ✔️ | ✔️ | ✔️ | ✔️ | ||
KNN | ✔️ | ✔️ | ✔️ | ✔️ | ||
CoClustering | ✔️ | ✔️ | ✔️ | ✔️ | ||
SlopeOne | ✔️ | ✔️ | ✔️ | ✔️ | ||
ItemPop | ✔️ | ✔️ | ✔️ | |||
KNN (Implicit) | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | |
WRMF | ✔️ | ✔️ | ✔️ | ✔️ | ||
BPR | ✔️ | ✔️ | ✔️ |
- Cross-validation of rating models on MovieLens 1M [Source].
Model | RMSE | MAE | Time | Time (AVX2) |
---|---|---|---|---|
SlopeOne | 0.90683 | 0.71541 | 0:00:26 | |
CoClustering | 0.90701 | 0.71212 | 0:00:08 | |
KNN | 0.86462 | 0.67663 | 0:02:07 | |
SVD | 0.84252 | 0.66189 | 0:02:21 | 0:01:48 |
SVD++ | 0.84194 | 0.66156 | 0:03:39 | 0:02:47 |
- Cross-validation of ranking models on MovieLens 100K [Source].
Model | Precision@10 | Recall@10 | MAP@10 | NDCG@10 | MRR@10 | Time |
---|---|---|---|---|---|---|
ItemPop | 0.19081 | 0.11584 | 0.05364 | 0.21785 | 0.40991 | 0:00:03 |
KNN | 0.28584 | 0.19328 | 0.11358 | 0.34746 | 0.57766 | 0:00:41 |
BPR | 0.32083 | 0.20906 | 0.11848 | 0.37643 | 0.59818 | 0:00:13 |
WRMF | 0.34727 | 0.23665 | 0.14550 | 0.41614 | 0.65439 | 0:00:14 |
gorse is much faster than Surprise, and comparable to librec while using less memory than both of them. The memory efficiency is achieved by sophisticated data structures.
- Cross-validation of SVD on MovieLens 100K [Source]:
- Cross-validation of SVD on MovieLens 1M [Source]:
Any kind of contribution is welcome: report a bug, give advice, or even create a pull request.
gorse is inspired by the following projects:
gorse has limitations and might not be applicable to some scenarios:
- No Scalability: gorse is a recommendation service on a single host, so it's unable to handle large data.
- No Features: gorse exploits interactions between items and users, while features of items and users are ignored.