/smore

SMORe: Modularize Graph Embedding for Recommendation

Primary LanguageC++MIT LicenseMIT

Build Status license Gitter chat for developers at https://gitter.im/dmlc/xgboost

SMORe

This is a C++ framework for variant weighted network embedding techniques. We currently release the command line interface for following models:

In the near future, we will redesign the framework making some solid APIs for fast development on different network embedding techniques.

Developed Environment

  • g++ > 4.9 (In macOS, it needs OpenMP-enabled compilers. e.g. brew reinstall gcc6 --without-multilib)

Compilation

$ git clone https://github.com/cnclabs/smore
$ cd smore
$ make

Task

Given a network input:

userA itemA 3
userA itemC 5
userB itemA 1
userB itemB 5
userC itemA 4

The model learns the representations of each vertex:

6 5
userA 0.0815412 0.0205459 0.288714 0.296497 0.394043
itemA -0.207083 -0.258583 0.233185 0.0959801 0.258183
itemC 0.0185886 0.138003 0.213609 0.276383 0.45732
userB -0.0137994 -0.227462 0.103224 -0.456051 0.389858
itemB -0.317921 -0.163652 0.103891 -0.449869 0.318225
userC -0.156576 -0.3505 0.213454 0.10476 0.259673

Command Line Interface

Directly call the execution file to see the usage like:

./cli/deepwalk
./cli/walklets
./cli/line
./cli/hpe
./cli/app
./cli/mf
./cli/bpr
./cli/warp
./cli/hoprec

then you will see the options description like:

Options Description:
        -train <string>
                Train the Network data
        -save <string>
                Save the representation data
        -dimensions <int>
                Dimension of vertex representation; default is 64
        -undirected <int>
                Whether the edge is undirected; default is 1
        -negative_samples <int>
                Number of negative examples; default is 5
        -window_size <int>
                Size of skip-gram window; default is 5
        -walk_times <int>
                Times of being staring vertex; default is 10
        -walk_steps <int>
                Step of random walk; default is 40
        -threads <int>
                Number of training threads; default is 1
        -alpha <float>
                Init learning rate; default is 0.025
Usage:
./deepwalk -train net.txt -save rep.txt -undirected 1 -dimensions 64 -walk_times 10 -walk_steps 40 -window_size 5 -negative_samples 5 -alpha 0.025 -threads 1

Example Script

This shell script will help obtain the representations of the Youtube links in Youtube-links dataset.

cd example
sh train_youtube.sh

Changing the number of threads in train_youtube.sh could speedup the process.

Running with Docker

  • Running with locally built image
    • Building docker image which is created following the instructions of Dockerfile
    docker build -t smore:latest .
    • Running container instantiated by image.
    docker run -it --name smore --rm -v "$PWD":/usr/local/smore/data smore:latest model_name -train training_dataset -save embedding [model_options]
    • Example:
    docker run -it --name smore --rm -v "$PWD":/usr/local/smore/data smore:latest  hpe -train net.txt -save rep.txt  
  • Running with published image
    • Running smore.sh
    ./smore.sh model_name -train training_dataset -save embedding [model_options]
    • Example:
    ./smore.sh model_name -train training_dataset -save embedding [model_options]
    

Related Work

You can find related work from awesome-network-embedding.

Citation

@inproceedings{smore,
author = {Chen, Chih-Ming and Wang, Ting-Hsiang and Wang, Chuan-Ju and Tsai, Ming-Feng},
title = {SMORe: Modularize Graph Embedding for Recommendation},
year = {2019},
booktitle = {Proceedings of the 13th ACM Conference on Recommender Systems},
series = {RecSys ’19}
}
@article{pronet2017,
  title={Vertex-Context Sampling for Weighted Network Embedding},
  author={Chih-Ming Chen and Yi-Hsuan Yang and Yian Chen and Ming-Feng Tsai},
  journal={arXiv preprint arXiv:{1711.00227}},
  year={2017}
}

Note

for HOP-REC & CSE, it is required to assign the field of each vertex in "vertex field" form:

userA u
userB u
userC u
itemA i
itemB i
itemC i
itemD i

by -field argument.