Notification

This is a test repo. The released version is here.

DeepWalk

The implementation supports multi-process training on CPU, as well as mixed training with CPU and multiple GPUs.

Dependencies

  • PyTorch 1.0.1+

Tested versions

  • PyTorch 1.5.0
  • DGL 0.4.3

How to run the code

Format of a network file (each line lists one edge as a pair of node IDs):

1(node id) 2(node id)
1 3
...
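
As an illustration, here is a minimal sketch that writes a toy graph in this format (the edges are made up, and net.txt simply matches the file name used in the command below):

# Write a toy edge list in the "node_id node_id" format shown above.
# The edges are made up purely for illustration.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
with open("net.txt", "w") as f:
    for src, dst in edges:
        f.write(f"{src} {dst}\n")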

To run the code (this example enables mixed CPU/multi-GPU training with the Adam optimizer, 4 worker processes, a batch size of 100, and 5 negative samples):

python3 deepwalk.py --net_file net.txt --emb_file emb.txt --adam --mix --lr 0.2 --num_procs 4 --batch_size 100 --negative 5

How to save the embedding

Functions:

SkipGramModel.save_embedding(dataset, file_name)
SkipGramModel.save_embedding_txt(dataset, file_name)
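
A minimal usage sketch; model and dataset are placeholders for the trained SkipGramModel and the dataset object built in deepwalk.py, and the comments about output formats are assumptions rather than confirmed behavior:

# Hypothetical post-training calls: `model` is a trained SkipGramModel,
# `dataset` is the dataset object used during training (both placeholders).
model.save_embedding(dataset, "emb.npy")       # assumed binary (NumPy) output
model.save_embedding_txt(dataset, "emb.txt")   # assumed plain-text output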

Evaluation

To evaluate the embeddings on multi-label classification, please refer to here.

Results on YouTube (1M nodes); the percentage columns denote the fraction of labeled nodes used to train the classifier.

Implementation       Macro-F1 (%)                            Micro-F1 (%)
                     1%      3%      5%      7%      9%      1%      3%      5%      7%      9%
gensim.word2vec(hs)  28.73   32.51   33.67   34.28   34.79   35.73   38.34   39.37   40.08   40.77
gensim.word2vec(ns)  28.18   32.25   33.56   34.60   35.22   35.35   37.69   38.08   40.24   41.09
ours                 24.58   31.23   33.97   35.41   36.48   38.93   43.17   44.73   45.42   45.92
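
The linked script is the authoritative evaluation; as a rough illustration, here is a minimal sketch of the usual protocol (one-vs-rest logistic regression trained on a fraction of labeled nodes, scored with Micro-/Macro-F1). The embeddings and labels below are random stand-ins, and the 5% training ratio mirrors one column of the table:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Random stand-ins: X would be the learned node embeddings, Y the
# multi-hot label matrix of the dataset (both hypothetical here).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 128))
Y = rng.integers(0, 2, size=(1000, 5))

# Train on a small fraction of labeled nodes, e.g. the "5%" column.
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, train_size=0.05, random_state=0)

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_tr, Y_tr)
pred = clf.predict(X_te)

print("Micro-F1:", f1_score(Y_te, pred, average="micro", zero_division=0))
print("Macro-F1:", f1_score(Y_te, pred, average="macro", zero_division=0))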

A comparison of running times is shown below, where the numbers in brackets denote the time spent on the random walks.

Implementation   gensim.word2vec(hs)   gensim.word2vec(ns)   Ours
Time (s)         27119.6 (1759.8)      10580.3 (1704.3)      428.89

Parameters:

  • walk_length = 80, number_walks = 10, window_size = 5
  • Ours: 4 GPUs (Tesla V100), lr = 0.2, batch_size = 128, neg_weight = 5, negative = 1, num_thread = 4
  • Others: workers = 8, negative = 5

Speedup with mixed CPU and multi-GPU training; the parameters are the same as above.

#GPUs      1         2        4
Time (s)   1419.64   952.04   428.89
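
As a quick arithmetic check, the relative speedup follows directly from these times (a tiny sketch using only the numbers in the table):

# Speedup relative to the single-GPU run, using the times from the table.
times = {1: 1419.64, 2: 952.04, 4: 428.89}
base = times[1]
for gpus, t in times.items():
    print(f"{gpus} GPU(s): {base / t:.2f}x speedup")  # 1.00x, 1.49x, 3.31x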