This is a set of scikit-learn compatible implementations of label propagation (LP) like algorithms. One can easily grid search and cross validate models using utils in scikit-learn.
- Harmonic Function (HMN) [Zhu+, ICML03]
- Local and Global Consistency (LGC) [Zhou+, NIPS04]
- Partially Absorbing Random Walk (PARW) [Wu+, NIPS12]
- OMNI-Prop (OMNIProp) [Yamaguchi+, AAAI15]
- Confidence-Aware Modulated Label Propagation (CAMLP) [Yamaguchi+, SDM16]
python main.py hmn -g sample.edgelist -l sample.label -o sample.output
$ cat sample.edgelist # [src node id] [dst node id]
0 1
1 2
2 3
$ cat sample.label # [node id] [label id]
1 0
2 1
$ cat sample.modulation # KxK matrix (K: no. of labels)
0 1
1 0
$ python main.py hmn -h
usage: main.py hmn [-h] -g GRAPHFILE -l LABELFILE [-o [OUTFILE]]
optional arguments:
-h, --help show this help message and exit
-g GRAPHFILE, --graphfile GRAPHFILE
input graph file
-l LABELFILE, --labelfile LABELFILE
input label file
-o [OUTFILE], --outfile [OUTFILE]
output file (default=STDOUT)
$ python main.py lgc -h
usage: main.py lgc [-h] -g GRAPHFILE -l LABELFILE [-o [OUTFILE]]
[--alpha [ALPHA]]
optional arguments:
-h, --help show this help message and exit
-g GRAPHFILE, --graphfile GRAPHFILE
input graph file
-l LABELFILE, --labelfile LABELFILE
input label file
-o [OUTFILE], --outfile [OUTFILE]
output file (default=STDOUT)
--alpha [ALPHA] alpha (default=0.99)
$ python main.py parw -h
usage: main.py parw [-h] -g GRAPHFILE -l LABELFILE [-o [OUTFILE]]
[--lamb [LAMB]]
optional arguments:
-h, --help show this help message and exit
-g GRAPHFILE, --graphfile GRAPHFILE
input graph file
-l LABELFILE, --labelfile LABELFILE
input label file
-o [OUTFILE], --outfile [OUTFILE]
output file (default=STDOUT)
--lamb [LAMB] lambda (default=1.0)
$ python main.py omni -h
usage: main.py omni [-h] -g GRAPHFILE -l LABELFILE [-o [OUTFILE]]
[--lamb [LAMB]]
optional arguments:
-h, --help show this help message and exit
-g GRAPHFILE, --graphfile GRAPHFILE
input graph file
-l LABELFILE, --labelfile LABELFILE
input label file
-o [OUTFILE], --outfile [OUTFILE]
output file (default=STDOUT)
--lamb [LAMB] lambda (default=1.0)
$ python main.py camlp -h
usage: main.py camlp [-h] -g GRAPHFILE -l LABELFILE [-o [OUTFILE]]
[--beta [BETA]] [--modulationfile [MODULATIONFILE]]
optional arguments:
-h, --help show this help message and exit
-g GRAPHFILE, --graphfile GRAPHFILE
input graph file
-l LABELFILE, --labelfile LABELFILE
input label file
-o [OUTFILE], --outfile [OUTFILE]
output file (default=STDOUT)
--beta [BETA] beta (default=0.1)
--modulationfile [MODULATIONFILE]
modulation matrix file (default: use identity)
In [1]: import numpy as np
In [2]: import networkx as nx
In [3]: from label_propagation import LGC
In [4]: from scipy.sparse import lil_matrix
In [5]: A = lil_matrix((4,4)) # adjacency matrix
In [6]: A[0,1]=1; A[1,0]=1
In [7]: A[1,2]=1; A[2,1]=1
In [8]: A[2,3]=1; A[3,2]=1
In [9]: A.todense() # simple undirected chain
Out[9]:
matrix([[ 0., 1., 0., 0.],
[ 1., 0., 1., 0.],
[ 0., 1., 0., 1.],
[ 0., 0., 1., 0.]])
In [10]: x_train = np.array([1,2])
In [11]: y_train = np.array([0,1]) # node 1 -> label 0, node 2 -> label 1
In [12]: clf = LGC(graph=A, alpha=0.99)
In [13]: clf.fit(x_train,y_train) # scikit-learn compatible
Out[13]:
LGC(alpha=0.99,
graph=<4x4 sparse matrix of type '<type 'numpy.float64'>'
with 6 stored elements in LInked List format>,
max_iter=30)
In [14]: x_test = np.array([0,3]) # to predict labels of node 0 and node 3
In [15]: clf.predict(x_test) # scikit-learn compatible
Out[15]: array([0, 1])