/SEMAFORE

SEMAFORE: SEmantic visualization with MAniFOld REgularization

Primary LanguagePerl

SEMAFORE: SEmantic visualization with MAniFOld REgularization
-------------------------------------------------------------


INTRODUCTION

This is an implementation of SEMAFORE - a semantic visualization method from Le & Lauw (AAAI 2014, JAIR 2016).

Usage:

	perl semafore.pl	--num_topics $num_topics
				--dim $dim
				--lambda $lambda
				--alpha $alpha,
				--beta $beta,
				--gamma $gamma,
				--basis_function $basis_function
				--EM_iter $EM_iter,
				--Quasi_iter $Quasi_iter
				--data $data
				--graph $graph
				--output_file $output_file

Arguments:
	$num_topics: number of topics
	$dim: number of dimensions (default 2)
	$lambda: regularization parameter (default 10)
	$alpha: Dirichlet parameter (default 0.01)
	$beta: covariance for Gaussian prior of topic coordinates (default 0.1*$num_docs)
	$gamma: covariance for Gaussian prior of document coordinates (default 0.1*$num_topics)
	$basis_function: 0 for Gaussian, 1 for Student-t (with 1 degree of freedom) (default 0)
	$EM_iter: number of iterations for EM (default 100)
	$Quasi_iter: maximum iterations of Quasi-Newton (default 10)
	$data: input data
	$graph: neighborhood graph
	$output_file: output file

Details:

+ This implementation needs Algorithm::LBFGS library for quasi-Newton method L-BFGS.
  The library can be downloaded at http://search.cpan.org/~laye/Algorithm-LBFGS-0.16/lib/Algorithm/LBFGS.pm.
  To install,
	
	  cpan Algorithm::LBFGS
	
+ If $graph is not passed, SEMAFORE turns into Probabilistic Latent Semantic Visualization (PLSV) (Iwata et al., 2008)
	
+ Example of input data with 3 documents (numbers are ids of words):
	0 1 1 2 2 3 4 4 5 6 7 7 8 8 8 8 9 10 11 12 13 13 14 14 15 15 15 16
	17 18 19 20 20 21 22 23 24 25 25 25 25 25 25 26 27 27 28 29
	30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 50 51 52 53 54 54 55 56 57 58
			
+ Neighborhood graph is represented by a matrix A: NxN. N is the number of documents and A[i,j]=A[j,i] is the weight of the edge ij.
  For example,
	0 1 0
	1 0 1
	0 1 0

			
HOW TO CITE
If you use SEMAFORE for your research, please cite:

	@article{le16a,
	    title={Semantic Visualization with Neighborhood Graph Regularization},
	    author={Le, Tuan MV and Lauw, Hady W},
	    journal={Journal of Artificial Intelligence Research},
	    volume={55},
	    pages={1091--1133},
	    year={2016}
	}
	
	The paper can be downloaded from: http://jair.org/papers/paper4983.html
	
Or

	@inproceedings{le2014manifold,
	    title={Manifold learning for jointly modeling topic and visualization},
	    author={Le, Tuan MV and Lauw, Hady W},
	    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
	    year={2014}
	}
	
	The paper can be downloaded from: http://www.hadylauw.com/publications/aaai14.pdf


BIBLIOGRAPHY

@inproceedings{iwata2008probabilistic,
    title={Probabilistic latent semantic visualization: topic model for visualizing documents},
    author={Iwata, Tomoharu and Yamada, Takeshi and Ueda, Naonori},
    booktitle={Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining},
    pages={363--371},
    year={2008},
    organization={ACM}
}