MB_TREC2015

Microblog TREC 2015

Primary language: Python. License: Apache License 2.0.

Instructions

This repository contains the code for Microblog TREC 2015.

How to build

  • Step.0 Install the prerequisite packages.
	Java 1.8.0_40
	Python 2.7.9
	Weka 3.6.12
	json-lib-2.1
	numpy 1.6.2
	scipy 0.15.1
	nltk 3.0.2
	word2vec
	gensim
	Py4J
	pandas 0.16.2
  • Step.1 Use word2vec.py to train a model on a corpus downloaded from Wikipedia. The file is very large, so you need to download it from Wikipedia yourself. Then run "python process_wiki.py dir" on the command line to process the corpus, where dir is the directory in which the English Wikipedia corpus is saved. Next, run "python train_word2vec_model.py dir" to get the model file, where dir is the directory of the processed Wikipedia corpus. The final result file will be named

  • Step.2

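The version requirements in Step.0 can be checked before building. The following is a minimal sketch, not part of the repository: it assumes the Python packages are importable under their usual module names and expose a `__version__` attribute, and the helper names `installed_version` and `report` are hypothetical. It only covers the Python packages with pinned versions, not Java, Weka, or json-lib.

```python
import importlib

# Versions copied from the Step.0 list; the module names are assumed
# to match the package names.
EXPECTED = {
    "numpy": "1.6.2",
    "scipy": "0.15.1",
    "nltk": "3.0.2",
    "pandas": "0.16.2",
}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def report(expected=EXPECTED):
    """Return one human-readable line per package, sorted by name."""
    lines = []
    for name in sorted(expected):
        found = installed_version(name)
        status = "missing" if found is None else found
        lines.append("%s: expected %s, found %s" % (name, expected[name], status))
    return lines
```

Printing each line returned by `report()` gives a quick overview of which packages are missing or differ from the listed versions.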
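The two commands in Step.1 can be chained in a small driver script. This is a sketch under stated assumptions rather than part of the repository: it assumes both scripts sit in the current working directory and take a single directory argument, as described in Step.1. The function names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess


def build_commands(raw_wiki_dir, processed_wiki_dir, python_exe="python"):
    """Return the two Step.1 command lines as argument lists."""
    return [
        # First command: process the downloaded English Wikipedia corpus.
        [python_exe, "process_wiki.py", raw_wiki_dir],
        # Second command: train the word2vec model on the processed corpus.
        [python_exe, "train_word2vec_model.py", processed_wiki_dir],
    ]


def run_pipeline(raw_wiki_dir, processed_wiki_dir, python_exe="python"):
    """Run both commands in order; a non-zero exit raises CalledProcessError."""
    for command in build_commands(raw_wiki_dir, processed_wiki_dir, python_exe):
        subprocess.check_call(command)
```

Calling `run_pipeline("wiki_raw", "wiki_processed")` from the repository root would run both commands in sequence and stop at the first failure. The directory names here are placeholders.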