/Chinese-LDA-v1.0

This is Tianyu Hong's first version of a program that uses LDA (Latent Dirichlet Allocation) to predict topics for Chinese messages.

Primary Language: Java

Chinese LDA

/** Copyright by hty

Feel free to contact the author at the following address if you find any problems in the package: hongty106@gmail.com * */

Brief Introduction

  1. This is Tianyu Hong's first version of a program that uses LDA (Latent Dirichlet Allocation) to predict topics for Chinese messages. The training data set is in the file named "training data.xml", which holds all the crawled data. The data was crawled from several online healthcare communities and is in Chinese. This version includes only some data about heart disease.

  2. To use the program, you need the "WindowBuilder" plugin installed in Eclipse.

  3. I use the "Fudan NLP" package to segment each message into words and remove all stopwords.

  4. I use Gibbs sampling for inference. The related methods are in the package "cn.edu.zju.lda".
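
For readers unfamiliar with Gibbs sampling for LDA, here is a minimal sketch of a collapsed Gibbs sampler. It is not the code from "cn.edu.zju.lda". The class name GibbsLdaSketch, its fields, the toy documents, and the hyperparameter values are all assumptions made for illustration. It also assumes the messages have already been segmented and mapped to integer word ids.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration class; NOT part of the cn.edu.zju.lda package. */
class GibbsLdaSketch {
    final int K, V;            // number of topics, vocabulary size
    final double alpha, beta;  // symmetric Dirichlet hyperparameters (assumed values)
    final int[][] docs;        // docs[d][i] = word id of token i in document d
    final int[][] z;           // z[d][i] = topic currently assigned to that token
    final int[][] nDocTopic;   // nDocTopic[d][k] = tokens in doc d assigned to topic k
    final int[][] nTopicWord;  // nTopicWord[k][w] = times word w is assigned to topic k
    final int[] nTopic;        // nTopic[k] = total tokens assigned to topic k
    final Random rng;

    GibbsLdaSketch(int[][] docs, int V, int K, double alpha, double beta, long seed) {
        this.docs = docs; this.V = V; this.K = K; this.alpha = alpha; this.beta = beta;
        this.rng = new Random(seed);
        z = new int[docs.length][];
        nDocTopic = new int[docs.length][K];
        nTopicWord = new int[K][V];
        nTopic = new int[K];
        // Start from a random topic assignment for every token.
        for (int d = 0; d < docs.length; d++) {
            z[d] = new int[docs[d].length];
            for (int i = 0; i < docs[d].length; i++) {
                int k = rng.nextInt(K);
                z[d][i] = k;
                nDocTopic[d][k]++; nTopicWord[k][docs[d][i]]++; nTopic[k]++;
            }
        }
    }

    /** One full sweep: resample the topic of every token in every document. */
    void sweep() {
        double[] p = new double[K];
        for (int d = 0; d < docs.length; d++) {
            for (int i = 0; i < docs[d].length; i++) {
                int w = docs[d][i], old = z[d][i];
                // Remove the token's current assignment from the counts.
                nDocTopic[d][old]--; nTopicWord[old][w]--; nTopic[old]--;
                // Unnormalised conditional probability of each topic for this token.
                double sum = 0.0;
                for (int k = 0; k < K; k++) {
                    p[k] = (nDocTopic[d][k] + alpha)
                         * (nTopicWord[k][w] + beta) / (nTopic[k] + V * beta);
                    sum += p[k];
                }
                // Draw a topic in proportion to p.
                double u = rng.nextDouble() * sum;
                int k = 0;
                double acc = p[0];
                while (u >= acc && k < K - 1) { k++; acc += p[k]; }
                // Record the new assignment.
                z[d][i] = k;
                nDocTopic[d][k]++; nTopicWord[k][w]++; nTopic[k]++;
            }
        }
    }

    /** Estimated topic proportions of document d from the current counts. */
    double[] theta(int d) {
        double[] t = new double[K];
        for (int k = 0; k < K; k++) {
            t[k] = (nDocTopic[d][k] + alpha) / (docs[d].length + K * alpha);
        }
        return t;
    }

    public static void main(String[] args) {
        // Toy corpus: word ids 0-1 and 2-3 tend to occur in different documents.
        int[][] docs = { {0, 1, 0, 1, 0}, {2, 3, 2, 3, 3}, {0, 1, 1, 0}, {3, 2, 2, 3} };
        GibbsLdaSketch lda = new GibbsLdaSketch(docs, 4, 2, 0.5, 0.1, 42L);
        for (int it = 0; it < 200; it++) lda.sweep();
        for (int d = 0; d < docs.length; d++) {
            System.out.println("doc " + d + " theta = " + Arrays.toString(lda.theta(d)));
        }
    }
}
```

In each step the sampler removes one token's topic from the counts, computes how well every topic fits that token given all other assignments, and draws a new topic from those weights. A real run would use many more sweeps and average over several samples.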

Attention: The UI may not look polished. Sorry about that; I'm working on the second version and hope it will be more to your liking.