LARA-1

Latent Aspect Rating Analysis on Review Text Data

Primary language: Python. License: MIT.

This is a Python implementation of the following research paper by Wang et al.:

Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach, by Hongning Wang, Yue Lu, and Chengxiang Zhai. The 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'2010), pp. 783-792, 2010.

This implementation builds on the earlier implementation released by the author (http://www.cs.virginia.edu/~hw5x/Codes/LARA.zip).

Latent Aspect Rating Analysis (LARA) analyzes the opinions expressed about an entity in an online review at the level of topical aspects. Its goal is to discover each individual reviewer's latent opinion on each aspect, as well as the relative emphasis the reviewer places on different aspects when forming the overall judgment of the entity. It uses probabilistic rating regression to solve this problem.
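To make the idea concrete, here is a minimal sketch of how an overall rating can be composed from latent aspect ratings and aspect weights. It is an illustration under stated assumptions, not the code in this repository: the function names are hypothetical, the aspect rating is taken to be a plain weighted sum of word frequencies, and a softmax is assumed as one way to keep the aspect weights positive and summing to 1.

```python
import math


def softmax(xs):
    """Map unconstrained scores to positive weights that sum to 1 (an assumption of this sketch)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aspect_ratings(word_freqs, betas):
    """Latent rating of each aspect: sum over words of (word sentiment weight * word frequency)."""
    return [
        sum(b * w for b, w in zip(beta_i, w_i))
        for beta_i, w_i in zip(betas, word_freqs)
    ]


def predict_overall(word_freqs, betas, alpha_hat):
    """Overall rating as the aspect-weight-averaged sum of the latent aspect ratings."""
    weights = softmax(alpha_hat)
    ratings = aspect_ratings(word_freqs, betas)
    return sum(a * s for a, s in zip(weights, ratings))


# Toy data: 2 aspects, 2 vocabulary words per aspect (all numbers are made up).
word_freqs = [[0.5, 0.5], [1.0, 0.0]]   # normalized word frequencies per aspect
betas = [[4.0, 2.0], [1.0, 3.0]]        # word sentiment weights per aspect
alpha_hat = [0.0, 0.0]                  # equal emphasis on both aspects

overall = predict_overall(word_freqs, betas, alpha_hat)
```

In the model itself both the aspect ratings and the aspect weights are latent and are inferred from the observed overall rating; the sketch only shows the forward direction, from parameters to a predicted rating.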

Organization of the code

There are two classes, Sentence and Review, each defined in its own Python file. They act as data containers for a single sentence and a single review, respectively.
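As a rough picture of what such containers might look like, the sketch below uses dataclasses. The field names (word_counts, aspect, review_id, overall_rating) and the helper method are hypothetical and chosen for illustration; the actual attributes in this repository may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Sentence:
    # Word -> count within this sentence (hypothetical field).
    word_counts: Dict[str, int] = field(default_factory=dict)
    # Index of the aspect assigned to the sentence; -1 means "not assigned yet".
    aspect: int = -1


@dataclass
class Review:
    review_id: str
    overall_rating: float
    sentences: List[Sentence] = field(default_factory=list)

    def words_for_aspect(self, aspect: int) -> Dict[str, int]:
        """Aggregate word counts over all sentences assigned to one aspect."""
        totals: Dict[str, int] = {}
        for sentence in self.sentences:
            if sentence.aspect == aspect:
                for word, count in sentence.word_counts.items():
                    totals[word] = totals.get(word, 0) + count
        return totals


review = Review(
    review_id="r1",
    overall_rating=4.0,
    sentences=[
        Sentence({"room": 1, "clean": 1}, aspect=0),
        Sentence({"room": 2}, aspect=0),
        Sentence({"price": 1}, aspect=1),
    ],
)
room_words = review.words_for_aspect(0)
```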

ReadData contains all the functions for processing the reviews. The BootStrap class contains the bootstrapping algorithm explained in the paper. The LRR class contains the implementation of the Rating Regression algorithm described in the paper.
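The bootstrapping step can be sketched roughly as follows: start from a few seed keywords per aspect, assign each sentence to the aspect whose keywords it matches most, score words by a chi-square style association with each aspect, add the best-scoring words to the keyword lists, and repeat. This is a simplified illustration, not the BootStrap class itself. The function names, the top_k and iterations parameters, and the rule that only positively associated words are added are assumptions of this sketch.

```python
def assign_aspects(sentences, keywords):
    """Label each sentence (a list of words) with the aspect it matches most; -1 if no match."""
    labels = []
    for words in sentences:
        hits = [sum(1 for w in words if w in kw) for kw in keywords]
        best = max(range(len(keywords)), key=lambda i: hits[i])
        labels.append(best if hits[best] > 0 else -1)
    return labels


def chi_square(word, aspect, sentences, labels):
    """Chi-square style association between a word and an aspect (0 if not positively associated)."""
    c1 = c2 = c3 = c4 = 0
    for words, label in zip(sentences, labels):
        n = words.count(word)
        if label == aspect:
            c1 += n                      # occurrences inside the aspect
            c3 += 1 if n == 0 else 0     # aspect sentences without the word
        else:
            c2 += n                      # occurrences outside the aspect
            c4 += 1 if n == 0 else 0     # other sentences without the word
    if c1 * c4 <= c2 * c3:               # keep only positive association (sketch assumption)
        return 0.0
    total = sum(len(words) for words in sentences)
    denom = (c1 + c3) * (c2 + c4) * (c1 + c2) * (c3 + c4)
    return total * (c1 * c4 - c2 * c3) ** 2 / denom if denom else 0.0


def bootstrap(sentences, seeds, top_k=1, iterations=2):
    """Grow the keyword set of each aspect, then return keywords and final sentence labels."""
    keywords = [set(s) for s in seeds]
    vocab = sorted({w for words in sentences for w in words})
    for _ in range(iterations):
        labels = assign_aspects(sentences, keywords)
        for aspect, kw in enumerate(keywords):
            scores = {w: chi_square(w, aspect, sentences, labels) for w in vocab if w not in kw}
            ranked = sorted((w for w in scores if scores[w] > 0), key=scores.get, reverse=True)
            kw.update(ranked[:top_k])
    return keywords, assign_aspects(sentences, keywords)


sentences = [
    ["room", "clean", "bed"],
    ["bed", "comfortable"],
    ["price", "cheap", "deal"],
    ["deal", "good"],
]
keywords, labels = bootstrap(sentences, seeds=[["room"], ["price"]])
```

With the toy data above, the sentences that contain no seed keyword start out unassigned and are picked up once "bed" and "deal" have been added to the keyword sets.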

  • hotelReviews directory is where the JSON-encoded review files go, both training and testing data
  • settings directory contains the configuration files for the model
  • modelData will contain the files generated by the model

How to run

Make sure nltk and scipy are installed, then run:

python3 main.py

The results are printed to stdout.