For the accompanying Overleaf file, see https://www.overleaf.com/13236006mvbvdwgpwnxq
- Improved estimator for K_{n-1}
- For positive alpha, epsilon = alpha. Accumulate discarded probability
- Synthetic data with Gaussian emissions
- Compare (on synthetic Gaussian data) with an NRM model using Alex Tank's VI
- Other estimates of q^pr
- NRM model
- MC estimate
- 1st-order Taylor expansions
- 2nd-order Taylor expansions
- Case of negative alpha (what does this mean?)
- Tune alpha using CV
- Instantiate clusters randomly
- Metrics for coclustering matrix
- predictive log-likelihood metric
- Expectation propagation
- Coclustering and Gaussian plots
- Earthquake data
- Kaggle movies
- Malicious activities
- Train on bigger Amazon data on ziz
- Preallocate matrices?
- (Possibly) faster log-likelihood for the Dirichlet-Multinomial distribution?
- Gibbs sampler for partition- and graph-valued data
- Gibbs sampler for mixture model
Obtain the training data as follows:
- Download http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/reviews_Musical_Instruments_5.json.gz
- Run
gunzip reviews_Musical_Instruments_5.json.gz
- Run
sed '1s/^/[/;$!s/$/,/;$s/$/]/' reviews_Musical_Instruments_5.json > reviews.json
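To see what the sed step does, here is a minimal sketch on a tiny made-up file. The names sample.json and sample_array.json and the records themselves are hypothetical, not part of the project. It assumes the input has one JSON object per line, and shows the script wrapping those lines into a single JSON array.

```shell
# Hypothetical three-record file, one JSON object per line
printf '%s\n' '{"a": 1}' '{"a": 2}' '{"a": 3}' > sample.json

# Same sed script as above:
#   1s/^/[/    put "[" in front of the first line
#   $!s/$/,/   append "," to every line except the last
#   $s/$/]/    append "]" to the last line
sed '1s/^/[/;$!s/$/,/;$s/$/]/' sample.json > sample_array.json

cat sample_array.json
```

The result keeps one record per line, but the first line now starts with "[", the last ends with "]", and the lines in between end with ",", so the whole file reads as one JSON array.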
Add the following packages:
- JSON
- ProgressMeter
- Distributions
For the TextAnalysis package, use
Pkg.checkout("TextAnalysis")
rather than the conventional Pkg.add,
so that you get the master branch.