BetaNeutralToTheLeft.jl

Code to accompany the UAI paper 'Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks'.

NTL.jl

For the accompanying Overleaf file, see https://www.overleaf.com/13236006mvbvdwgpwnxq

TODO

Priority

  • Improved estimator for K_{n-1}
  • For positive alpha, epsilon = alpha. Accumulate discarded probability
  • Synthetic data with Gaussian emissions
  • Compare (synthetic Gaussian data) with an NRM model using Alex Tank's VI
  • Other estimates of q^pr
    • NRM model
    • MC estimate
    • 1st order Taylor expansions
    • 2nd order Taylor expansions
  • Case of negative alpha (what does this mean?)
  • Tune alpha using CV
  • Instantiate clusters randomly
  • Metrics for coclustering matrix
  • Predictive log-likelihood metric
  • Expectation propagation
  • Coclustering and Gaussian plots

Data sources

  • Earthquake data
  • Kaggle movies
  • Malicious activities

Secondary

  • Train on bigger Amazon data on ziz
  • Preallocate matrices?
  • (Possibly) faster log-likelihood of the Dirichlet-Multinomial distribution?

Ben

  • Gibbs sampler for partition- and graph-valued data
  • Gibbs sampler for mixture model

Data

Obtain the training data as follows:

  1. Download http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/reviews_Musical_Instruments_5.json.gz
  2. Run gunzip reviews_Musical_Instruments_5.json.gz
  3. Run sed '1s/^/[/;$!s/$/,/;$s/$/]/' reviews_Musical_Instruments_5.json > reviews.json
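Step 3 wraps the downloaded file, which the sed command implies holds one JSON object per line, into a single JSON array: it prepends [ to the first line, appends , to every line but the last, and appends ] to the last line. Where sed is unavailable, the same transformation can be sketched in plain Julia. The function name jsonlines_to_array is hypothetical and not part of this repository; the sketch assumes Julia 1.x and an input small enough to read into memory.

```julia
# Hypothetical helper (not part of this repository): wrap a file with one
# JSON object per line into a single JSON array, mirroring the sed command.
function jsonlines_to_array(src::AbstractString, dst::AbstractString)
    lines = readlines(src)          # trailing newlines are stripped
    n = length(lines)
    open(dst, "w") do io
        for (i, line) in enumerate(lines)
            prefix = i == 1 ? "[" : ""    # open the array on the first line
            suffix = i == n ? "]" : ","   # close it on the last, else separate
            println(io, prefix, line, suffix)
        end
    end
    return dst
end
```

It would be called as jsonlines_to_array("reviews_Musical_Instruments_5.json", "reviews.json").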

Julia packages

Add the following packages:

  • JSON
  • ProgressMeter
  • Distributions

For the TextAnalysis package, use:

Pkg.checkout("TextAnalysis")

rather than the conventional Pkg.add, to get the master branch.
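Pkg.checkout was removed from the package manager in Julia 1.0. A minimal sketch of the equivalent setup on Julia 1.x, assuming the package names above are unchanged in the General registry:

```julia
using Pkg

# Registered releases of the packages listed above.
Pkg.add(["JSON", "ProgressMeter", "Distributions"])

# Track the master branch of TextAnalysis instead of a tagged release
# (the Julia 1.x counterpart of Pkg.checkout("TextAnalysis")).
Pkg.add(Pkg.PackageSpec(name="TextAnalysis", rev="master"))
```

Whether the current master of TextAnalysis is still compatible with this code is untested.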