kgori/sigfit

get_loglik

Opened this issue · 3 comments

Hi there! Thanks for such a great tool. I was wondering whether the get_loglik function has been fully implemented? When I try to use it in my code I get the error:
"Error in get_loglik(mcmc_samples) :
could not find function "get_loglik"

I tried copying in the function from your utils.R code and using that, but I get the error "Error in UseMethod("extract") :
no applicable method for 'extract' applied to an object of class "stanfit""

No worries if this utility to calculate the log-likelihood isn't ready yet, I could write my own, but I just wanted to check before I did! Thanks so much :)

kgori commented

Hi Annabel,
If I remember right, "get_loglik" was part of an experiment we were doing to see if we could use LOOIC to help us choose the number of signatures to extract. This didn't seem to work very well, so "get_loglik" was abandoned. Thanks for reminding me, though. I should take another look at this, either to finish it off properly or remove it.
Kevin

Thanks so much, Kevin, that would be great! We were going to use log-likelihoods to compare different signature models as an additional measure of fit beyond cosine similarity. Based on your experience that using likelihoods didn't work very well, do you have any specific warnings or behaviors in mind where using likelihoods instead of cosine similarity caused problems for you? Thanks again! :)

Hi Annabel,
The experience we had when trying LOOIC, BIC and AIC was that these criteria didn't properly penalise higher model complexity, and usually favoured the largest possible number of signatures, even if this was much higher than the actual number of signatures in the data. In our experience, the inflexion point of the cosine similarity curve can give an indication of what number of signatures is best, but often one needs to look at the sets of extracted signatures manually and find the set with lowest redundancy (i.e. without overlapping or duplicated signatures).
Best,
Adrian
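
For anyone wanting to try the two checks described above, here is a minimal sketch in plain Python. It is not sigfit code and does not use sigfit's API. The function names, the second-difference rule for locating the inflexion point, and all the numbers are assumptions made for illustration. Manual inspection of the extracted signatures is still needed, as noted above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def elbow_index(scores):
    """Index of the sharpest bend in a curve of scores.

    Assumed heuristic: pick the interior point with the most negative
    second difference, i.e. where the gain in similarity drops fastest.
    """
    if len(scores) < 3:
        raise ValueError("need at least three points to locate a bend")
    second_diff = [
        scores[i + 1] - 2 * scores[i] + scores[i - 1]
        for i in range(1, len(scores) - 1)
    ]
    return 1 + min(range(len(second_diff)), key=second_diff.__getitem__)


def max_pairwise_similarity(signatures):
    """Highest cosine similarity between any two signatures in a set.

    A value close to 1 suggests two signatures overlap or are duplicated.
    """
    highest = 0.0
    for i in range(len(signatures)):
        for j in range(i + 1, len(signatures)):
            highest = max(highest, cosine_similarity(signatures[i], signatures[j]))
    return highest


# Made-up reconstruction similarities for models with 1 to 6 signatures.
n_signatures = [1, 2, 3, 4, 5, 6]
similarity = [0.80, 0.90, 0.97, 0.975, 0.978, 0.980]
suggested = n_signatures[elbow_index(similarity)]

# Made-up signature set in which the first and third are nearly identical.
example_signatures = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.1, 0.0]]
redundancy = max_pairwise_similarity(example_signatures)
```

With these made-up numbers the curve flattens after three signatures, so `suggested` is 3, and `redundancy` is close to 1 because two of the example signatures are almost the same vector.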