Drawbacks of conditional BRUNO compared to RNN BRUNO (not a bug)
christabella opened this issue · 1 comments
Hello, thank you for open-sourcing the code! I have a few high-level questions about the models:
1. Why is validation done only for RNN BRUNO and not for conditional BRUNO?
In the original RNN version, there is validation done during training:
Lines 201 to 208 in c631d3d
Whereas in the conditional version, eval_loss is never used:
bruno/config_conditional/train.py
Lines 79 to 84 in c631d3d
2. Is conditional BRUNO not maximizing joint (conditional) log likelihood?
BRUNO is clearly maximizing the joint log likelihood:
However, conditional BRUNO does not seem to be maximizing the joint conditional log likelihood... or is it?
3. "Conditional de Finetti" is not guaranteed
Do you think this is a problem, or not really, since it works in practice nonetheless?
Thank you very much!
Hello! Thank you for the questions!
Q1: the conditional code was hastily written, so it might be missing some pieces. I also didn't look much at the validation scores as far as I remember.
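For anyone who wants validation scores in the conditional setup, here is a minimal, framework-agnostic sketch of evaluating periodically during training. It is not the repository's code: train_with_validation, train_step, eval_step and validate_every are hypothetical names, and the two step functions stand in for whatever runs the training op and the eval_loss op in the real script.

```python
def train_with_validation(train_step, eval_step, train_batches, valid_batches,
                          validate_every=2):
    """Train on each batch and periodically average the validation loss.

    train_step and eval_step are hypothetical callables that take a batch
    and return a scalar loss. valid_batches must be re-iterable (e.g. a list),
    because it is traversed once per validation pass.
    """
    history = []
    for iteration, batch in enumerate(train_batches, start=1):
        train_loss = train_step(batch)
        if iteration % validate_every == 0:
            # Average the evaluation loss over the whole validation set.
            valid_losses = [eval_step(b) for b in valid_batches]
            mean_valid = sum(valid_losses) / len(valid_losses)
            history.append((iteration, train_loss, mean_valid))
    return history


if __name__ == "__main__":
    # Toy stand-ins for the real training and evaluation steps.
    log = train_with_validation(
        train_step=lambda b: float(b),
        eval_step=lambda b: 2.0 * b,
        train_batches=[1, 2, 3, 4],
        valid_batches=[1, 3],
    )
    for iteration, train_loss, valid_loss in log:
        print(iteration, train_loss, valid_loss)
```

In the actual script, the two lambdas would be replaced by calls that run the corresponding ops on a training batch and a validation batch.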
Q2: I think it does maximize the conditional joint log-likelihood.
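For reference, this is the chain-rule identity such a claim rests on. The symbol h is an assumption introduced here for the conditioning information (for example a class label); it is not notation taken from the code.

```latex
\log p(x_1, \dots, x_n \mid h) = \sum_{i=1}^{n} \log p(x_i \mid x_{1:i-1}, h)
```

So if the training loss sums the one-step-ahead conditional log-probabilities over a sequence, with every term conditioned on the same h, then maximizing it is the same as maximizing the joint conditional log likelihood. Whether the conditional code computes exactly these terms would need to be checked against the training script.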
Q3: From a theory point of view, it's probably unsatisfactory that there is no proof. I would be happier if there were one. Though if the goal is to make a working model, as with all deep learning, then I guess it's fine.