Sea-Snell/Implicit-Language-Q-Learning
Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"
Python · MIT
Issues
- 2
Question about stds reported in the paper
#9 opened by DT6A - 1
Question about the max steps
#8 opened by maxiao94 - 6
Question on indexing in Q loss
#7 opened by DT6A - 2
Error Running Monte Carlo Policy for Wordle
#6 opened by sarahlu0 - 4
A question on beta hyperparameter
#5 opened by DT6A - 1
- 1
Could not find the euclidean distance based reward cache for the visual dialogue task
#4 opened by gaoqitong - 3