Sea-Snell/Implicit-Language-Q-Learning

Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"

Python · MIT

Issues

Question about stds reported in the paper
#9 opened a year ago by DT6A
2
Question about the max steps
#8 opened a year ago by maxiao94
1
Question on indexing in Q loss
#7 opened a year ago by DT6A
6
Error Running Monte Carlo Policy for Wordle
#6 opened a year ago by sarahlu0
2
A question on beta hyperparameter
#5 opened a year ago by DT6A
4
Is it possible to release the code based on jax
#3 opened a year ago by sglucas
1
Could not find the euclidean distance based reward cache for the visual dialogue task
#4 opened a year ago by gaoqitong
1
There is no token_reward in wordle train_bc file
#2 opened 2 years ago by ElegantLin
3