EdoardoPona/predicting-inductive-biases-RL

Fork of the code for https://openreview.net/forum?id=mNtmhaDkAr, extended to study inductive biases in RL

Python

Issues

plot trl runs
#26 opened a year ago by EdoardoPona
0
reward inconsistency with trl controlled sentiment generation example
#25 opened a year ago by EdoardoPona
2
default number of steps differs drastically between rl4lm and trl
#27 opened a year ago by EdoardoPona
1
Rewards without confidence
#22 opened a year ago by EdoardoPona
1
Loading arbitrary AutoModels in the Lovering code to run mdl
#23 opened a year ago by EdoardoPona
1
remove duplication in lovering main.py and rl_main.py
#24 opened a year ago by EdoardoPona
0
Sentiment task: finding best hyperparameters.
#17 opened a year ago by diogo-cruz
2
Implementing and running different LLM tasks
#19 opened a year ago by diogo-cruz
1
Implement GPT-2 finetuning for sentiment generation
#13 opened a year ago by diogo-cruz
4
Warm up GPT2 on review data
#21 opened a year ago by diogo-cruz
1
Understanding MDL calculations
#16 opened a year ago by diogo-cruz
1
Improve plots
#20 opened a year ago by diogo-cruz
0
Implement sentiment reward
#12 opened a year ago by diogo-cruz
4
batched rewards
#18 opened a year ago by EdoardoPona
0
Implement sentiment dataset
#11 opened a year ago by diogo-cruz
5
Fix issue with generating multi-token
#9 opened a year ago by diogo-cruz
1
Clean up toy+RL setup
#14 opened a year ago by diogo-cruz
0
Results collection for RL models
#6 opened a year ago by EdoardoPona
3
run evaluations on test set with feature combination subsets
#10 opened a year ago by EdoardoPona
1
Implement Alex's reward and test it
#8 opened a year ago by diogo-cruz
1
Load custom transformer in RL4LM
#4 opened a year ago by EdoardoPona
2
Link lovering data with rl4lm dataset class
#5 opened a year ago by EdoardoPona
1