voidful/TextRL

Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

Python · MIT

Issues

Fix Reward Calculation in `example/2022-12-10-textrl-elon-musk.ipynb`
#27 opened 8 months ago by Alanhsiu
1
unfreeze_layer_from_past parameter
#25 opened a year ago by JhonDan1999
4
Problems in the inference process
#26 opened a year ago by ignorejjj
0
ValueError: Expected parameter logits
#21 opened a year ago by josutk
5
Reward policy agent environment is not training with Finetuned model
#23 opened a year ago by harshs21
1
Does the package support automatic multi-gpu?
#24 opened a year ago by margarita-aicyd
2
Update interval
#17 opened a year ago by debjitpaul
3
Documentation on Methodology
#18 opened a year ago by flyingabove
1
Support for other PFRL Algorithms
#19 opened a year ago by ansharora7
2
Text generation after period/full-stop (".")
#20 opened 2 years ago by ansharora7
0
Could you give some examples to run the code?
#5 opened 2 years ago by SusannaWull
3
Support for AutoModelForSeq2SeqLM
#16 opened 2 years ago by janpf
2
Are there any examples for T5 or Bart? Why T5 and bart give the same output before/after training?
#15 opened 2 years ago by YuXiangLin1234
2
i get error when i use elon example
#12 opened 2 years ago by wac81
6
Text generation models generating repeated/duplicate text/sentences.
#13 opened 2 years ago by tontan1998
3
Backward compatibility
#10 opened 2 years ago by Keith-Hon
2
About the compare_sample
#11 opened 2 years ago by jkwang93
1
AssertionError
#9 opened 2 years ago by Ulov888
3
It needs a license
#8 opened 2 years ago by cooljoseph1
1
AttributeError: module 'numpy' has no attribute '_no_nep50_warning'
#7 opened 2 years ago by GrahamboJangles
1
AttributeError: 'MyRLEnv' object has no attribute 'num_envs'
#6 opened 2 years ago by lucascassiano
2
Errors may occur after changing the batchsize and update interval of the agent
#4 opened 2 years ago by rongaoli
5
'Model' object has no attribute 'lm_head'
#3 opened 3 years ago by Mousumi44
2