YifeiZhou02/ArCHer

Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"

Python

Issues

After I complete the detective game task, how can I reset the environment for a new task?
#16 opened 2 months ago by SakuraXiaMF
4
100 Webshop?
#17 opened 3 months ago by xiaxiaxiatengxi
5
WebShop Experiment
#6 opened 9 months ago by symoon11
7
Where is the detective game?
#15 opened 3 months ago by xiaxiaxiatengxi
1
As training progresses, sampling time becomes longer and longer
#13 opened 4 months ago by destinyyzy
3
Model Training Unstable (webshop, gpt2)
#14 opened 4 months ago by RobertXWL
1
Running webshop in a distributed way
#12 opened 4 months ago by DZ9
3
In this work, was an SFT model trained beforehand for the webshop training?
#4 opened 9 months ago by xiaxiaxiatengxi
2
Will there be a useBaseline update?
#11 opened 5 months ago by cuts2k
2
Speed up the training with Mistral 7B
#10 opened 6 months ago by Bohemianc
5
Issues with loading in `lm_optimizer_state_dict`
#9 opened 7 months ago by starship006
2
Some Questions about tokenizer length
#8 opened 7 months ago by xiaxiaxiatengxi
1
Maybe the QV network is not very stable?
#7 opened 7 months ago by xiaxiaxiatengxi
1
Training llama2-7B on webshop gives progressively worse results
#5 opened 9 months ago by xiaxiaxiatengxi
1
Hello, I ran into some problems when running the webshop environment
#3 opened 9 months ago by xiaxiaxiatengxi
2
Can this work be used directly with LLama2?
#2 opened 10 months ago by xiaxiaxiatengxi
1
Pre-trained weights for reproducing results
#1 opened 10 months ago by sufengniu
1