YifeiZhou02/ArCHer
Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
Python
Issues
- 4
Please tell me, after I complete the detective game task, how can I reset the new task?
#16 opened by SakuraXiaMF - 5
100 Webshop????
#17 opened by xiaxiaxiatengxi - 7
WebShop Experiment
#6 opened by symoon11 - 1
Where is the detective game?
#15 opened by xiaxiaxiatengxi - 3
- 1
Model Training Unstable (webshop, gpt2)
#14 opened by RobertXWL - 3
Running webshop in a distributional way
#12 opened by DZ9 - 2
In this work, was an SFT model trained beforehand for the webshop training?
#4 opened by xiaxiaxiatengxi - 2
Will there be a useBaseline update?
#11 opened by cuts2k - 5
Speed up the training with Mistral 7B
#10 opened by Bohemianc - 2
- 1
Some questions about tokenizer length
#8 opened by xiaxiaxiatengxi - 1
Maybe the QV network is not very stable?
#7 opened by xiaxiaxiatengxi - 1
Training llama2-7B on webshop gives worse and worse results
#5 opened by xiaxiaxiatengxi - 2
Hello, I ran into some problems when running the webshop environment
#3 opened by xiaxiaxiatengxi - 1
Can this work be used directly with LLama2?
#2 opened by xiaxiaxiatengxi - 1