Issues
Support RLOO/GRPO/REINFORCE?
#68 opened by fzyzcjy - 1
[ray] latest ray compatibility
#46 opened by eric-haibin-lin - 6
Do we have plans for data packing?
#53 opened by YixinSong-e - 2
Missing doc about "Algorithm Baselines"
#66 opened by fzyzcjy - 1
package conflict
#62 opened by hljjjmssyh - 1
Are optimizer states reloaded or offloaded during the conversion from actor training to actor rollout?
#42 opened by G1aZzz - 3
Question about recomputation in actor module
#51 opened by 0oshowero0 - 4
Hangs during vllm rollout, no error message
#12 opened by Vamix - 2
Docker image support
#8 opened by SolenoidWGT - 1
Does this framework support full parameter PPO tuning for the Qwen2.5-14B model on 8-A100 GPUs with 80GB memory each?
#40 opened by hljjjmssyh - 1
enable_gradient_checkpointing is not working
#26 opened by Vamix - 2
Why create_colocated_worker_cls and spawn
#29 opened by eelxpeng - 1
Unexpected Increase in Rollout Time After Reducing num_hidden_layers in deepseek-llm-7b-chat Model
#24 opened by metaqiang - 4
Why is the `magatron_v4.patch` needed?
#14 opened by hxdtest - 5
[Roadmap] veRL Development Roadmap
#22 opened by PeterSH6 - 4
Question about performance testing of data and parameter sharding
#16 opened by metaqiang - 14
Are any tools provided for performance debugging?
#11 opened by metaqiang - 0
KeyError: 'raw_prompt'
#13 opened by YixinSong-e - 2
Intermittent ray.exceptions.ActorDiedError when launching the training script
#10 opened by metaqiang - 1
Can I run ppo in llama3.1-70B-instruct?
#6 opened by cingtiye