microsoft/DeepSpeedExamples

ZeRO-3 with the hybrid engine enabled does not work for Llama 2; how can this be solved?

terence1023 opened this issue · 3 comments

In my experiments, I found that with ZeRO-3 and the hybrid engine enabled, the Actor generates repeated tokens or nothing at all during stage 3 (PPO) training. Here is an example:

image

I also ran some other experiments:
Experiment 1: ZeRO-2 with the hybrid engine enabled, which works fine.

Experiment 2: ZeRO-3 with the hybrid engine disabled, which works fine.

Experiment 3: the OPT model with ZeRO-3 and the hybrid engine enabled, which works fine.
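
For anyone trying to reproduce this, here is a minimal sketch of the experiment matrix described above. It only builds plain dictionaries shaped like the `zero_optimization` and `hybrid_engine` sections of a DeepSpeed config; `make_ds_config`, `REPORTED`, and `failing_cases` are hypothetical names used for illustration and are not part of DeepSpeed or DeepSpeed-Chat. The real training script sets many more fields, so treat this as a summary of the settings rather than a working training config.

```python
def make_ds_config(zero_stage: int, enable_hybrid_engine: bool) -> dict:
    """Build a minimal DeepSpeed-style config fragment for one experiment.

    Hypothetical helper: only the two settings discussed in this issue
    are included; a real config needs batch sizes, precision, etc.
    """
    return {
        "zero_optimization": {"stage": zero_stage},
        "hybrid_engine": {"enabled": enable_hybrid_engine},
    }


# (model, ZeRO stage, hybrid engine enabled) -> outcome reported in this issue
REPORTED = {
    ("llama2", 3, True): "broken",   # repeated tokens or empty output
    ("llama2", 2, True): "ok",       # Experiment 1
    ("llama2", 3, False): "ok",      # Experiment 2
    ("opt", 3, True): "ok",          # Experiment 3
}


def failing_cases() -> list:
    """Return the combinations that were reported as broken."""
    return [case for case, outcome in REPORTED.items() if outcome == "broken"]


if __name__ == "__main__":
    for model, stage, hybrid in REPORTED:
        print(model, make_ds_config(stage, hybrid), REPORTED[(model, stage, hybrid)])
    print("failing:", failing_cases())
```

Laid out this way, the only failing combination is Llama 2 with ZeRO-3 and the hybrid engine together; changing any one of the three makes the problem go away.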

I'm confused by this behavior and want to know how to fix it. Because of time and GPU memory constraints, I would still like to use ZeRO-3 together with the hybrid engine.

@arashb @ShadenSmith @jeffra @selfReference Could you please take a look at this problem? Thanks!

Hi, @terence1023. I am facing the same issue. Could you please let me know if you have found a solution to this issue? Any guidance or suggestions you could provide would be greatly appreciated.

Thank you for your time and assistance.

Has this been solved?