Issues
[QUESTION] fp32_size assert
#49 opened by liqing9399 - 3
[QUESTION] May I ask what tool was used to plot Figure 6 in the paper? How can I profile bubble time in pipeline parallelism?
#18 opened by starstream - 1
[ENHANCEMENT] Opensourcing of 1F1B-V and the variant of interleaved 1F1B and benchmark
#31 opened by ufotalent - 1
[ENHANCEMENT] Runtime and scheduler refactor so that we support uniform scheduler API
#30 opened by ufotalent - 4
[QUESTION] I used the zero-bubble commit 7ad9c81d for my experiments, and found that the memory usage of this zb-v model exceeds that of the previous zb1 model. What could be the issue? The specific configuration and results are shown in the image.
#36 opened by lbk-sys - 1
[QUESTION] I know that zero-bubble is developed based on Megatron-LM. Could you please let me know which specific commit of Megatron-LM zero-bubble is based on?
#35 opened by lbk-sys - 0
[ENHANCEMENT] Sync code base to newest megatron
#29 opened by ufotalent - 0
[ENHANCEMENT] Adaptive V scheduler
#28 opened by ufotalent - 0
Per stage memory control
#8 opened by mavenlin - 0
[QUESTION] 1F1B is faster than zero-v
#24 opened by kuangdao - 3
interleaved 1F1B seems to work better
#21 opened by zhj96 - 2
More general ZBV scheduling
#9 opened by mavenlin - 4
[QUESTION] Whether to split bw when send_backward_recv_forward is not enabled
#17 opened by AndSonder - 1
Support sequence parallel on main branch
#13 opened by ufotalent - 5
Create a miniversion containing only ZB-H1 and essential changes so other megatron forks can easily integrate
#10 opened by ufotalent - 1
[QUESTION] Post-validation Optimizer
#16 opened by Parallel-Hao - 0