What does "padding-free" mean in the README?
houghtonweihu commented
In the readme, it says:
The OpenChat training system utilizes padding-free training and the Multipack Sampler, achieving a 3~10x speedup compared to the conventional padded training.
What does "padding-free" mean here? Do all sequences in one batch still need
to have the same length? If there is no padding, how are sequences of different lengths batched together?
Thanks!
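To make the question concrete, here is a minimal sketch of one common way "padding-free" batching is done in general. This is an assumption for illustration, not necessarily how OpenChat implements it. Instead of padding every sequence to the longest one, the sequences are concatenated into a single flat token list, and a list of cumulative offsets records where each sequence starts and ends. Variable-length attention kernels can use such offsets to keep attention within each sequence. The names `pad_batch`, `pack_batch`, and `cu_seqlens` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def pad_batch(seqs: List[List[int]], pad_id: int = 0) -> List[List[int]]:
    """Conventional batching: pad every sequence to the longest length."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def pack_batch(seqs: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Padding-free batching (illustrative): concatenate all sequences.

    Returns the flat token list and cumulative offsets, where sequence i
    occupies tokens[cu_seqlens[i]:cu_seqlens[i + 1]].
    """
    tokens: List[int] = []
    cu_seqlens: List[int] = [0]
    for s in seqs:
        tokens.extend(s)
        cu_seqlens.append(len(tokens))
    return tokens, cu_seqlens


# Three sequences of different lengths (token ids are made up).
seqs = [[11, 12, 13], [21], [31, 32, 33, 34, 35]]

padded = pad_batch(seqs)
tokens, cu_seqlens = pack_batch(seqs)

padded_slots = sum(len(row) for row in padded)  # slots computed with padding
packed_slots = len(tokens)                      # slots computed without padding

print("padded slots:", padded_slots)
print("packed slots:", packed_slots)
print("offsets:", cu_seqlens)
```

In this sketch the sequences in a batch do not need the same length; only the total token count matters, so no compute is spent on pad tokens.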