NUS-HPC-AI-Lab/VideoSys

Incorrect sequence parallel for CogVideoX?

monellz opened this issue · 3 comments

CogVideoX concatenates encoder_hidden_states and hidden_states along the sequence dimension in its attention processor. But the current sequence parallel implementation in videosys seems to split only hidden_states along the sequence dimension, and still concatenates the entire encoder_hidden_states with the split hidden_states. The computation semantics of the attention after the all-to-all therefore appear to differ from those before it.

I don't understand why videosys splits only hidden_states along the sequence dimension.
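To make the mismatch concrete, here is a minimal single-process sketch (not VideoSys code; the shapes, sp_size, and the simulated all-to-all gather are all assumptions for illustration) that mimics sp_size=2 ranks. If every rank concatenates the full encoder_hidden_states onto its own hidden_states shard, the sequence the attention effectively sees after the seq-dim all-to-all contains the text tokens sp_size times:

```python
# Toy illustration only -- simulates sp_size ranks in one process; not VideoSys code.
import torch

sp_size = 2
text_len, vision_len, dim = 4, 8, 16

# One row per token; row values encode the token index so duplication is visible.
encoder_hidden_states = torch.arange(0, text_len).float().unsqueeze(1).expand(-1, dim)
hidden_states = torch.arange(100, 100 + vision_len).float().unsqueeze(1).expand(-1, dim)

# Single-GPU reference: text and vision tokens concatenated once.
reference = torch.cat([encoder_hidden_states, hidden_states], dim=0)

# Buggy scheme (as described above): each rank keeps the FULL text tokens
# and only a shard of the vision tokens.
vision_shards = hidden_states.chunk(sp_size, dim=0)
buggy_locals = [torch.cat([encoder_hidden_states, v], dim=0) for v in vision_shards]

# The seq-dim all-to-all effectively gathers the local sequences back into
# one global sequence -- which now contains the text tokens sp_size times.
buggy_global = torch.cat(buggy_locals, dim=0)
assert buggy_global.shape[0] == sp_size * text_len + vision_len  # 16, not 12

# Consistent scheme: shard the text tokens too, so every token appears
# exactly once in the gathered sequence (the gather must also restore the
# original text-then-vision order).
text_shards = encoder_hidden_states.chunk(sp_size, dim=0)
fixed_global = torch.cat([torch.cat(text_shards, dim=0),
                          torch.cat(vision_shards, dim=0)], dim=0)
assert torch.equal(fixed_global, reference)
```

Since softmax attention normalizes over all keys, duplicated text tokens would receive extra attention mass, which would plausibly produce outputs that look natural but differ from the single-GPU result.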

[image: three generated first frames, described below]

The three figures are the first frames generated with 1 GPU (no parallelism), 2 GPUs (cp_size=2), and 4 GPUs (cp_size=2, sp_size=2). While they all look natural, there are notable differences (such as the fallen leaves to the left of the dog) that should not occur.

Yeah, that's a problem. We will fix it soon. Thanks for your feedback!

I've run into a similar problem and don't know how to solve it.

The bug should have been fixed in #218