QwenLM/Qwen-VL

Does the global batch size reported in Table 10 of the paper include the gradient accumulation steps?

YanqiDai opened this issue · 0 comments

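Hello, I would like to ask whether the global batch size reported in Table 10 of the paper "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities" includes the number of gradient accumulation steps. I could not find a more explicit definition in the paper.
That is, is global batch size = micro_batch_size * data_parallel_size * gradient_accumulation_steps, or is global batch size = micro_batch_size * data_parallel_size?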
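
To make the two readings concrete, here is a minimal sketch of how much they can differ. The numbers are hypothetical placeholders chosen only for illustration; they are not taken from the paper or from the Qwen-VL training code, and the helper `global_batch_size` is my own name.

```python
def global_batch_size(micro_batch_size, data_parallel_size, gradient_accumulation_steps=1):
    """Samples consumed per optimizer step under the given settings."""
    return micro_batch_size * data_parallel_size * gradient_accumulation_steps


# Hypothetical values, NOT from the paper.
micro_batch_size = 4
data_parallel_size = 8
gradient_accumulation_steps = 16

# Reading 1: gradient accumulation is included in the reported number.
with_accum = global_batch_size(
    micro_batch_size, data_parallel_size, gradient_accumulation_steps
)

# Reading 2: gradient accumulation is not included.
without_accum = global_batch_size(micro_batch_size, data_parallel_size)

print(with_accum, without_accum)  # 512 32
```

With these placeholder values the two definitions differ by a factor equal to gradient_accumulation_steps, which is why the distinction matters when trying to reproduce the reported settings.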