NVIDIA/Megatron-LM

Loss mask uses torch.float32 instead of bool

Opened this issue · 1 comment

Hi! I was implementing a custom data loader and ran into a deadlock caused by a tensor dtype mismatch. While debugging it, I found that Megatron-LM uses torch.float32 instead of bool for loss masks:

loss_mask = torch.empty((args.micro_batch_size, args.seq_length), dtype=torch.float32, device=torch.cuda.current_device())
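
For anyone hitting the same hang: since the receive buffer above is allocated as torch.float32, the custom loader has to emit the mask with that same dtype on every rank. A minimal sketch of what I ended up doing (build_sample and pad_id are my own illustrative names, not Megatron-LM API):

    import torch

    def build_sample(tokens: torch.Tensor, pad_id: int) -> dict:
        # Boolean condition: only non-pad tokens contribute to the loss.
        keep = tokens != pad_id
        # Cast to float32 so the dtype matches the buffer Megatron-LM pre-allocates.
        loss_mask = keep.to(torch.float32)
        return {"tokens": tokens, "loss_mask": loss_mask}

    sample = build_sample(torch.tensor([5, 9, 2, 0, 0]), pad_id=0)
    print(sample["loss_mask"])        # tensor([1., 1., 1., 0., 0.])
    print(sample["loss_mask"].dtype)  # torch.float32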

This isn't exactly a bug, but is there any reasoning behind it? Using boolean masks seems more logical and would probably reduce the communication load between devices.
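
One possible reason I can imagine (just my guess, not an official answer): the mask is consumed arithmetically when the per-token losses are averaged, so keeping it as float avoids a cast on that path. A sketch of the usual masked-loss pattern in generic PyTorch (not quoted from the repo):

    import torch

    def masked_loss(per_token_loss: torch.Tensor, loss_mask: torch.Tensor) -> torch.Tensor:
        # per_token_loss: [batch, seq] float tensor of token losses.
        # loss_mask:      [batch, seq] float32 tensor, 1.0 where the token counts, 0.0 elsewhere.
        loss_mask = loss_mask.view(-1)
        per_token_loss = per_token_loss.view(-1)
        # Multiplying by the float mask zeroes out ignored positions;
        # a bool mask would need a .float() cast (or indexing) here anyway.
        return torch.sum(per_token_loss * loss_mask) / loss_mask.sum()

    losses = torch.tensor([[2.0, 1.0, 3.0]])
    mask = torch.tensor([[1.0, 1.0, 0.0]])
    print(masked_loss(losses, mask))  # tensor(1.5000)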

Marking as stale. No activity in 60 days.