deepinsight/insightface

mbf + vit transformer, arcface_torch error

eeric opened this issue · 3 comments

eeric commented

RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss.

This may be because the model takes multiple inputs.

Try this:

backbone = torch.nn.parallel.DistributedDataParallel(
    module=backbone, broadcast_buffers=False, device_ids=[local_rank], find_unused_parameters=True)
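
To see which parameters the error is talking about, a single-process check without DDP can help: run one forward and backward pass, then list the parameters that never received a gradient. The following is a minimal sketch, not code from arcface_torch. The toy model `TwoBranch` and the helper `list_params_without_grad` are hypothetical names made up for illustration; the real backbone would take their place.

```python
import torch
import torch.nn as nn


class TwoBranch(nn.Module):
    """Hypothetical toy model: the `unused` layer never feeds the output."""

    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 2)
        self.unused = nn.Linear(4, 2)  # defined, but skipped in forward()

    def forward(self, x):
        return self.used(x)


def list_params_without_grad(model, loss):
    """Backpropagate `loss` and return names of trainable params with no gradient."""
    model.zero_grad(set_to_none=True)  # clear stale grads so None means "untouched"
    loss.backward()
    return [name for name, p in model.named_parameters()
            if p.requires_grad and p.grad is None]


model = TwoBranch()
loss = model(torch.randn(3, 4)).sum()
unused = list_params_without_grad(model, loss)
print(unused)  # ['unused.weight', 'unused.bias']
```

Parameters that show up in this list are the ones DDP waits on during gradient reduction. Either they need to participate in the loss, or `find_unused_parameters=True` has to be set as in the snippet above.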

eeric commented

Thanks, I tried that, but it still fails: the loss becomes NaN within the 1st epoch.

The ViT model has been updated.