mbf + vit transformer, arcface_torch error
eeric opened this issue · 3 comments
eeric commented
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss.
This may be because the model takes multiple inputs.
anxiangsir commented
Try this:
backbone = torch.nn.parallel.DistributedDataParallel(
    module=backbone, broadcast_buffers=False, device_ids=[local_rank], find_unused_parameters=True)
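To see which parameters actually trigger this error, it can help to run one backward pass on a single process (without the DDP wrapper) and list the parameters that received no gradient. The following is only a sketch: ToyBackbone and params_without_grad are hypothetical names for illustration, not part of arcface_torch, and the toy model simply has a branch that forward() never calls.

```python
import torch
from torch import nn


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a backbone with a branch that forward() skips."""

    def __init__(self):
        super().__init__()
        self.used = nn.Linear(8, 4)
        self.unused = nn.Linear(8, 4)  # never called in forward()

    def forward(self, x):
        return self.used(x)


def params_without_grad(model, loss):
    """Run backward once and return names of parameters that got no gradient."""
    model.zero_grad(set_to_none=True)
    loss.backward()
    return [name for name, p in model.named_parameters()
            if p.requires_grad and p.grad is None]


torch.manual_seed(0)
model = ToyBackbone()
loss = model(torch.randn(2, 8)).sum()
unused_names = params_without_grad(model, loss)
print(unused_names)
```

Any name printed here is a parameter that DDP would wait on forever unless find_unused_parameters=True is set, or the parameter is removed or wired into the loss.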
eeric commented
Thanks, but I tried that and it failed: the loss became NaN within the first epoch.
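The thread does not say what caused the NaN loss, so the following does not fix the root cause. It is a minimal sketch of one common way to keep a non-finite loss from corrupting the weights while debugging: skip the optimizer step when the loss is NaN or Inf, and clip gradients otherwise. The helper guarded_step and the value max_norm=5.0 are assumptions made for illustration, not taken from arcface_torch.

```python
import torch
from torch import nn


def guarded_step(model, optimizer, loss, max_norm=5.0):
    """Apply one optimizer step unless the loss is NaN/Inf; return True if stepped."""
    optimizer.zero_grad(set_to_none=True)
    if not torch.isfinite(loss).all():
        return False  # skip the update so NaNs do not reach the weights
    loss.backward()
    # Clip gradients; max_norm=5.0 is an arbitrary illustrative value.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return True


torch.manual_seed(0)
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# A normal batch produces a finite loss, so the step is applied.
good_loss = model(torch.randn(3, 4)).pow(2).mean()
stepped_good = guarded_step(model, optimizer, good_loss)

# A batch full of NaNs produces a NaN loss, so the step is skipped.
bad_loss = model(torch.full((3, 4), float("nan"))).mean()
stepped_bad = guarded_step(model, optimizer, bad_loss)
```

Logging how often the step is skipped shows whether the NaN appears on the very first iterations or only after some training.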
anxiangsir commented
The ViT model has been updated.