fxia22/pointnet.pytorch

Error in feature_transform_regularizer: torch.norm does not work on CUDA

MCLYang opened this issue · 1 comment

pytorch version: 1.4.0
torch.version.cuda: 10.0

Has anyone encountered the same error as me? torch.norm does not work if I run it on CUDA:

import torch
import torch.nn.functional as F

def feature_transform_regularizer(trans):
    d = trans.size()[1]
    batchsize = trans.size()[0]
    I = torch.eye(d)[None, :, :]
    if trans.is_cuda:
        I = I.cuda()
    # Frobenius norm of (trans @ trans^T - I) for each matrix in the batch
    my_norm = torch.norm(torch.bmm(trans, trans.transpose(2, 1)) - I, dim=(1, 2))
    loss = torch.mean(my_norm)
    return loss

# dummy inputs just to build a loss that includes the regularizer
pred = torch.rand(100, 2).cuda(0)
target = torch.randint(0, 2, (100,)).cuda(0)
trans_feat = torch.rand(2, 64, 64).cuda(0)
loss = F.nll_loss(pred, target)
loss += feature_transform_regularizer(trans_feat) * 0.001

It returns an error like this:

----> 7 my_norm = torch.norm(torch.bmm(trans, trans.transpose(2,1)) - I, dim=(1,2))
RuntimeError: Could not run 'aten::conj.out' with arguments from the 'CUDATensorId' backend. 'aten::conj.out' is only available for these backends: [CPUTensorId, VariableTensorId].

However, if I run it on the CPU, it's fine:

pred = torch.rand(100,2).cpu()
target = torch.randint(0,2,(100,)).cpu()
trans_feat = torch.rand(2,64,64).cpu()
loss = F.nll_loss(pred, target)
loss += feature_transform_regularizer(trans_feat) * 0.001

I feel the question is not critical, because computing this loss on the CPU does not slow down training much, but I can't believe torch.norm doesn't work correctly on CUDA. Did I do something wrong? I'd appreciate any help.
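For now, a workaround that keeps everything on the GPU seems to be to compute the Frobenius norm by hand instead of calling torch.norm. This is a sketch of my own, assuming the failure is specific to torch.norm with a tuple dim on this PyTorch version; feature_transform_regularizer_manual is a hypothetical name, not code from the repo:

def feature_transform_regularizer_manual(trans):
    d = trans.size()[1]
    I = torch.eye(d, device=trans.device)[None, :, :]
    diff = torch.bmm(trans, trans.transpose(2, 1)) - I
    # Frobenius norm by hand: sqrt of the sum of squared entries over the
    # matrix dimensions of each batch element; avoids the torch.norm kernel
    my_norm = torch.sqrt((diff ** 2).sum(dim=(1, 2)))
    return my_norm.mean()

On the CPU this should return the same value as the original function (torch.norm defaults to the Frobenius norm over the given dims), so it should be a drop-in replacement if it runs cleanly on your GPU.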

I have the same problem. Have you found any other way to solve it?