Very slow loss.backward() when running PointMLP on custom task
kaimingkuang opened this issue · 1 comments
kaimingkuang commented
Hi,
I am trying to use the PointMLP model from classification_ModelNet40/models/pointmlp.py on my own task (with the default hyperparameter settings). However, loss.backward() is extremely slow: around 9 seconds per backward pass for one batch of 64 point clouds with 1024 points each. When I run your own ModelNet40 experiments with the same configs and the same hardware/software environment, the training speed is normal. Here is my code:
self.model.train()
for i, sample in enumerate(self.dl_train):
    self.optimizer.zero_grad()
    pc = sample["xyz"]
    img_feats = sample["img"]
    pc = pc.cuda()
    img_feats = img_feats.cuda()
    pc_feats = self.model(pc)
    pc2img_loss = self.criterion(pc_feats, img_feats)
    pc2img_loss.backward()
    self.optimizer.step()
The loss function is a simple contrastive loss:
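import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveLoss(nn.Module):
    def forward(self, feat_0, feat_1, labels=None):
        # L2-normalize each feature vector so dot products are cosine similarities
        feat_0 = F.normalize(feat_0, dim=1)
        feat_1 = F.normalize(feat_1, dim=1)
        # Pairwise similarity matrix of shape (batch, batch)
        dot_prods = torch.einsum("mi,ni->mn", feat_0, feat_1)
        # Matching pairs sit on the diagonal; apply softmax in both directions
        loss_0_1 = -F.log_softmax(dot_prods, dim=0).diag().mean()
        loss_1_0 = -F.log_softmax(dot_prods, dim=1).diag().mean()
        loss = 0.5 * (loss_0_1 + loss_1_0)
        return loss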
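To sanity-check what this loss computes on tiny inputs, here is a rough pure-Python sketch of the same arithmetic (normalize, pairwise dot products, log-softmax along both axes, mean of the diagonal). The helper names `l2_normalize`, `log_softmax`, and `contrastive_loss` are made up for illustration and are not part of the PointMLP repository or PyTorch; this is only meant for checking small cases by hand, not for training.

```python
import math

def l2_normalize(v):
    # Mirrors F.normalize: divide by max(norm, eps) with eps = 1e-12
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, 1e-12) for x in v]

def log_softmax(xs):
    # Numerically stable log-softmax over a flat list
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]

def contrastive_loss(feats_0, feats_1):
    # Hypothetical reference version of the loss, for equal-sized batches
    a = [l2_normalize(v) for v in feats_0]
    b = [l2_normalize(v) for v in feats_1]
    n = len(a)
    # sims[m][k] is the cosine similarity between a[m] and b[k]
    sims = [[sum(x * y for x, y in zip(a[m], b[k])) for k in range(n)]
            for m in range(n)]
    # dim=0 in the original: normalize over rows within each column
    loss_0_1 = -sum(
        log_softmax([sims[m][k] for m in range(n)])[k] for k in range(n)
    ) / n
    # dim=1 in the original: normalize over columns within each row
    loss_1_0 = -sum(log_softmax(sims[m])[m] for m in range(n)) / n
    return 0.5 * (loss_0_1 + loss_1_0)

matched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(matched, swapped)
```

Correctly paired features should give a lower loss than swapped ones, and rescaling any input vector should not change the result because of the normalization step.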
Here are my hardware/software configs:
CPU: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
GPU: NVIDIA A100
CUDA: 11.1
PyTorch: 1.8.1
Python: 3.7.16
Can you help me with it? Many thanks.
kaimingkuang commented
Switched to PyTorch 1.12.1 and it runs so much faster...
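For anyone who wants to compare timings between versions, a small helper along these lines might be useful. It is only a sketch: `time_call` and its parameters are hypothetical names, and the `sync` hook is there because CUDA kernels are launched asynchronously, so a wall-clock measurement without synchronization can be misleading.

```python
import time

def time_call(fn, sync=None, warmup=1, repeats=3):
    """Return the mean wall-clock seconds of fn() over `repeats` runs.

    `sync` is an optional zero-argument callable that is invoked before the
    clock starts and before it stops, so queued asynchronous work is counted.
    """
    # Warm-up runs are excluded from the measurement
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / repeats

# Stand-in workload so the sketch runs without a GPU
elapsed = time_call(lambda: sum(i * i for i in range(10000)))
print(f"{elapsed:.6f} s per call")
```

In a setup like the one above, one could presumably pass `torch.cuda.synchronize` as `sync` and a function that runs one forward pass plus `loss.backward()` as `fn`, which would time both passes together.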