lucastabelini/LaneATT

RuntimeError: CUDA error: device-side assert triggered

987410 opened this issue · 1 comment

Hello,
When I add the line traced_cpu = torch.jit.trace(model, images.clone().detach()) to convert the model, I get the error below. This is the code:
model.load_state_dict(self.exp.get_epoch_model(epoch))
model = model.to(self.device)
model.eval()
if on_val:
    dataloader = self.get_val_dataloader()
else:
    dataloader = self.get_test_dataloader()
test_parameters = self.cfg.get_test_parameters()
predictions = []
self.exp.eval_start_callback(self.cfg)
with torch.no_grad():
    for idx, (images, _, _) in enumerate(tqdm(dataloader)):
        images = images.to(self.device)
        import pdb
        pdb.set_trace()
        traced_cpu = torch.jit.trace(model, images.clone().detach())
        torch.jit.save(traced_cpu, "laneATT.pth")
        output = model(images, **test_parameters)

/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [24,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [25,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [26,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [28,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [29,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [30,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [17,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
0%| | 0/1 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/media/disk_8t/tvm/venvs/laneATT/lib/python3.9/site-packages/torch/jit/_trace.py", line 443, in run_mod_and_filter_tensor_outputs
    outs = wrap_retval(mod(*_clone_inputs(inputs)))
RuntimeError: CUDA error: device-side assert triggered

How can I fix this? Thanks.

I have never worked with PyTorch's JIT tracing, so I can't really help you.
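
For anyone who hits the same assert, here is a general debugging sketch rather than a confirmed fix. Two things stand out in the snippet above: the trace call passes only images, while the regular call passes **test_parameters, and the "index out of bounds" assert is raised on the GPU, where the Python traceback does not point at the offending indexing line. Running the same call on CPU usually turns an out-of-range index into an ordinary Python exception with a usable stack trace. The names FixedKwargs, trace_on_cpu, and Toy below are hypothetical helpers, and Toy is only a stand-in for the real model. The sketch also assumes every op in the model has a CPU implementation, which may not hold for models with custom CUDA extensions.

```python
import torch
import torch.nn as nn


class FixedKwargs(nn.Module):
    """Hypothetical wrapper: binds keyword arguments so forward() takes only a tensor."""

    def __init__(self, model, **fixed_kwargs):
        super().__init__()
        self.model = model
        self.fixed_kwargs = fixed_kwargs

    def forward(self, images):
        # Call the wrapped model with the same keyword arguments every time.
        return self.model(images, **self.fixed_kwargs)


def trace_on_cpu(model, example, **fixed_kwargs):
    """Run once on CPU to surface indexing errors, then trace with the same arguments.

    Note: .to("cpu") moves the caller's model in place.
    """
    wrapped = FixedKwargs(model, **fixed_kwargs).to("cpu").eval()
    example = example.detach().to("cpu")
    with torch.no_grad():
        # On CPU an out-of-range index raises a Python exception at the
        # indexing line instead of an asynchronous device-side assert.
        wrapped(example)
        return torch.jit.trace(wrapped, example)


class Toy(nn.Module):
    """Toy stand-in for the real model, which is not reproduced here."""

    def __init__(self):
        super().__init__()
        self.register_buffer("table", torch.arange(10, dtype=torch.float32))

    def forward(self, x, offset=0):
        idx = x.long() + offset
        return self.table[idx]


if __name__ == "__main__":
    traced = trace_on_cpu(Toy(), torch.tensor([1, 2]), offset=3)
    print(traced(torch.tensor([0, 1])))
```

In the snippet from the issue, the equivalent call would be roughly trace_on_cpu(model, images, **test_parameters), assuming the model's forward accepts those keyword arguments. Keep in mind that torch.jit.trace records only the operations executed for the example input, so keyword values are baked into the trace as constants and data-dependent control flow is not captured.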