LiyuanLucasLiu/LM-LSTM-CRF

Assertion `t >= 0 && t < n_classes` failed

murthyrudra opened this issue · 0 comments

Hi, I was trying to run train_wc.py on our own dataset in the CoNLL format specified.

But I encountered this error:

/opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [2,0,0] Assertion t >= 0 && t < n_classes failed.

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generated/../THCReduceAll.cuh line=339 error=59 : device-side assert triggered

Traceback (most recent call last):
  File "train_wc.py", line 204, in <module>
    loss.backward()
  File "/home/rudra/miniconda3/envs/tensorflow/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/rudra/miniconda3/envs/tensorflow/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generated/../THCReduceAll.cuh:339

The ClassNLLCriterion is receiving target values outside the valid label range, i.e. some target `t` does not satisfy `0 <= t < n_classes`. In utils.py, the function `construct_bucket_vb_wc` creates the data label tensor, and I'm not sure what happens inside that function.
Up to that function, the label range looks correct.
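
A minimal sketch of the kind of range check that could help narrow this down, assuming the targets can be flattened into a plain list of integers before they reach the loss. The helper name `find_out_of_range` and the sample values are hypothetical and not part of the repository; with a tensor, a call along the lines of `target.view(-1).tolist()` would presumably produce such a list.

```python
def find_out_of_range(labels, n_classes):
    """Return (position, value) pairs for labels outside [0, n_classes)."""
    return [(i, t) for i, t in enumerate(labels) if not 0 <= t < n_classes]


# Hypothetical flattened targets for one batch (illustrative values only).
labels = [0, 3, 1, 5, -1, 2]
n_classes = 4

bad = find_out_of_range(labels, n_classes)
print(bad)  # [(3, 5), (4, -1)]
```

Running such a check on the labels right after they are built, and again right before the loss is computed, should show at which step the out-of-range values appear. Running the same batch on the CPU, or with the environment variable `CUDA_LAUNCH_BLOCKING=1`, usually gives a traceback that points closer to the failing call than the asynchronous device-side assert does.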