THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-7838/cutorch/lib/THC/generic/THCStorage.c line=32 error=59 : device-side assert triggered in train.lua
suhmily opened this issue · 3 comments
I got the following error when running train.lua:
/tmp/luarocks_cunn-scm-1-9864/cunn/lib/THCUNN/ClassNLLCriterion.cu:25: void cunn_ClassNLLCriterion_updateOutput_kernel1(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int) [with Dtype = float]: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-7838/cutorch/lib/THC/generic/THCStorage.c line=32 error=59 : device-side assert triggered
/data/home/suzhou/torch/install/bin/luajit: cuda runtime error (59) : device-side assert triggered at /tmp/luarocks_cutorch-scm-1-7838/cutorch/lib/THC/generic/THCStorage.c:32
stack traceback:
[C]: at 0x7f3ad501a130
[C]: in function '__index'
...hou/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:52: in function 'updateOutput'
...torch/install/share/lua/5.1/nn/CrossEntropyCriterion.lua:13: in function 'forward'
train.lua:208: in function 'eval_split'
train.lua:334: in main chunk
[C]: in function 'dofile'
...zhou/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
I got the same error. How can I solve this problem?
@superdarkness It is caused by ClassNLLCriterion, which only accepts target labels in the range [1, n_classes]. The preprocessed data selects 1000 answers for classification and marks all other answers as -1, so any -1 target trips the assertion. One way to solve this problem might be: 1. Change -1 to 1001. 2. Set the number of classes to 1001 instead of 1000. The other solution is to drop the questions with answer -1 from the training set.
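The two fixes above can be sketched roughly as follows. This is plain Python over lists, not the actual train.lua code, and the function names (`find_invalid_labels`, `remap_unknown`, `drop_unknown`) are made up for illustration. It assumes labels are 1-based integers and that -1 is the marker for answers outside the selected set, as described above. The same logic would need to be applied to the label tensors before they reach the criterion.

```python
UNKNOWN = -1  # assumed marker for answers outside the selected answer set


def find_invalid_labels(labels, n_classes):
    """Return positions whose label is outside the 1-based range [1, n_classes]."""
    return [i for i, t in enumerate(labels) if not (1 <= t <= n_classes)]


def remap_unknown(labels, n_classes):
    """Option 1: map every UNKNOWN label to one extra class, n_classes + 1.

    Returns the remapped labels and the new class count to configure the model with.
    """
    extra = n_classes + 1
    return [extra if t == UNKNOWN else t for t in labels], extra


def drop_unknown(questions, labels):
    """Option 2: keep only the (question, label) pairs whose label is known."""
    kept = [(q, t) for q, t in zip(questions, labels) if t != UNKNOWN]
    return [q for q, _ in kept], [t for _, t in kept]


if __name__ == "__main__":
    # Hypothetical toy data: five questions, two with out-of-set answers.
    questions = ["q0", "q1", "q2", "q3", "q4"]
    labels = [3, -1, 1000, 1, -1]

    print(find_invalid_labels(labels, 1000))
    print(remap_unknown(labels, 1000))
    print(drop_unknown(questions, labels))
```

Running a check like `find_invalid_labels` on the targets before training is a cheap way to catch this on the CPU, where the failure is easier to read than a device-side assert.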
@suhmily Thank you so much for your advice. The error has been solved successfully!