nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [18,0,0] Assertion t >= 0 && t < n_classes failed.
Alexia1994 opened this issue · 4 comments
ERROR:ignite.engine.engine.Engine:Current run is terminating due to exception: unsupported operand type(s) for /: 'str' and 'int'.
/opt/conda/conda-bld/pytorch_1634272168290/work/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
(the same assertion is repeated for threads [1,0,0] through [18,0,0])
ERROR:ignite.engine.engine.Engine:Engine run is terminating due to exception: unsupported operand type(s) for /: 'str' and 'int'.
Traceback (most recent call last):
  File "train.py", line 237, in <module>
    train()
  File "train.py", line 225, in train
    trainer.run(train_loader, max_epochs=args.n_epochs)
  File "/home/work/zhangao/ZoooSP/python/miniconda3/envs/xxd/lib/python3.7/site-packages/ignite/engine/engine.py", line 850, in run
    return self._internal_run()
  File "/home/work/zhangao/ZoooSP/python/miniconda3/envs/xxd/lib/python3.7/site-packages/ignite/engine/engine.py", line 952, in _internal_run
    self._handle_exception(e)
  File "/home/work/zhangao/ZoooSP/python/miniconda3/envs/xxd/lib/python3.7/site-packages/ignite/engine/engine.py", line 716, in _handle_exception
    raise e
  File "/home/work/zhangao/ZoooSP/python/miniconda3/envs/xxd/lib/python3.7/site-packages/ignite/engine/engine.py", line 937, in _internal_run
    hours, mins, secs = self._run_once_on_dataset()
  File "/home/work/zhangao/ZoooSP/python/miniconda3/envs/xxd/lib/python3.7/site-packages/ignite/engine/engine.py", line 705, in _run_once_on_dataset
    self._handle_exception(e)
  File "/home/work/zhangao/ZoooSP/python/miniconda3/envs/xxd/lib/python3.7/site-packages/ignite/engine/engine.py", line 716, in _handle_exception
    raise e
  File "/home/work/zhangao/ZoooSP/python/miniconda3/envs/xxd/lib/python3.7/site-packages/ignite/engine/engine.py", line 688, in _run_once_on_dataset
    self.state.output = self._process_function(self, self.state.batch)
  File "train.py", line 130, in update
    loss = lm_loss / args.gradient_accumulation_steps
TypeError: unsupported operand type(s) for /: 'str' and 'int'
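A likely cause of this TypeError (an assumption about the installed environment, not confirmed in the thread): newer transformers versions return a dict-like ModelOutput from the model call, and tuple-unpacking a dict yields its keys, so lm_loss ends up being the string "loss" instead of a tensor. A minimal stand-in reproduction:

```python
from collections import OrderedDict

# Stand-in for a transformers ModelOutput, which subclasses OrderedDict.
outputs = OrderedDict(loss=0.5, logits=[0.1, 0.9])

# Tuple-unpacking iterates the dict, and iterating a dict yields its KEYS:
lm_loss, *_ = outputs
print(type(lm_loss), lm_loss)   # <class 'str'> loss
# lm_loss / 4  -> TypeError: unsupported operand type(s) for /: 'str' and 'int'

# Possible fix: index the output explicitly instead of unpacking.
lm_loss = outputs["loss"]
# With a real transformers model, return_dict=False restores tuple output.
```

If this is the cause, either index the output by name or pin a transformers version that returns plain tuples.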
After loading the thu-coai/CDial-GPT_LCCC-large pretrained model, I tried to fine-tune it on toy_train.txt and got the error above. How should I handle this?
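Regarding the CUDA assertion in the log: the device-side assert (t >= 0 && t < n_classes) is hard to read on GPU, but re-running the failing batch on CPU raises a clear Python error that names the bad label. A sketch with made-up shapes (one common cause in LM fine-tuning is a label token id outside the model's vocabulary, e.g. special tokens added to the tokenizer without resizing the model's embeddings):

```python
import torch
import torch.nn.functional as F

# On GPU this label set triggers the unreadable device-side assert;
# on CPU the same call raises a descriptive error instead.
logits = torch.randn(3, 5)        # n_classes = 5
labels = torch.tensor([1, 4, 7])  # 7 >= n_classes -> out of range
try:
    F.cross_entropy(logits, labels)
except (IndexError, RuntimeError) as e:  # message names the bad target id
    print("caught:", e)
```

Running one batch with the model on CPU is usually the fastest way to turn this class of assert into an actionable message.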
Hi,
Have you found a solution yet?
Thanks.
What command did you use to run training?
Did you set the number of classes (num_class) correctly?!
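To verify this before the loss call, the labels can be range-checked on CPU (a sketch; check_targets and the ignore_index default of -100 are assumptions, adjust to your setup):

```python
import torch

def check_targets(targets: torch.Tensor, n_classes: int,
                  ignore_index: int = -100) -> None:
    # Flags any label the nll_loss kernel would assert on
    # (t < 0 or t >= n_classes), skipping the padding value
    # that the loss function is told to ignore.
    mask = targets != ignore_index
    bad = mask & ((targets < 0) | (targets >= n_classes))
    if bad.any():
        raise ValueError(
            f"labels out of range [0, {n_classes}): "
            f"{targets[bad].unique().tolist()}"
        )

check_targets(torch.tensor([0, 3, -100, 5]), n_classes=6)  # passes
```

If this check fires during fine-tuning, the usual culprits are a wrong num_class / vocabulary size or token ids from a tokenizer that no longer matches the model.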