cublas runtime error : library not initialized at /home/username/torch/extra/cutorch/lib/THC/THCGeneral.c:405
flyingintoskyq opened this issue · 0 comments
Hi! When I apply a pre-trained model with this command:
DATA_ROOT=./datasets/ae_photos name=style_cezanne_pretrained model=one_direction_test phase=test loadSize=256 fineSize=256 resize_or_crop="scale_width" th test.lua
I get the following error: cublas runtime error : library not initialized at /home/myuser/torch/extra/cutorch/lib/THC/THCGeneral.c:405
The full output is below:
------------------- Options -------------------
DATA_ROOT: ./datasets/ae_photos
align_data: 0
aspect_ratio: 1
batchSize: 1
cache_dir: ./cache
checkpoints_dir: ./checkpoints
continue_train: 1
cudnn: 1
display: 1
display_id: 200
fineSize: 256
flip: 0
gpu: 1
how_many: all
input_nc: 3
loadSize: 256
model: one_direction_test
nThreads: 1
name: style_cezanne_pretrained
norm: instance
output_nc: 3
phase: test
resize_or_crop: scale_width
results_dir: ./results/
serial_batches: 1
test: 1
which_direction: AtoB
which_epoch: latest
GPU Mode
cudnn : 1
results_dir : "./results/"
resize_or_crop : "scale_width"
name : "style_cezanne_pretrained"
which_direction : "AtoB"
visual_dir : "/home/flyintoskyq/Desktop/CycleGAN-master/checkpoints/style_cezanne_pretrained/visuals"
phase : "test"
batchSize : 1
fineSize : 256
continue_train : 1
nThreads : 1
aspect_ratio : 1
loadSize : 256
gpu : 1
test : 1
DATA_ROOT : "./datasets/ae_photos"
align_data : 0
which_epoch : "latest"
model : "one_direction_test"
cache_dir : "./cache"
norm : "instance"
how_many : "all"
input_nc : 3
display : 1
output_nc : 3
flip : 0
checkpoints_dir : "./checkpoints"
display_id : 200
serial_batches : 1
DataLoader UnalignedDataLoader was created.
Starting donkey with id: 1 seed: 8350
table: 0x401f9d88
table: 0x419bc8a0
running "find" on each class directory, and concatenate all those filenames into a single file containing all image paths for a given class
now combine all the files to a single large file
load the large concatenated list of sample paths to self.imagePath
cmd..wc -L '/tmp/lua_KFQ4nU' |cut -f1 -d' '
205 samples found......................... 0/205 .......................................] ETA: 0ms | Step: 0ms
Updating classList and imageClass appropriately
[======================================== 1/1 ========================================>] Tot: 0ms | Step: 0ms
Cleaning up temporary files
Dataset Size A: 205
Starting donkey with id: 1 seed: 7589
table: 0x41cbb000
table: 0x4143a480
running "find" on each class directory, and concatenate all those filenames into a single file containing all image paths for a given class
now combine all the files to a single large file
load the large concatenated list of sample paths to self.imagePath
cmd..wc -L '/tmp/lua_TAkCVd' |cut -f1 -d' '
205 samples found......................... 0/205 .......................................] ETA: 0ms | Step: 0ms
Updating classList and imageClass appropriately
[======================================== 1/1 ========================================>] Tot: 0ms | Step: 0ms
Cleaning up temporary files
Dataset Size B: 205
use InstanceNormalization
loading previously trained model (/home/flyintoskyq/Desktop/CycleGAN-master/checkpoints/style_cezanne_pretrained/latest_net_G.t7)
use InstanceNormalization
---------- # Learnable Parameters --------------
G_A = 2855811
processing batch 1
pathsA {
1 : "40.jpg"
pathsB nil
/home/flyintoskyq/torch/install/bin/luajit: ...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:67:
In 2 module of nn.Sequential:
/home/flyintoskyq/torch/install/share/lua/5.1/nn/THNN.lua:110: cublas runtime error : library not initialized at /home/flyintoskyq/torch/extra/cutorch/lib/THC/THCGeneral.c:405
stack traceback:
[C]: in function 'v'
/home/flyintoskyq/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
...yq/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...yq/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
[C]: in function 'xpcall'
...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
...lyintoskyq/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
./models/one_direction_test_model.lua:52: in function 'Forward'
test.lua:100: in main chunk
[C]: in function 'dofile'
...skyq/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
...lyintoskyq/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
./models/one_direction_test_model.lua:52: in function 'Forward'
test.lua:100: in main chunk
[C]: in function 'dofile'
...skyq/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
However, if I run in CPU mode instead of GPU mode, it works properly.
So, is my GPU memory insufficient? How can I solve this problem? Could you please give me some advice?
My environment: Ubuntu 16.04, NVIDIA GeForce RTX 2060 (5896 MB GPU memory), CUDA 10.1, cuDNN 7.6.4.
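In case it helps with diagnosis, the GPU memory and CUDA toolkit details can be collected with a short shell sketch like the one below. It assumes nvidia-smi and nvcc are on the PATH; the function names report_gpu and report_cuda are made up for illustration and are not part of CycleGAN or Torch.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of CycleGAN or Torch): print GPU and CUDA details.

# Print GPU name, memory in use, total memory, and driver version as CSV.
report_gpu() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.used,memory.total,driver_version --format=csv \
      || echo "nvidia-smi failed to query the GPU"
  else
    echo "nvidia-smi not found on PATH"
  fi
}

# Print the CUDA toolkit version reported by the compiler driver.
report_cuda() {
  if command -v nvcc >/dev/null 2>&1; then
    nvcc --version || echo "nvcc failed to report its version"
  else
    echo "nvcc not found on PATH"
  fi
}

report_gpu
report_cuda
```

Running this just before the th test.lua command would show how much GPU memory is already in use by other processes at that moment.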
Thanks a lot! :)