JianshuZhang/WAP

Hi Jianshu, thanks for your excellent code. While reproducing your experiments, I found that the newest version of libgpuarray is not compatible with theano 0.10.0. Could you tell me which libgpuarray version you used in your experiments? Thanks a lot!



Thanks for the prompt reply. I tried running on Ubuntu 16.04 with theano 0.10.0beta1 and libgpuarray 0.6.9, but it fails. I tried to install theano 0.10.0dev, but that version seems to be gone. Maybe I should try theano 0.9.0?

OK, I managed to compile the code with theano 0.9.0, pygpu 0.6.9 and cudnn 7.0. But it runs into pygpu.gpuarray.GpuArrayException: b'cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory'. Your batch_size is 16; however, the error remains even when I change batch_size to 2 on a 12GB Tesla K80 GPU, which confuses me.

I believe the batch_size is 8, not 16? I updated the open source code at one point. Besides batch_size, batch_Imagesize and maxImagesize also affect GPU memory use. You can reduce batch_Imagesize from 500000 to 400000.
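To illustrate why these three settings interact, here is a minimal sketch of pixel-budget batching. It is an assumption about how such a data iterator could work, not a copy of the repository's code: the function name make_batches, the (key, height, width) input format, and the max_imagesize default are hypothetical, and only the batch_size and batch_Imagesize values come from this thread. The idea is that every image in a batch is padded to the largest one, so the memory cost of a batch scales with (largest image area) x (number of images).

```python
def make_batches(samples, batch_size=8, batch_imagesize=500000, max_imagesize=500000):
    """Group samples into batches under a padded-pixel budget.

    samples: list of (key, height, width) tuples (hypothetical format).
    max_imagesize default is a placeholder, not a value from the thread.
    Returns a list of batches, each a list of keys.
    """
    # Drop images whose area alone exceeds the per-image limit.
    kept = [(key, h * w) for key, h, w in samples if h * w <= max_imagesize]
    # Sort by area so that similarly sized images share a batch (less padding).
    kept.sort(key=lambda item: item[1])

    batches, current, biggest = [], [], 0
    for key, area in kept:
        new_biggest = max(biggest, area)
        too_many = len(current) + 1 > batch_size
        # Padded cost: every image is padded to the largest one in the batch.
        too_big = new_biggest * (len(current) + 1) > batch_imagesize
        if current and (too_many or too_big):
            batches.append(current)
            current, new_biggest = [], area
        current.append(key)
        biggest = new_biggest
    if current:
        batches.append(current)
    return batches


samples = [("a", 100, 1000), ("b", 100, 2000), ("c", 200, 2000), ("d", 1000, 1000)]
print(make_batches(samples, batch_imagesize=400000))
```

Under this scheme, lowering batch_size alone does not help when a few large images already hit the pixel budget, which would be consistent with the out-of-memory error persisting at batch_size 2.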

Here is the error screenshot. As we can see, training runs for several iterations and then crashes with an out-of-memory error, and GPU memory usage seems to keep growing as training proceeds :(

The 'gpuarray.preallocate=0.95' setting in THEANO_FLAGS is important; make sure you didn't remove it.
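For reference, a minimal sketch of setting this flag from Python rather than from the shell. The helper name set_theano_flags is hypothetical, and the device and floatX values are assumptions for a typical gpuarray setup; only gpuarray.preallocate=0.95 comes from this thread. Theano reads THEANO_FLAGS at import time, so the variable has to be set before the first import of theano.

```python
import os


def set_theano_flags(preallocate=0.95, device="cuda", floatX="float32"):
    """Build a THEANO_FLAGS string and export it to the environment.

    device and floatX are assumed values; adjust them to your setup.
    """
    flags = {
        "device": device,
        "floatX": floatX,
        # Fraction of GPU memory reserved up front by the gpuarray backend.
        "gpuarray.preallocate": preallocate,
    }
    value = ",".join(f"{name}={setting}" for name, setting in flags.items())
    os.environ["THEANO_FLAGS"] = value
    return value


print(set_theano_flags())
# import theano  # must come after the environment variable is set
```

Reserving most of the GPU memory in one block up front should make the allocator less prone to fragmentation than many small allocations during training.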

OK, after resetting the batchsize and maxImagesize, the GPU memory usage seems to be steady. Thanks!