Theano/libgpuarray

Out of memory exception when using allow_gc=False in .theanorc

Closed this issue · 1 comments

My config:
Windows 10
Using cuDNN version 5110 on context None
Mapped name None to device cuda0: GeForce GTX 1080 (0000:01:00.0)
theano: 0.9.0.dev-a5c029dcacb2a763e606a3526b6a4b224728d5e2
numpy: 1.11.3
pygpu: 0.6.4
lasagne: 0.2.dev1

Related part of .theanorc:

optimizer_excluding = low_memory
#allow_gc=True # works ok
allow_gc=False # generates an out of mem exception (see below)

Notes:

  • No error is raised when using allow_gc=True
  • The error occurs when the network finishes its first training epoch and switches to validation within the same epoch (i.e. it switches compiled functions)
  • GPU memory is far from saturated when the error occurs

Error:

File "C:\Users\MachineLearning\AppData\Local\Continuum\Anaconda3\lib\site-packages\theano\compile\function_module.py", line 884, in call
self.fn() if output_subset is None else
File "pygpu\gpuarray.pyx", line 1462, in pygpu.gpuarray.pygpu_concatenate (pygpu/gpuarray.c:19524)
File "pygpu\gpuarray.pyx", line 417, in pygpu.gpuarray.array_concatenate (pygpu/gpuarray.c:7374)
pygpu.gpuarray.GpuArrayException: b'out of memory'

Discussion: could this be related to a possible defect in GPU memory management/synchronization or GPU array creation, as flagged in https://github.com/Theano/libgpuarray/issues/408?

allow_gc=False disables garbage collection and the freeing of intermediate results, so peak memory usage will be higher during execution.
It also keeps the memory for intermediate results after a function has executed; in your case, the training function still has memory reserved when the validation function starts requesting new memory.
It is possible that the raised exception triggers the freeing of some memory before the valid function runs, giving the illusion that there is enough free memory.

One option would be to call train_fn.free() before switching functions, to release the buffers the training function is holding.
Also note that on GPU, allow_gc=False does not really speed things up compared to the default memory allocator.
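To illustrate the suggested workaround, here is a minimal sketch of the epoch loop calling `.free()` on the training function before validation starts. Note that `train_fn` and `val_fn` below are hypothetical stand-in objects (real compiled `theano.function` objects expose `.free()` the same way, but a GPU and the user's network are not reproduced here); only the control flow is the point.

```python
# Hedged sketch: with allow_gc=False, a compiled Theano function keeps its
# intermediate GPU buffers after it runs. Calling fn.free() before switching
# to another compiled function releases those buffers.
# FakeTheanoFunction is a stand-in so the pattern can be shown end to end.

class FakeTheanoFunction:
    """Stand-in for a compiled theano.function with a .free() method."""
    def __init__(self, name):
        self.name = name
        self.freed = False

    def __call__(self, *args):
        # Calling the function (re-)allocates its intermediate buffers.
        self.freed = False
        return 0.0  # dummy loss value

    def free(self):
        # In real Theano, this releases the storage of intermediate results.
        self.freed = True


train_fn = FakeTheanoFunction("train")
val_fn = FakeTheanoFunction("valid")

for epoch in range(2):
    for batch in range(3):
        train_loss = train_fn(batch)
    # With allow_gc=False, train_fn still holds its intermediate GPU
    # buffers here. Free them before the validation function allocates:
    train_fn.free()
    for batch in range(2):
        val_loss = val_fn(batch)
```

The same pattern applies symmetrically: if memory is still tight, `val_fn.free()` can be called at the end of each epoch before training resumes.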