kahst/BirdCLEF2017

Error in training on own data

divsidhu-26 opened this issue · 1 comment

I got the following error when trying to train with my own data. I followed the instructions in README.md.
Can you explain what is happening?

Traceback (most recent call last):
  File "birdCLEF_train.py", line 791, in <module>
    loss = train_net(image_batch, target_batch, lr)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
MemoryError:
Apply node that caused the error: Elemwise{sqr,no_inplace}(Elemwise{Sub}[(0, 0)].0)
Toposort index: 156
Inputs types: [TensorType(float64, 4D)]
Inputs shapes: [(128, 128, 64, 128)]
Inputs strides: [(8388608, 65536, 1024, 8)]
Inputs values: ['not shown']
Outputs clients: [[Sum{axis=[0, 2, 3], acc_dtype=float64}(Elemwise{sqr,no_inplace}.0)]]

kahst commented

This is a memory error, typically raised when the GPU runs out of memory. There are basically two things you can do: reduce the batch size or reduce the net complexity (fewer filters). You can also reduce the image resolution, but that requires more adjustments and might not be as applicable as the other two strategies.
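To get a feel for how much each option might save, here is a rough sketch that computes the size of the single tensor named in the traceback and of a few smaller variants. It assumes the reported shape (128, 128, 64, 128) is laid out as (batch, filters, height, width), which the traceback does not state explicitly. The helper `tensor_bytes` is a hypothetical name used only for illustration and is not part of birdCLEF_train.py. The float32 line assumes the network can run in single precision (for example via Theano's `floatX` setting), which is not confirmed in this thread.

```python
import numpy as np


def tensor_bytes(shape, dtype):
    """Memory needed for one dense tensor of the given shape and dtype."""
    # Hypothetical helper for illustration; not part of the training script.
    n_elements = int(np.prod(shape, dtype=np.int64))
    return n_elements * np.dtype(dtype).itemsize


# Shape and dtype taken from the "Inputs shapes" / "Inputs types" lines above.
# Assumed axis order: (batch, filters, height, width).
variants = [
    ("as reported (float64)", (128, 128, 64, 128), "float64"),
    ("half the batch size", (64, 128, 64, 128), "float64"),
    ("half the filters", (128, 64, 64, 128), "float64"),
    ("same shape in float32", (128, 128, 64, 128), "float32"),
]

sizes = {label: tensor_bytes(shape, dtype) for label, shape, dtype in variants}

for label, _, _ in variants:
    print("%-24s %6.0f MiB" % (label, sizes[label] / 2.0 ** 20))
```

This covers only one intermediate tensor. A training step holds many such tensors at once, so the total requirement is considerably higher, but each of them scales the same way with batch size and filter count.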