magnumye/disp_est_net

inference: memory runtime error

Closed this issue · 4 comments

Hi @magnumye!
I'm using a GTX 1080 GPU to reproduce the results with run_inference.lua.
Everything works great until the num_test param exceeds 2000.
I've noticed there is not a single call to the garbage collector, and the whole image data list is loaded onto the GPU with local inputs_test = ldata:cuda(). If the available GPU memory isn't enough, the code throws a memory runtime error.

Do you have any handy workarounds? Otherwise, could you share the disparity images of the training set (since I'm trying to build on that)? Thanks!

Hi, your GPU will not have enough memory if num_test is too big. Unfortunately, I did not save the disparity maps of the training data back then.
A workaround would be to run inference multiple times by splitting the dataset into small sets and loading them onto the GPU one by one for inference.
You can try modifying the function evaluate_disp(): use opt.batch_size_test when loading each small set onto the GPU for inference, and run the for-loop outside of it.

A workaround would be to run inference multiple times by splitting the dataset into small sets and loading them onto the GPU one by one for inference.

Yes, I've already tried that in increments of 2000, yet the disparities do not seem to work (I must be doing something wrong).

You can try modifying the function evaluate_disp(): use opt.batch_size_test when loading each small set onto the GPU for inference, and run the for-loop outside of it.

What do you mean by "use opt.batch_size_test"? Do you mean a smaller batch size?

Might be important: I'm using Model_no4 (Siamese multi).

The Model_no version should not matter much in this case. I meant creating a for-loop that calls evaluate_disp() and uses opt.batch_size_test to step through the dataset you want to run inference on, so that each call to evaluate_disp() only copies opt.batch_size_test images to the GPU.
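A minimal sketch of that outer loop in Lua/Torch, assuming ldata holds the test images on the CPU and that evaluate_disp() can be adapted to take a batch and return its disparities; the actual names and signatures in run_inference.lua may differ:

```lua
-- Hypothetical chunked-inference loop; adapt the names to the real run_inference.lua.
local n = ldata:size(1)
local all_disp = {}

for first = 1, n, opt.batch_size_test do
  local size = math.min(opt.batch_size_test, n - first + 1)

  -- Copy only this small chunk to the GPU instead of the whole test set.
  local inputs_test = ldata:narrow(1, first, size):cuda()

  -- Run inference on the chunk and keep the result on the CPU.
  local disp = evaluate_disp(inputs_test)
  table.insert(all_disp, disp:float())

  -- Release the chunk and let the garbage collector reclaim GPU memory before the next pass.
  inputs_test = nil
  collectgarbage()
end

local disparities = torch.cat(all_disp, 1)  -- reassemble the full set of disparity maps
```

The point is that the GPU only ever holds opt.batch_size_test images plus the network's activations at a time, so num_test is limited only by CPU-side memory.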