juglab/EmbedSeg

How can I reduce memory for inference

r-matsuzaka opened this issue · 9 comments

Hi.

I tried to run the inference notebook [bbbc010-2012] provided by this repo separately, but I ran into a memory allocation issue.
I used a batch size of 1.

Are there any other parameters to reduce the memory requirement?

I also set `normalization_factor = 32767 if data_type=='8-bit' else 255`
instead of `normalization_factor = 65535 if data_type=='16-bit' else 255`,
but nothing changed.
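For reference, the normalization I mean is roughly the following (a simplified sketch, not the exact notebook code; the function and variable names here are illustrative):

```python
import numpy as np

# Sketch: scale raw pixel intensities into [0, 1] before feeding the network.
# `normalization_factor` must be the same value during training and inference
# (by default 65535 for 16-bit data, 255 for 8-bit data).
def normalize(img: np.ndarray, normalization_factor: float) -> np.ndarray:
    return img.astype(np.float32) / normalization_factor
```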

Hello @r-matsuzaka

Did you run out of memory while running the train notebook? Typically the bbbc010-2012 train notebook consumes around 1 GiB of GPU memory. Can you check your current memory consumption by running `nvidia-smi`?

The `normalization_factor` was not intended to be a free user parameter as such. But changing 65535 to another large value is also okay, as long as the same value is used during the 03-predict notebook.

The memory consumption can probably be reduced further by additionally reducing the `crop_size` in the 01-data notebook. But before doing that, I would check whether the GPU memory is attached to some other process.
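Besides `nvidia-smi`, you can also check GPU memory from inside the notebook with standard PyTorch calls (a quick sketch, not part of the EmbedSeg notebooks):

```python
import torch

# Sketch: report how much GPU memory the current PyTorch process is using.
if torch.cuda.is_available():
    allocated = torch.cuda.memory_allocated() / 1024**2  # tensors currently allocated
    reserved = torch.cuda.memory_reserved() / 1024**2    # memory held by the caching allocator
    print(f"allocated: {allocated:.1f} MiB, reserved: {reserved:.1f} MiB")
```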

Did you run out of memory while running the train notebook?

No. I run out of memory only during the inference notebook.

Typically the bbbc010-2012 train notebook consumes around 1 GiB of GPU memory. Can you check your current memory consumption by running `nvidia-smi`?

I used 1249 MiB out of 16280 MiB after finishing training in the train notebook.
Is that high?

The `normalization_factor` was not intended to be a free user parameter as such. But changing 65535 to another large value is also okay, as long as the same value is used during the 03-predict notebook.

I understand. I have used the same `normalization_factor` for training and inference.

The memory consumption can probably be reduced further by additionally reducing the `crop_size` in the 01-data notebook. But before doing that, I would check whether the GPU memory is attached to some other process.

Thank you very much. I am considering a smaller crop size, or I will switch to my custom images.

The code stopped here.
Also, is it okay that the number of center images is 0?

2-D `test` dataloader created! Accessing data from ../input/embedsegdatav2/bbbc010-2012/test/
Number of images in `test` directory is 50
Number of instances in `test` directory is 50
Number of center images in `test` directory is 0
*************************
Creating branched erfnet with [4, 1] classes
  0%|          | 0/50 [00:00<?, ?it/s]

Hello.

"Also, Number of center images is 0 is okay?"

That is okay for the test images.

"The code stopped here."

Usually it takes a few seconds before you see a progress bar.
Was there an error message suggesting that this is linked to memory allocation? (Just wondering why you think this is an out-of-memory issue.)

Usually it takes a few seconds before you see a progress bar.
Was there an error message suggesting that this is linked to memory allocation? (Just wondering why you think this is an out-of-memory issue.)

Yes, the Kaggle kernel says so.

"Yes, kaggle kernel says so."

I see. Interesting. I haven't tried the Kaggle kernel myself.
One guess is that it could be a CPU memory issue, since you don't seem to have maxed out your allocated GPU memory. Not sure!
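If you want to rule that out, here is a quick sketch for checking CPU RAM from the notebook (this uses `psutil`, which is my suggestion and not something the EmbedSeg notebooks require):

```python
import psutil

# Sketch: print current process and system RAM usage to spot a CPU memory issue.
process = psutil.Process()
print(f"process RSS: {process.memory_info().rss / 1024**2:.1f} MiB")
print(f"system RAM used: {psutil.virtual_memory().percent:.1f}%")
```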

Could you try commenting out lines 295 to 302 here, just to see if that is a quick fix?

"I am considering smaller crop size."

Also, just for completeness: since it is an inference problem, a smaller crop size will not make a difference. (A smaller crop size would have helped if training were the bottleneck.)

One guess is that it could be a CPU memory issue, since you don't seem to have maxed out your allocated GPU memory

I could run the training notebook but could not run the inference notebook.
I could not check the GPU consumption due to the memory allocation error.
Do you mean the memory consumption for inference is the same as for training?

Could you try commenting out lines 295 to 302 here, just to see if that is a quick fix?

Thank you. I will try it.

Also, just for completeness: since it is an inference problem, a smaller crop size will not make a difference. (A smaller crop size would have helped if training were the bottleneck.)

Before switching to my custom dataset, I am now training on dsb-2018 instead of bbbc010-2012 because the crop size is smaller.
But training on dsb-2018 takes more time than bbbc010-2012; one epoch takes more than one hour...
Is that expected?

For dsb-2018, I had the same situation as with bbbc010-2012...

I am closing this issue for now because the cause could come from my environment.