OOM occurs when adding too many box prompts.
MrIsland opened this issue · 4 comments
Dear IDEA-Research Team,
I am working on continuity detection in indoor scenes and have been using the SAM 2 video predictor for this task. Specifically, I first tag each frame with RAM and select the appropriate bounding boxes with GroundingDINO. These boxes are then used as prompts for SAM 2.
However, I run into an out-of-memory (OOM) error when too many box prompts are added. Is there something wrong with the way I am using the predictor, or is there a better way to accomplish this continuity detection?
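For reference, here is a simplified sketch of the calls I am making (the model config, checkpoint path, frame directory, and box coordinates below are placeholders, and I am assuming the add_new_points_or_box prompt API from the version I have installed):

```python
import torch
from sam2.build_sam import build_sam2_video_predictor

# Placeholders: in the real pipeline the boxes come from RAM + GroundingDINO
model_cfg = "sam2_hiera_l.yaml"
checkpoint = "checkpoints/sam2_hiera_large.pt"
video_dir = "frames/"  # directory of JPEG frames
boxes = [[10, 20, 200, 300], [50, 60, 180, 240]]  # xyxy boxes, often many per scene

predictor = build_sam2_video_predictor(model_cfg, checkpoint)

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(video_path=video_dir)

    # One tracked object per GroundingDINO box, all prompted on the first frame
    for obj_id, box in enumerate(boxes):
        predictor.add_new_points_or_box(
            inference_state=state,
            frame_idx=0,
            obj_id=obj_id,
            box=box,
        )

    # Propagate the masks through the rest of the video
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        pass  # per-frame masks are used for the continuity checks
```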
Looking forward to your reply!
Best regards,
Island
You can try a smaller SAM 2 model such as sam2_hiera_tiny, add box prompts on only a few frames to reduce the number of boxes, or offload the video to the CPU.
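For example, switching to the tiny model is just a different config and checkpoint when building the predictor (assuming the default file names shipped with the repo):

```python
from sam2.build_sam import build_sam2_video_predictor

# Tiny variant instead of the large one
predictor = build_sam2_video_predictor(
    "sam2_hiera_t.yaml",
    "checkpoints/sam2_hiera_tiny.pt",
)
```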
Thank you for your reply!
The OOM error occurs in the function self._consolidate_temp_output_across_obj, around line 600 in sam2/sam2_video_predictor.py.
I was wondering: during video inference, is it necessary to store all the frames on the GPU simultaneously, or is it possible to process each frame individually without breaking the continuity between frames, given that each frame has some bounding boxes? Is that what you meant by 'you can offload the video to cpu'?
I think SAM 2 supports offloading the video or the inference state to the CPU to reduce memory cost: https://github.com/facebookresearch/segment-anything-2/blob/7e1596c0b6462eb1d1ba7e1492430fed95023598/sam2/sam2_video_predictor.py#L43. You can refer to these args for more details.
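Roughly like this, reusing the placeholders from your sketch above (both flags default to False):

```python
import torch
from sam2.build_sam import build_sam2_video_predictor

predictor = build_sam2_video_predictor("sam2_hiera_t.yaml", "checkpoints/sam2_hiera_tiny.pt")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(
        video_path="frames/",        # placeholder frame directory
        offload_video_to_cpu=True,   # keep the loaded frames in CPU memory
        offload_state_to_cpu=True,   # keep per-object inference state on CPU (slower, but lighter on GPU)
    )
```

If I remember the docstring correctly, offloading the video mainly helps with long videos, while offloading the state helps when there are many objects, at a small speed cost.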
Thanks again for your patience and answers. I have no more questions.