hkchengrex/Tracking-Anything-with-DEVA

Error in the middle of a video.

Closed this issue · 2 comments

Hi @hkchengrex
Thanks for the great work. I am trying to track densely packed objects, e.g. more than 100 objects per image. I can successfully run the code for 50 frames. However, at the 51st frame, an error occurs:

Traceback (most recent call last):
  File "Tracking-Anything-with-DEVA/demo/demo_with_trex2.py", line 116, in <module>
    process_frame(deva,
  File "/home/jiangqing/miniconda3/envs/deva/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "Tracking-Anything-with-DEVA/deva/ext/with_text_processor.py", line 91, in process_frame_with_text
    prob = deva.incorporate_detection(this_image, mask,
  File "Tracking-Anything-with-DEVA/deva/inference/inference_core.py", line 193, in incorporate_detection
    self._add_memory(image, ms_features, self.last_mask, key, shrinkage, selection)
  File "Tracking-Anything-with-DEVA/deva/inference/inference_core.py", line 73, in _add_memory
    value, sensory = self.network.encode_mask(image,
  File "Tracking-Anything-with-DEVA/deva/model/network.py", line 54, in encode_mask
    g16, h16 = self.mask_encoder(image,
  File "/home/jiangqing/miniconda3/envs/deva/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "Tracking-Anything-with-DEVA/deva/model/big_modules.py", line 108, in forward
    g_chunk = self.bn1(g_chunk)  # 1/2, 64
  File "/home/jiangqing/miniconda3/envs/deva/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jiangqing/miniconda3/envs/deva/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
    return F.batch_norm(
  File "/home/jiangqing/miniconda3/envs/deva/lib/python3.10/site-packages/torch/nn/functional.py", line 2438, in batch_norm
    return torch.batch_norm(
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
Tracking-Anything-with-DEVA/deva/inference/image_feature_store.py:48: UserWarning: Leaking dict_keys([51, 50, 52]) in the image feature store

I am not familiar with image_feature_store.py. Can you share some insight into this bug? BTW, here are some visualization results from the model's out/. It only worked for the first 50 frames😂
[Images: visualization results for frames 00000019 and 00000030]

This error is not related to image_feature_store; rather, some of the inputs are non-contiguous. I tested the automatic SAM demo on this SORA video and it ran fine for 100+ frames (SAM failed to detect a lot of the objects, but no errors were thrown). There might be a problem with your demo_with_trex2 implementation. Notably, after 50 frames, we start to "remove" objects from the scene, which might lead to non-contiguous output on your end if you are referring to the object indices in any way. The easiest fix might be to just slap .contiguous() on suspicious tensors.
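For reference, here is a minimal sketch of how a tensor can end up non-contiguous and how .contiguous() fixes it. It is not taken from DEVA or from demo_with_trex2.py. The names `image`, `masks`, `kept`, and `report_non_contiguous` are hypothetical stand-ins, and the step slice is just one way to produce a strided view. Which tensor is actually non-contiguous in your script is something you would have to check yourself.

```python
import torch


def report_non_contiguous(**tensors: torch.Tensor) -> list[str]:
    """Return the names of the tensors whose memory layout is not contiguous."""
    return [name for name, t in tensors.items() if not t.is_contiguous()]


# Hypothetical stand-ins for a frame and a stack of per-object masks.
image = torch.rand(3, 8, 8)   # (C, H, W)
masks = torch.rand(6, 8, 8)   # (num_objects, H, W)

# A step slice over the object dimension returns a view with gaps between
# the kept objects, so the result is non-contiguous (no data is copied).
kept = masks[::2]

print(report_non_contiguous(image=image, mask=kept))  # prints ['mask']

# .contiguous() copies the data into a dense layout; the values are unchanged.
kept_fixed = kept.contiguous()
```

Running a check like this on the image and mask right before they are passed to the tracker should narrow down which input needs .contiguous().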

Feel free to re-open if there are follow-up questions.