hkchengrex/Tracking-Anything-with-DEVA

mask problem

Opened this issue · 9 comments

I'm using the evaluation method that involves masking images and using a .json file to track objects. However, I'm facing an issue where the masks I initially apply are not the same as those returned afterward. There's a significant decline in quality, and sometimes the masks don't appear at all. Do you have any suggestions?

python evaluation/eval_with_detections.py --mask_path C:/Workflow_hiba/3_Tracking/source --img_path C:/Workflow_hiba/3_Tracking/images --dataset demo --temporal_setting semionline --output C:/Workflow_hiba/3_Tracking/output222 --chunk_size 1

Can you be more specific? If the input masks are inconsistent, the voting algorithm in the semi-online setting might rule them out.
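One quick thing to check is whether the object IDs in your input masks stay consistent from frame to frame. A minimal sketch of such a check, assuming the masks under your source folder are single-channel or palette PNGs in which each object keeps its own integer ID (adjust the path to your layout):

import os

import numpy as np
from PIL import Image

mask_dir = 'C:/Workflow_hiba/3_Tracking/source/100'  # adjust to your layout

for name in sorted(os.listdir(mask_dir)):
    if not name.endswith('.png'):
        continue
    mask = np.array(Image.open(os.path.join(mask_dir, name)))
    # 0 is treated as background; everything else is an object ID
    ids = sorted(int(v) for v in np.unique(mask) if v != 0)
    print(name, ids)

If the same lamppost shows up under different IDs in different frames, the voting step will see those as conflicting detections and may rule them out.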

  1. Even when I provide a precise mask for the object I want to track in successive frames, the segmentation produced by DEVA is of poor quality.
  2. Sometimes two masks are visible in the initial image, but only one mask is present in the output.
  3. In some cases, although the input shows two masks, the output contains a single object that appears split into two parts, completely ignoring the second object.
  4. Occasionally, objects are tracked incorrectly even though they are clearly present in the input masks, or the tracker identifies them as new objects.

I believe the challenges stem from the nature of my data. I'm attempting to track lampposts while the camera is in motion, and the size of the lampposts varies from frame to frame.

Do you have any suggestions for improving the tracking, for example training the model on stationary objects filmed with a moving camera, rather than the conventional case of tracking moving objects?

comparaison.pdf
Here is a file that contains the input masks and the output masks of the tracking.

Thank you for the update. Is there only one frame as input per video?

Thank you for responding. No, there are more than 100 frames.

I mean annotated frames with masks.

I created a directory containing:
images => 100 => [100 frames]
source => 100 => [100 masks + 100 .json files]
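So per frame the pairing looks like this (frame names here are just an example, not my actual filenames):

images/100/frame_0001.jpg ... frame_0100.jpg
source/100/frame_0001.png + frame_0001.json ... frame_0100.png + frame_0100.json

i.e. each mask has a matching .json for the same frame.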

For debugging, can you try tracking with just one mask (i.e., the first frame)? This isolates the tracking from the detections.
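For example, something along these lines: copy only the first frame's mask and .json into a separate folder and point --mask_path at it (the paths and the assumption that each .json shares its mask's base name are taken from your setup above; adjust as needed):

import shutil
from pathlib import Path

src = Path('C:/Workflow_hiba/3_Tracking/source/100')
dst = Path('C:/Workflow_hiba/3_Tracking/source_first_only/100')
dst.mkdir(parents=True, exist_ok=True)

# keep only the first annotated frame; the tracker then has to
# propagate that single mask through the rest of the video
first_mask = sorted(src.glob('*.png'))[0]
first_json = first_mask.with_suffix('.json')  # assumes matching base names
shutil.copy(first_mask, dst / first_mask.name)
shutil.copy(first_json, dst / first_json.name)

Then rerun eval_with_detections.py with --mask_path pointing at the new folder. If propagation from that single mask already looks good, the problem is more likely with the per-frame detections than with the tracker itself.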

Can you elaborate, please? Because the first frame can appear good.