DeAOT training & inference.

Since you uploaded the code of XMem, could you please also provide the train_datasets.py and eval_datasets.py of aot-benchmark for MOSE?
Also, did you change the training config, or is it the same as for YTB, e.g.
self.DATA_MOSE_REPEAT = 1, self.DATA_RANDOM_GAP_MOSE = 3?

Thanks a lot!

Hi Jiaming,

Thank you for your valuable suggestions and attention to MOSE! Done as suggested.

We did not change the training config; it is the same as the default YTB setting.

Thanks for your reply!
But there seems to be a problem with the two DeAOT files.

Does MOSE not have a meta.json like YouTubeVOS?

MOSE-api/DeAOT_train_datasets.py

Lines 601 to 602 in bd13baf

self.seq_list_file = os.path.join(root, 'meta_train.json')

self._check_preprocess()

In MOSE's file structure, the split directory sits above Annotations and JPEGImages, not below them.

MOSE-api/DeAOT_eval_datasets.py

Lines 292 to 293 in bd13baf

self.image_root = os.path.join(root, 'JPEGImages', split)

self.label_root = os.path.join(root, 'Annotations', split)

It should instead be:
self.image_root = os.path.join(root, split, 'JPEGImages')
self.label_root = os.path.join(root, split, 'Annotations')
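Since the right path depends on how the files are arranged locally, one option is to make the layout a switch. This is only a minimal sketch: mose_roots and its split_first flag are hypothetical names for illustration, not part of MOSE-api or aot-benchmark.

```python
import os


def mose_roots(root, split, split_first=True):
    """Return (image_root, label_root) for a MOSE-style dataset.

    split_first=True  -> <root>/<split>/JPEGImages   (layout of the MOSE download)
    split_first=False -> <root>/JPEGImages/<split>   (layout the eval file assumed)
    """
    if split_first:
        return (os.path.join(root, split, 'JPEGImages'),
                os.path.join(root, split, 'Annotations'))
    return (os.path.join(root, 'JPEGImages', split),
            os.path.join(root, 'Annotations', split))
```

Inside the dataset class this could be used as self.image_root, self.label_root = mose_roots(root, split).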

Thank you for your careful check👍

Currently no, but we plan to provide a meta.json soon, which will include more information. You could ignore Line 602 or generate a meta.json yourself; neither affects model training.
Yes, you're right; updated accordingly. It depends on how you arrange the files. On my local machine it is JPEGImages/train.
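For anyone generating the meta file themselves, here is a rough sketch. It assumes the training code expects a YouTube-VOS-style layout (videos -> objects -> frames) and that the annotation masks are indexed PNGs whose pixel value is the object id, with 0 as background; both are assumptions to verify against your copy of the data. build_meta and read_ids are hypothetical names.

```python
import json
import os


def _ids_from_png(path):
    # Assumption: indexed PNG masks, pixel value == object id, 0 == background.
    import numpy as np
    from PIL import Image
    return [int(v) for v in np.unique(np.array(Image.open(path))) if v != 0]


def build_meta(label_root, read_ids=_ids_from_png):
    """Scan <label_root>/<sequence>/<frame>.png and list the frames of each object."""
    videos = {}
    for seq in sorted(os.listdir(label_root)):
        seq_dir = os.path.join(label_root, seq)
        if not os.path.isdir(seq_dir):
            continue
        objects = {}
        for fname in sorted(os.listdir(seq_dir)):
            if not fname.endswith('.png'):
                continue
            frame = os.path.splitext(fname)[0]
            for obj_id in read_ids(os.path.join(seq_dir, fname)):
                objects.setdefault(str(obj_id), {'frames': []})['frames'].append(frame)
        videos[seq] = {'objects': objects}
    return {'videos': videos}


def write_meta(label_root, out_path):
    # Write the result as JSON, e.g. to <root>/meta_train.json.
    with open(out_path, 'w') as f:
        json.dump(build_meta(label_root), f)
```

The default reader needs numpy and Pillow; pass a different read_ids if your masks are stored differently.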

Thank you again for your suggestions and all the best to your work!

If I just ignore Line 602, how should I filter out short objects like L612-L619?

In addition, should I filter out short videos?

Thanks again!

Hi Jiaming,

We are glad to announce that we have just updated the meta files for the dataset. Please check them out! The current file structure is organized for easier packaging and uploading. Feel free to reorganize the folder structure as you like after downloading.

As for filtering out short videos, there are no strict requirements, so you can decide whether to filter them based on the method you are using.
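If you do want to filter, a sketch along these lines may help. It assumes a YouTube-VOS-style meta dict (videos -> objects -> frames); filter_short and both thresholds are hypothetical and are not taken from the aot-benchmark code.

```python
def filter_short(meta, min_obj_frames=2, min_video_frames=0):
    """Drop objects annotated in too few frames, then drop videos left too short.

    Video length is counted as the number of distinct frames in which at
    least one kept object is annotated.
    """
    kept = {}
    for seq, info in meta['videos'].items():
        objects = {obj_id: obj for obj_id, obj in info['objects'].items()
                   if len(obj['frames']) >= min_obj_frames}
        n_frames = len({f for obj in objects.values() for f in obj['frames']})
        if objects and n_frames >= min_video_frames:
            kept[seq] = {'objects': objects}
    return {'videos': kept}
```

Leaving min_video_frames at 0 only removes short objects (and videos with no objects left), so short videos are kept unless you raise it.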

Enjoy!

Thank you for your active response!
