hustvl/CrossVIS

How to train CrossVIS on the YouTube-VIS 2021 dataset?

HarryHsing opened this issue · 18 comments

Hi, @HarryHsing! Thanks for your interest in our work.
To train CrossVIS on YouTube-VIS 2021, you can first register the YouTube-VIS 2021 dataset like this and simply modify the config file. Don't forget to change self.nID here to the number of identities in the YouTube-VIS 2021 dataset.
Hope this answer is helpful to you~
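In YouTube-VIS-style JSON, each entry in `annotations` tracks one object instance across a whole video, so the identity count for `self.nID` can be read directly off the annotation file. A minimal sketch (the key names follow the public YouTube-VIS schema; the file path is yours to fill in):

```python
import json

def count_identities(ann_file):
    """Count instance identities in a YouTube-VIS-style annotation file.

    Each element of `annotations` carries a unique instance `id`,
    so the number of distinct ids is the value to use for self.nID.
    """
    with open(ann_file) as f:
        data = json.load(f)
    return len({ann["id"] for ann in data["annotations"]})
```

For reference, the user below reports `self.nID = 6283` for the YouTube-VIS 2021 train split.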

Thank you very much for your support! I can now train on YouTube-VIS 2021 with self.nID = 6283.

Hello, I got an error during validation after training. Have you encountered it?
My training command is python tools/train_net.py --config configs/CrossVIS/R_50_1x.yaml MODEL.WEIGHTS CondInst_MS_R_50_1x.pth

Error log:
[06/01 19:51:24 fvcore.common.checkpoint]: Saving checkpoint to output/CrossVIS_R_50_1x/model_final.pth
[06/01 19:51:24 d2.utils.events]: eta: 0:00:00 iter: 22999 total_loss: 1.401 loss_fcos_cls: 0.109 loss_fcos_loc: 0.103 loss_fcos_ctr: 0.6069 loss_mask: 0.0754 loss_cross_over: 0.08766 loss_reid: 0.4268 time: 0.6739 data_time: 0.0499 lr: 5e-05 max_mem: 7474M
[06/01 19:51:24 d2.engine.hooks]: Overall training speed: 22997 iterations in 4:18:19 (0.6740 s / it)
[06/01 19:51:24 d2.engine.hooks]: Total training time: 4:19:44 (0:01:25 on hooks)
[06/01 19:51:25 adet.data.datasets.youtubevis]: Loaded 13195 images in YOUTUBEVIS format from /media/lin/file/VIS/datasets/youtube-vis2021/valid/instances.json
[06/01 19:51:25 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(360, 360), max_size=640, sample_style='choice')]
[06/01 19:51:26 d2.data.common]: Serializing 3 elements to byte tensors and concatenating them all ...
[06/01 19:51:26 d2.data.common]: Serialized dataset takes 1.78 MiB
WARNING [06/01 19:51:26 d2.engine.defaults]: No evaluator found. Use DefaultTrainer.test(evaluators=), or implement its build_evaluator method.

The val split of the YouTube-VIS dataset does not provide annotations for evaluation. So if your training has already finished and the checkpoints can be found, you can simply ignore this warning and follow the instructions in the README to get the predictions.

OK, thank you. How do I evaluate on a VIS dataset I made myself?

You can first follow the README to get the prediction results (in .json format), then use youtubevos-cocoapi to evaluate.

BTW, I notice you ran CrossVIS on YouTube-VIS 2021 with the default 23000 iterations. 23000 is set for the 2019 version (~61000 (imgs) / 32 (batch size) * 12 (epochs) ≈ 23000). YouTube-VIS 2021 contains more images than the 2019 version, so the total iterations and the learning-rate decay steps should be tuned accordingly.
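The arithmetic above can be wrapped in a tiny helper to re-derive the schedule for any split. The decay-step fractions below follow the common Detectron2 convention (decay at roughly 2/3 and 8/9 of training); plug in the actual image count you read from your own 2021 annotation file:

```python
def schedule_for(num_images, batch_size=32, epochs=12):
    """Total iterations for a fixed-epoch schedule, plus learning-rate
    decay steps at ~2/3 and ~8/9 of training (Detectron2-style)."""
    total = num_images * epochs // batch_size
    steps = (total * 2 // 3, total * 8 // 9)
    return total, steps
```

For example, `schedule_for(61000)` gives 22875 iterations, i.e. the ~23000 used for YouTube-VIS 2019.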

Hi, Vealocia. May I know your learning rate and iteration settings for YouTube-VIS 2021, for reference?

The learning rate is the same as for YouTube-VIS 2019.

Well received, thanks!

Hello, how do I evaluate an unofficial dataset using youtubevos-cocoapi? Do you have readily available code, or can you show me how to do it?

Hi, @xulinxulin!
You can convert your custom dataset's annotations into YouTube-VIS's format.
Here are some example codes for evaluating COCO AP. YouTube-VIS AP can be evaluated in the same way.
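To sketch what "YouTube-VIS's format" means in practice: annotations are per-video tracks whose per-frame fields are lists of length `num_frames`, with `None` at frames where the object is absent. The converter below is hypothetical (your custom record layout will differ, and a real file also needs `videos`, `segmentations`, image sizes, etc.), but it shows the grouping step:

```python
from collections import defaultdict

def to_ytvis(frames_per_video, records, categories):
    """Convert flat per-frame records into YouTube-VIS-style annotations.

    records: dicts with keys video_id, frame_idx, instance_id,
             category_id, bbox ([x, y, w, h]).
    Output per-frame fields are lists of length num_frames,
    with None where the instance does not appear.
    """
    tracks = defaultdict(dict)   # (video_id, instance_id) -> {frame: bbox}
    cat_of = {}
    for r in records:
        key = (r["video_id"], r["instance_id"])
        tracks[key][r["frame_idx"]] = r["bbox"]
        cat_of[key] = r["category_id"]

    annotations = []
    for ann_id, (key, by_frame) in enumerate(sorted(tracks.items()), 1):
        vid = key[0]
        n = frames_per_video[vid]
        annotations.append({
            "id": ann_id,
            "video_id": vid,
            "category_id": cat_of[key],
            "bboxes": [by_frame.get(i) for i in range(n)],
        })
    return {
        "annotations": annotations,
        "categories": [{"id": i, "name": c} for i, c in enumerate(categories, 1)],
    }
```

Once your ground truth is in this shape, the youtubevos-cocoapi evaluator can load it the same way pycocotools loads COCO annotations.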

Does my trained .pth file also need to be converted into a JSON file by test_vis.py?

Is the 'trained .pth file' you mention your model checkpoint?
You can use the checkpoint with test_vis.py to get the model's predictions on the target videos.

Yes, I mean: after training I get model_final.pth, then use test_vis.py with model_final.pth to get the JSON file, and then use the cocoapi example you just mentioned. Is that the process?

How to train CrossVIS on the YouTube-VIS 2019 dataset?