princeton-vl/RAFT-Stereo

Question about fine-tuning on Middlebury

excllent123 opened this issue · 2 comments

Hi,
Section 4.4 (Middlebury) of the paper says: "After pre-training on Sceneflow [23], we fine-tune on 384x1000 random crops of the 23 Middlebury training images for 4000 steps with a batch size of 2, using 22 update iterations during training."
But official_train.txt only contains 10 Middlebury training images.
Could you help me figure out which is right?

Thanks

We use the 23 images with ground truth in the Middlebury 2014 dataset: https://vision.middlebury.edu/stereo/data/scenes2014/

The official_train.txt file corresponds to the 10 evaluation training sets with GT.
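
For the crop setting quoted in the question (384x1000 random crops), here is a minimal sketch of what cropping a stereo sample might look like. It assumes the images and disparity map are NumPy arrays already loaded in memory. `random_crop_pair` is a hypothetical helper written for illustration, not the augmentation code in this repository.

```python
import numpy as np


def random_crop_pair(left, right, disp, crop_size=(384, 1000), rng=None):
    """Crop the same random window from a stereo pair and its disparity map.

    Hypothetical helper for illustration only. Using one window for both
    views keeps the disparity values valid, because both images are shifted
    by the same horizontal offset.
    """
    rng = np.random.default_rng() if rng is None else rng
    crop_h, crop_w = crop_size
    h, w = left.shape[:2]
    if h < crop_h or w < crop_w:
        raise ValueError(f"image {h}x{w} is smaller than crop {crop_h}x{crop_w}")

    # Pick the top-left corner so the window fits fully inside the image.
    y0 = int(rng.integers(0, h - crop_h + 1))
    x0 = int(rng.integers(0, w - crop_w + 1))
    window = (slice(y0, y0 + crop_h), slice(x0, x0 + crop_w))
    return left[window], right[window], disp[window]
```

Each training step would then draw a batch of such crops (batch size 2 in the paper's setting) from the 23 scenes.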

FYI I have just added a script to download all of the finetuning data and updated the README to specify exactly which command to run.

I'm closing this issue, feel free to reopen.