Questions about the provided checkpoints

Question

hellochick opened this issue 4 years ago · 2 comments

Hi,

Thank you very much for providing the source code, it's really awesome.

I have several questions about the provided checkpoints:

(1) For the provided dbbSep30-1206_1000000 checkpoint, the validation result I get differs from the score mentioned in the README (i.e., 2.07 / 4.07 for Sintel). I ran it on the validation set and got a score of 1.47 / 1.90 for Sintel Val.

There is also an inconsistency between the log file and the provided checkpoint, as the last line of dbbSep30-1206.log is correct.

I am guessing that this checkpoint was trained on the whole Sintel dataset. Am I correct?

(2) Following the guess in (1), I tried uploading the test results to see whether the dbbSep30-1206_1000000 checkpoint can reproduce the results reported in the paper and on the website. However, there is a gap: I got 4.877 / 3.182 on FINAL and CLEAN, respectively, while the paper and website report 4.38 / 2.77.

I would appreciate it if you could help clarify these questions and provide checkpoints that can reproduce the results.

Thank you for your time and consideration again!

Answer 1 · 2020-09-26T03:31:25.000Z

Hi hellochick, thanks for your interest in our work!

For your first question, I ran the dbbSep30-1206_1000000 checkpoint on the validation set again and still got 2.70/4.07. Did you use the train-validation split file we provided? You probably need to modify reader/sintel.py a little bit to set split_file to the correct Sintel_train_val_maskflownet.txt.

For your second question, as we discussed in our paper, the submission result was obtained by 1) training on the complete dataset (dbbSep30-1206_1000000 is trained on the train split only) and 2) averaging over multiple checkpoints (as the training is quite unstable).

I hope those explanations will help.

Answer 2 · 2020-09-26T04:37:07.000Z

Hi @simon1727 ,

Thank you for your quick reply. I have checked that the split file is correct.

And I just found the problem! Please refer to this issue here:
https://stackoverflow.com/questions/4813061/non-alphanumeric-list-order-from-os-listdir

It seems that os.listdir returns entries in an arbitrary order that can differ between filesystems, so the loop

for seq in os.listdir(os.path.join(path, part, subset)):

should iterate over a sorted listing instead:

for seq in sorted(os.listdir(os.path.join(path, part, subset))):

After doing this, I got the results as you reported in the paper.
I have created a PR to fix this issue.
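
For anyone hitting the same mismatch, here is a minimal, self-contained sketch of the idea. It is not the repository's reader code: the helper name list_sequences and the directory names are hypothetical, chosen only for illustration. Python's documentation states that os.listdir returns entries in arbitrary order, so any logic that depends on position (presumably the train/validation split here) needs an explicit sort.

```python
import os
import tempfile


def list_sequences(root):
    """Return the entries of `root` in a deterministic, sorted order.

    Hypothetical helper for illustration; not part of the repository.
    """
    # os.listdir makes no ordering guarantee, so sort explicitly.
    return sorted(os.listdir(root))


# Hypothetical sequence names, created in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    for name in ["market_2", "alley_1", "temple_3"]:
        os.mkdir(os.path.join(root, name))
    print(list_sequences(root))  # ['alley_1', 'market_2', 'temple_3']
```

With the sort in place, the listing is the same regardless of the order in which the directories were created or how the filesystem stores them.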

Thank you for your time, and I really appreciate your clarification.