
How do you organize the KITTI dataset?

jhuxiang opened this issue · 4 comments

Hi,

Could you provide more details?
Are you asking about how to split train/valid, or about the overall fine-tuning strategy?

Best,
Jun

az-ja commented

Dear Jun,

Thank you for the great work.
I have a question about fine-tuning the network on KITTI. Unfortunately, I could not find enough information in the referenced papers on fine-tuning on KITTI (the initial learning rate, the total number of iterations, and the exact iterations at which the learning rate should be halved).
The IRR paper mentions that the schedule of PWC-Net+ (paper: "Models Matter, So Does Training") is used, but I can find no details on the KITTI fine-tuning schedule there.
I also looked at the files you provide in the "scripts" folder. There are two files, kitti_train.sh and kitti_train_full.sh.
Could you please explain the difference between these two files? If I have a pre-trained model (Chairs + Things3D), which schedule should I use?
Assuming the right schedule for my case is kitti_train.sh, am I correct about the following?
For KITTI 2012 and KITTI 2015 combined, there are 394 (200 + 194) training samples.
The file specifies 1904 (2064 - 160) training epochs for fine-tuning the network on KITTI.
Then the number of training iterations is (1904 * 394) / (batch size = 4) = 187,544.
Is that correct?
I am just curious how exactly this schedule was set and from which reference the file kitti_train.sh was derived.
Sorry for the long question, and thank you for any hints.
With kind regards,
Azin

Hi,
We used a two-stage fine-tuning strategy.

  1. Dividing the combined KITTI 2012 and 2015 data into train/valid sets.
    KITTI 2012 (194 images) = train (155 images) + valid (39 images)
    KITTI 2015 (200 images) = train (160 images) + valid (40 images)
    So we now have a train (315 images) / valid (79 images) split.

  2. Finding the stopping point using "IRR-PWC_kitti_train.sh".
    "IRR-PWC_kitti_train.sh" finds a stopping point (or the number of iteration steps) using the train/valid split.
    We follow the exact learning rate schedule in Fig. 4 of https://arxiv.org/pdf/1809.05571.pdf, halving the lr at 45k, 65k, 85k, ... iteration steps.
    Those iteration steps are converted into epochs in the .sh file (see the sketch after this list).
    For example, the first halving point is at the 730th epoch:
    the # of iterations per epoch is (315 images / 4 batch size) = approx. 79,
    and (45k iteration steps) / (79 iters per epoch) + 160 = approx. 730, where 160 is the epoch count of the pre-trained Chairs -> Things checkpoint that fine-tuning resumes from.
    Then, by looking at the validation error, we can find the stopping epoch (or the # of iteration steps).

  3. Then, we fine-tune the network (trained on Chairs -> Things) for the number of epochs found above on all 394 KITTI images ("IRR-PWC_kitti_train_full.sh").
    Here we need to calculate the corresponding number of epochs again, because the # of training images is different.
    Say the stopping epoch found above was the 830th; then the corresponding epoch for 394 training images is:
    (830 - 160) * (79 iters per epoch) / (99 iters per epoch) + 160 = 694.6... = approx. 695.
    Here, 99 comes from (394 images / 4 batch size = 98.5), rounded up.
    Of course, the same LR schedule was used.
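
For concreteness, here is a minimal Python sketch of the epoch arithmetic above. It is not code from the repository; the function names are made up for illustration, and the constants (the 160-epoch checkpoint offset, batch size 4, and the 45k/65k/85k halving points) are taken from this thread and Fig. 4 of the PWC-Net+ paper.

```python
import math

# Assumed constants, taken from the thread (illustrative, not repo code):
CHECKPOINT_EPOCH = 160   # fine-tuning resumes from the Chairs -> Things checkpoint
BATCH_SIZE = 4

def iters_per_epoch(num_images):
    # Rounded up, as in the thread: 315 / 4 = 78.75 -> 79, 394 / 4 = 98.5 -> 99.
    return math.ceil(num_images / BATCH_SIZE)

def iter_step_to_epoch(iteration, num_images):
    # Convert an absolute iteration step into an absolute epoch number.
    return round(iteration / iters_per_epoch(num_images) + CHECKPOINT_EPOCH)

def convert_stopping_epoch(epoch, images_from, images_to):
    # Map a stopping epoch found on one split onto a differently sized split,
    # keeping the number of fine-tuning iterations roughly the same.
    finetune_epochs = epoch - CHECKPOINT_EPOCH
    scaled = finetune_epochs * iters_per_epoch(images_from) / iters_per_epoch(images_to)
    return round(scaled + CHECKPOINT_EPOCH)

# LR-halving points on the 315-image train split -> [730, 983, 1236]
print([iter_step_to_epoch(i, 315) for i in (45_000, 65_000, 85_000)])

# Stopping epoch 830 on the 315-image split -> epoch 695 on all 394 images
print(convert_stopping_epoch(830, images_from=315, images_to=394))
```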

Hope this answers your question!

Best,
Jun

az-ja commented

Dear Jun,

Thank you very much for your time and the amazing answer!
Now everything is clear. The problem was that the paper you mentioned only explains that schedule explicitly for Sintel (which has two phases: first fine-tuning on clean + final together, then on final only). That's why I was confused about what exactly to do for KITTI; I thought they had perhaps done something different, since no separate fine-tuning phases for KITTI are described.
Thank you very much again.
My issue is solved :)
With kind regards,
Azin