princeton-vl/RAFT-Stereo

Training settings for RAFT-stereo realtime

dilinwang820 opened this issue · 7 comments

Hi there, may I ask how many training iterations were used for RAFT-Stereo realtime? Thank you!

We trained RAFT-Stereo realtime for 200,000 iterations.

Hi @lahavlipson sorry, I was not clear before. I was referring to the number of GRU updates here - https://github.com/princeton-vl/RAFT-Stereo/blob/main/core/raft_stereo.py#L70

As mentioned in the README, the corresponding valid_iters is 7; I am wondering whether train_iters was also 7. Thanks!

train_iters was also set to 7 for this model.
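
For context, train_iters is the number of GRU refinement steps unrolled in the forward pass (the loop linked above), and the training loss is applied to every intermediate prediction with exponentially increasing weight. Below is a minimal RAFT-style sketch of that idea in PyTorch; the repo's actual sequence_loss may differ in details such as gamma rescaling and validity masking:

import torch

# Illustrative sketch only -- not the repo's exact implementation.
# Later iterations get weights closer to 1.0 (gamma < 1), so the final
# prediction dominates the loss.
def sequence_loss(disp_preds, disp_gt, valid, loss_gamma=0.9):
    n = len(disp_preds)                      # == train_iters (7 here)
    total = 0.0
    for i, pred in enumerate(disp_preds):
        weight = loss_gamma ** (n - i - 1)   # weight 1.0 on the last prediction
        total = total + weight * (valid * (pred - disp_gt).abs()).mean()
    return total

# Toy usage: 7 refinement outputs for a 1x1xHxW disparity map.
gt = torch.randn(1, 1, 32, 64)
preds = [gt + torch.randn_like(gt) * (0.5 ** i) for i in range(7)]
loss = sequence_loss(preds, gt, valid=torch.ones_like(gt))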

Hi @lahavlipson, I was able to reproduce your "raftstereo-sceneflow.pth" checkpoint with only minor differences.
However, there is a small gap between the realtime model I trained and your raftstereo-realtime.pth checkpoint.

Specifically, I constructed the realtime model per your suggestion above:

# Model config (RAFT-Stereo realtime)
model_args = dict(
    hidden_dims=[128] * 3,   # GRU hidden-state dims
    shared_backbone=True,    # share the feature/context encoder (realtime variant)
    corr_levels=4,           # correlation pyramid levels
    corr_radius=4,           # correlation lookup radius
    n_downsample=3,          # disparity field at 1/8 resolution (vs. 1/4 by default)
    slow_fast_gru=True,      # iterate the low-res GRUs more frequently
    n_gru_layers=2,          # 2 GRU levels instead of 3
    train_iters=7,
    valid_iters=7,
    freeze_bn=False,
)
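
For completeness, one way to turn these flags into a model instance; the import path follows the repo, but the glue code and the extra corr_implementation / mixed_precision attributes are my assumptions, so check them against the argparse flags in train_stereo.py:

from argparse import Namespace
from core.raft_stereo import RAFTStereo  # module path as in the repo

# Hypothetical glue code: RAFTStereo takes the parsed argparse namespace,
# so pack the config above into one. Attributes not listed in the config
# above are assumptions and may need adjusting.
args = Namespace(**model_args, corr_implementation="reg", mixed_precision=False)
model = RAFTStereo(args)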

Training is set up the same as the standard RAFT-Stereo recipe: total batch size 8 across 2 GPUs, trained only on the SceneFlow dataset.

# Optimizer and LR settings (mm-style config)
max_iter = 200000
lr = 1e-4
optimizer = dict(type="AdamW", lr=lr, weight_decay=0.00001)
optimizer_config = dict(grad_clip=dict(max_norm=1))  # clip grad norm to 1
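
In case the mm-style config above is unfamiliar, here is a plain-PyTorch sketch of what it amounts to, reusing args/RAFTStereo from the sketch above; the DataParallel wrapping mirrors how the official train_stereo.py appears to split the batch across GPUs, but treat the exact wiring as an assumption:

import torch
import torch.nn as nn

model = nn.DataParallel(RAFTStereo(args))  # total batch 8 split across 2 GPUs
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)

# optimizer_config = dict(grad_clip=dict(max_norm=1)) corresponds to clipping
# after loss.backward() and before optimizer.step():
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)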

Anything I might be missing? Any suggestions are greatly appreciated!

Method                    KITTI-15 (3px)  Middlebury-F (2px)  Middlebury-H (2px)  Middlebury-Q (2px)  ETH3D (1px)
raftstereo-realtime ckpt  5.666           18.005              11.364              8.977               5.751
reproduced                6.249           17.354              11.492              9.846               5.725

Your settings seem fine; the evaluation datasets have few images or fairly sparse ground truth, so fluctuations in performance between runs are pretty normal (ETH3D, for example, has only a couple dozen two-view training pairs, so a single hard image can move the average noticeably).

Thank you for confirming!


Hi @dilinwang820,
I cannot reproduce the realtime model's results; mine are much worse than both yours and the released checkpoint (my ETH3D D1 is 6.61 and my FlyingThings3D 'things' D1 is 15). I noticed you left batch norm unfrozen (freeze_bn=False); could that be the key difference? Also, about the learning rate: I found the scheduler's max_lr=2e-4 with the default div_factor=25 and final_div_factor=1e4. Did you use those settings, or did you just use lr=1e-4, held constant for the whole training run?
Thank you!
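
For reference, the div_factor=25 and final_div_factor=1e4 defaults in this question are PyTorch's OneCycleLR defaults. A minimal, self-contained sketch of that schedule, assuming the official training script follows the usual RAFT-style OneCycleLR setup (pct_start and anneal_strategy below are assumptions): with max_lr=2e-4 the learning rate ramps up from max_lr/25 = 8e-6 to 2e-4 and then anneals back down, which is quite different from a constant lr=1e-4.

import torch
from torch import optim

net = torch.nn.Linear(1, 1)  # stand-in model, just to build an optimizer
optimizer = optim.AdamW(net.parameters(), lr=2e-4, weight_decay=1e-5)
scheduler = optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=2e-4, total_steps=200000,
    pct_start=0.01, cycle_momentum=False, anneal_strategy="linear")

for step in range(3):        # print the first few scheduled learning rates
    optimizer.step()
    scheduler.step()
    print(step, scheduler.get_last_lr())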