AndreaCodegoni/Tiny_model_4_CD

Reproducing results on LEVIR-CD+

Opened this issue · 5 comments

I'm attempting to reproduce the paper results on LEVIR-CD+. I note you trained for 100 epochs and apparently didn't use early stopping. Based on my training run, learning appears to plateau after 30-40 epochs. Is this what you saw?

[screenshots: training curves]

At 100 epochs the metrics are short of those in the paper:

  'test_f1': 0.75
  'test_iou': 0.60

Could there be a difference between the dataset used in the paper and this version?

Hello @robmarkcole.

As you pointed out, the dataset I used in my experiments was LEVIR-CD, which has 637 image pairs. LEVIR-CD+ is a different dataset with 985 image pairs.
Considering the difference in size between the two datasets, I would expect fewer epochs to be needed for the model to converge without overfitting. In fact, your metrics already look good after 30-40 epochs. I have never experimented on LEVIR-CD+, but you can probably tune the optimizer parameters for better performance.
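
If you want training to stop automatically once the metrics plateau, and assuming you are training with PyTorch Lightning (the metric dicts above look like Lightning output), a minimal early-stopping setup could look like the sketch below; the monitored metric name val_f1 is only an assumption:

    from pytorch_lightning import Trainer
    from pytorch_lightning.callbacks import EarlyStopping

    # Stop once validation F1 has not improved for 10 epochs.
    # "val_f1" is an assumed metric name: use whatever your
    # LightningModule actually logs during validation.
    early_stop = EarlyStopping(monitor="val_f1", mode="max", patience=10)

    trainer = Trainer(max_epochs=100, callbacks=[early_stop])
    # trainer.fit(model, datamodule=dm)  # model / dm defined elsewhere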

I see something interesting in the train_loss graph: there is a spike, which could indicate some catastrophic forgetting of the weights pre-trained on ImageNet. It might be interesting to train without the ImageNet weights to see whether this phenomenon disappears and whether you can still reach the same performance given a sufficient number of epochs. If you give it a try, let me know how it goes :)

@AndreaCodegoni at your suggestion I have run training with:

    pretrained=False,
    freeze_backbone=False,

The logs look similar:

[screenshots: training curves]

And the test metrics are still poor:

  'test_f1': 0.6987901329994202,
  'test_iou': 0.5370310544967651

The model rarely predicts the change class:
[screenshot: example predictions]

I've noticed that my normalisation is slightly different: rather than the band-wise normalisation you performed, I am normalising to the 0-1 range. However, since I'm not using ImageNet weights, this shouldn't be an issue, should it?

Is TinyCD particularly sensitive to the learning rate and other hyperparameters?

Given the results without pre-training, I would go back to using the ImageNet weights and also try normalizing with the ImageNet mean and standard deviation (not just rescaling between 0 and 1).
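
For example, with torchvision that would be (these are the standard ImageNet statistics, not values specific to this repo):

    from torchvision import transforms

    # ToTensor() rescales to [0, 1]; Normalize then applies the standard
    # ImageNet channel statistics expected by the pretrained backbone.
    imagenet_norm = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])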

As for hyperparameters such as the learning rate, the answer is yes. As reported in the paper and in the code, optimizing these parameters led to improvements: in the original work they gained a few points this way. If you have the opportunity, I recommend you try.
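
As a purely illustrative sketch (the values below are placeholders, not the tuned hyperparameters from the paper, and model is assumed to be your TinyCD instance), a small grid search could look like:

    import itertools
    import torch

    # Placeholder grid -- not the values from the paper. For each
    # combination, train for a few epochs and keep the configuration
    # with the best validation F1.
    for lr, weight_decay in itertools.product([1e-4, 3e-4, 1e-3], [1e-4, 1e-2]):
        optimizer = torch.optim.AdamW(model.parameters(), lr=lr,
                                      weight_decay=weight_decay)
        # ... short training run here, record validation metrics ...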

Having validated my training code in #18, and now performing ImageNet normalisation, I have re-run the training for 100 epochs. The metrics are slightly worse this time:

  'test_f1': 0.705
  'test_iou': 0.544
[screenshot: training curves]

Some predictions are encouraging:
[screenshot: example predictions]

Since this is a larger dataset, it may require more epochs, as well as hyperparameter optimisation as you suggest.

Hi @robmarkcole, sorry for the late reply.

It looks to me like the training loss is still quite unstable.
On larger datasets, TinyCD probably needs to be trained with some more care, as you are suggesting.

I don't know what the distribution of images in the dataset is like, i.e. how many pairs with changes there are vs. pairs without changes, but that could be a factor too. Maybe you can try training with a larger batch size, or even build batches with a fixed ratio of pairs with and without changes.
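
A rough sketch of the second idea with a plain PyTorch sampler; train_dataset and the per-pair change flags are assumptions about your setup, not code from this repo:

    import torch
    from torch.utils.data import DataLoader, WeightedRandomSampler

    # Hypothetical per-pair flags: 1 if the change mask of a training pair
    # contains any changed pixels, 0 otherwise. Assumes the dataset yields
    # (image_t1, image_t2, mask) tuples.
    has_change = torch.tensor([int(mask.any()) for _, _, mask in train_dataset])

    # Weight each pair inversely to its class frequency, so sampled batches
    # contain a roughly even mix of change / no-change pairs.
    class_counts = torch.bincount(has_change, minlength=2).float()
    weights = 1.0 / class_counts[has_change]

    sampler = WeightedRandomSampler(weights, num_samples=len(weights),
                                    replacement=True)
    loader = DataLoader(train_dataset, batch_size=16, sampler=sampler)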