Googolxx/STF

Getting weird results with STF, need help!

Closed this issue · 6 comments

Hi!
Thank you for the great work!

I'm having some difficulty reproducing the results on other datasets, though... and I could really use your help!

I trained STF (the transformer version) on the Waymo & BDD100K open-source datasets (images from the front camera of an autonomous vehicle, 180,000 images in total), for almost the same lambda values [0.0009, 0.0018, 0.0035, 0.0067, 0.013, 0.025, 0.0483], using the MSE loss for 200 epochs each. In the Weights & Biases app, the training curves look fine (see the graph below), don't they?
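(For context, I'm training with the compressai-style rate-distortion objective that STF's train.py defines, i.e. loss = lambda * 255² * MSE + bpp. A minimal sketch from memory, so treat the details as approximate:)

```python
import math
import torch
import torch.nn as nn

class RateDistortionLoss(nn.Module):
    """Compressai-style RD loss: loss = lmbda * 255^2 * MSE + bpp."""

    def __init__(self, lmbda=0.0035):
        super().__init__()
        self.mse = nn.MSELoss()
        self.lmbda = lmbda

    def forward(self, output, target):
        N, _, H, W = target.size()
        num_pixels = N * H * W
        # Rate term: bits per pixel, summed over all latent likelihoods.
        bpp_loss = sum(
            torch.log(lk).sum() / (-math.log(2) * num_pixels)
            for lk in output["likelihoods"].values()
        )
        mse_loss = self.mse(output["x_hat"], target)
        return self.lmbda * 255**2 * mse_loss + bpp_loss
```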

[Screenshot: W&B training loss curves]

But when I test my new weights on Waymo, I get weird results (see the graph below):

- on cropped (256, 256) images (just like the validation done during training), the results make sense (green curve);
- on full-size images (with the appropriate padding, just like your compressai.utils.eval_model code; see the sketch after this list), the results don't make sense (red curve);
- the black curve shows the results I get on Waymo using the weights you generously provide, trained on OpenImages with lambda = [0.0035, 0.025].
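For reference, the full-size evaluation pads each image to a multiple of 64 and crops back after decoding, mirroring compressai.utils.eval_model. A minimal sketch, written from memory, so the exact helper names are mine:

```python
import torch.nn.functional as F

def pad(x, p=64):
    # Pad H and W up to the next multiple of p so the strided
    # encoder/decoder see compatible dimensions (as in
    # compressai.utils.eval_model).
    h, w = x.size(2), x.size(3)
    new_h = (h + p - 1) // p * p
    new_w = (w + p - 1) // p * p
    pad_left = (new_w - w) // 2
    pad_right = new_w - w - pad_left
    pad_top = (new_h - h) // 2
    pad_bottom = new_h - h - pad_top
    return F.pad(x, (pad_left, pad_right, pad_top, pad_bottom),
                 mode="constant", value=0)

def crop(x, size):
    # Crop the decoded image back to the original (h, w).
    h, w = size
    H, W = x.size(2), x.size(3)
    pad_left = (W - w) // 2
    pad_top = (H - h) // 2
    return x[..., pad_top:pad_top + h, pad_left:pad_left + w]
```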

I have run these tests multiple times and I always get the same strange results...

[Screenshot: rate-distortion curves on Waymo]

Do you have any idea where the bug could come from? Is it the training data? Why does the cropping change the results?

I would be extremely grateful for your help!

Hi!
The loss curves look fine during training, but you get weird results on 1920x1280 images at test time, especially at low bpp.
I have not tried specific data domains like Waymo or BDD100K, but intuitively I would guess there is a domain gap between training (256x256 crops) and testing (1920x1280 images in Waymo), since all the images are shot from a fixed camera angle.
How about resizing the original images to a lower resolution during training? And have you tested on other datasets?

To add one detail: the training data is randomly cropped to 256x256, and the validation data is center-cropped to 256x256.
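In train.py terms that corresponds to roughly the following (sketch with a hard-coded 256; the repo actually uses args.patch_size):

```python
from torchvision import transforms

# Random 256x256 crops for training...
train_transforms = transforms.Compose(
    [transforms.RandomCrop(256), transforms.ToTensor()]
)
# ...and deterministic center crops for validation.
val_transforms = transforms.Compose(
    [transforms.CenterCrop(256), transforms.ToTensor()]
)
```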

Thanks for answering!

I have also tested on the Kodak dataset (but it only has 24 images...):

  • black curve: your open-source weights (trained on OpenImages), evaluated on Waymo
  • red curve: my weights (trained on Waymo+BDD100K), evaluated on Waymo
  • blue curve: your Kodak results from your paper
  • gray curve: my weights (trained on Waymo+BDD100K), evaluated on Kodak

[Screenshot: rate-distortion curves on Waymo and Kodak]

By "trying to resize the original images to a low resolution while training" you mean resize Waymo&BDD100K images to, lets say 480x320 for Waymo & 640x360 for BDD100K, instead of RandomCrop(256) while training ?

That is, applying transforms like:

```python
train_transforms = transforms.Compose(
    [transforms.Resize([960, 640]), transforms.RandomCrop(256), transforms.ToTensor()]
)
```

instead of the current STF/train.py, lines 277 to 279 (commit 0e24804):

```python
train_transforms = transforms.Compose(
    [transforms.RandomCrop(args.patch_size), transforms.ToTensor()]
)
```
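One detail I need to double-check if I go this route: torchvision's transforms.Resize takes a [height, width] pair, so for 1920x1280 (width x height) Waymo frames a half-resolution resize would be Resize([640, 960]). A minimal sketch, with the sizes only as examples:

```python
from torchvision import transforms

# Downscale before cropping so each 256x256 patch covers more of the scene.
# Resize takes [height, width]: [640, 960] halves a 1280x1920 (HxW) frame.
train_transforms = transforms.Compose(
    [
        transforms.Resize([640, 960]),
        transforms.RandomCrop(256),
        transforms.ToTensor(),
    ]
)
```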

OK, I'll try that. Thanks!

ld-xy commented

Can you try converting it to ONNX? Thanks!
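(If anyone wants to try, here is a minimal export sketch. The wrapper, file name, and input size are mine, `model` is assumed to be a loaded STF checkpoint, and whether the Swin attention and entropy-model ops actually trace cleanly is untested:)

```python
import torch
import torch.nn as nn

class STFWrapper(nn.Module):
    # Wrap the model so ONNX export sees a plain tensor output instead of
    # the dict that compressai-style models return from forward().
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        return self.model(x)["x_hat"]

model.eval()  # `model` is assumed to be a loaded STF model
dummy = torch.randn(1, 3, 256, 256)
torch.onnx.export(
    STFWrapper(model), dummy, "stf.onnx",
    opset_version=13, input_names=["x"], output_names=["x_hat"],
)
```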