jmliu206/LIC_TCM

The problem of training

Closed this issue · 4 comments

Is "MSE LOSS is 0.001 or 0.000 from epoch 0 and 16000/183885 (9%)" right?

My training set has 180k+ images and my test set has 50+.

When lambda is set to 0.05:
Test epoch 0: Average losses: Loss: 2.970 | MSE loss: 0.001 | Bpp loss: 0.79 | Aux loss: 37.53
Test epoch 1: Average losses: Loss: 2.344 | MSE loss: 0.000 | Bpp loss: 0.74 | Aux loss: 8.57

Is this situation reasonable?

Thanks a lot!

I think this is reasonable: only three decimal places are shown here, so a small nonzero MSE rounds to 0.001 or 0.000. You can modify the code to display more digits.
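As a sanity check, assuming the standard CompressAI rate-distortion loss (Loss = λ·255²·MSE + Bpp), the epoch 1 numbers give MSE ≈ (2.344 − 0.74) / (0.05·255²) ≈ 5e-4, which indeed rounds to 0.000. Below is a minimal logging sketch with more decimal places; the function name and its arguments are hypothetical stand-ins for the averaged meters in the actual test loop:

```python
def log_test_losses(epoch, loss, mse_loss, bpp_loss, aux_loss):
    """Print averaged test losses; hypothetical helper, not the repo's code."""
    print(
        f"Test epoch {epoch}: Average losses: "
        f"Loss: {loss:.4f} | "
        f"MSE loss: {mse_loss:.6f} | "  # six decimal places instead of three
        f"Bpp loss: {bpp_loss:.4f} | "
        f"Aux loss: {aux_loss:.2f}"
    )

log_test_losses(1, 2.344, 0.000493, 0.74, 8.57)
# Test epoch 1: Average losses: Loss: 2.3440 | MSE loss: 0.000493 | Bpp loss: 0.7400 | Aux loss: 8.57
```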

Thanks a lot!

I have another question:

I also encountered the error "super().__init__(entropy_bottleneck_channels=N)
TypeError: __init__() got an unexpected keyword argument 'entropy_bottleneck_channels'", and I have seen that the suggested solution is to use compressai version 1.2.0.
But I instead changed the code to "super().__init__()" rather than "super().__init__(entropy_bottleneck_channels=N)", and my compressai version is still 1.2.4.
I also found that the training time for one epoch is around 7 hours on a single 4090. Is there an efficient way to save time while keeping the model's accuracy?
I wonder if I am handling this problem incorrectly.

Thanks a lot!!

You are correct. In version 1.2.4, you can directly use super().__init__(). However, I am not quite sure whether this affects the final result, as I have not run experiments on that version. Intuitively, it should not have an impact.
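For anyone hitting the same TypeError, here is a minimal sketch of a version-compatible constructor. The class name MyModel and N=128 are placeholders; registering the EntropyBottleneck explicitly mirrors how recent CompressAI models are written, but check your installed version's API before relying on it:

```python
from compressai.entropy_models import EntropyBottleneck
from compressai.models import CompressionModel

class MyModel(CompressionModel):  # placeholder for the TCM model class
    def __init__(self, N=128):
        # compressai 1.2.0 style (raises TypeError on 1.2.4):
        #   super().__init__(entropy_bottleneck_channels=N)
        # compressai 1.2.4 style: bare call, then register the
        # entropy bottleneck on the module yourself.
        super().__init__()
        self.entropy_bottleneck = EntropyBottleneck(N)
```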

Regarding the training time, you may try the following methods:

  1. You can first train a model at a higher bitrate and then fine-tune it to obtain models for other bitrate points (starting either from that model or from the pre-trained model I provided). This will speed up convergence; see the sketch after this list.

  2. Set a smaller N to reduce the model size. At low bitrates, a smaller N can still achieve satisfactory results.
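A minimal sketch of method 1, reusing the MyModel placeholder from the earlier snippet; the checkpoint filename, the "state_dict" key, the learning rate, and the lambda value are all assumptions to adapt to the actual training script:

```python
import torch

# Method 1 (hypothetical setup): load a higher-bitrate checkpoint and
# fine-tune it toward a lower rate point.
net = MyModel(N=128)  # N must match the source checkpoint's architecture
ckpt = torch.load("checkpoint_lambda0.05.pth.tar", map_location="cpu")
net.load_state_dict(ckpt["state_dict"])

# A reduced learning rate is typical for fine-tuning; the new lambda
# targets a lower bitrate than the source model was trained for.
optimizer = torch.optim.Adam(net.parameters(), lr=1e-5)
lmbda = 0.013  # hypothetical lower rate-distortion weight

# Method 2: for low-bitrate points, a smaller model can be trained
# from scratch instead, e.g. MyModel(N=64), cutting per-epoch time.
```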

Thanks again!