the size of datasets of video_encoder

Question

Closed this issue 9 months ago · 3 comments

Hello, when I train with the provided script on 1920*1080 YUV 4:2:0 video data from the CIFAR dataset, nn_bpp stays at zero. The loss function then remains constant and training does not converge.

Answer 1 · 2024-04-07T13:24:55.000Z

Sir, this picture shows the printed values of the predicted frame in the code. It is a screenshot taken while training on a YUV420 video with 1920*1080 resolution. Every value in the predicted frame is 1, which makes the MSE calculation wrong and prevents the loss from converging. Sorry to bother you, but I can't reproduce the results from your original article.
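A symptom like this can be confirmed by checking whether the predicted frame is constant and by computing its MSE against the target. The sketch below uses plain Python lists and hypothetical helper names (`is_constant`, `mse`); it is not taken from the repository, and the [0, 1] sample range is an assumption.

```python
def is_constant(frame):
    """Return True if every sample in a 2D frame has the same value."""
    first = frame[0][0]
    return all(sample == first for row in frame for sample in row)


def mse(pred, target):
    """Mean squared error between two 2D frames of identical shape."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy 2x3 frames; samples are assumed to be normalized to [0, 1].
predicted = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # all ones, as in the screenshot
target = [[0.5, 1.0, 0.0], [1.0, 0.5, 1.0]]

print(is_constant(predicted))   # True
print(mse(predicted, target))   # 0.25
```

If the prediction is constant from the first iteration, the input data or its loading is a more likely suspect than the optimizer.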

Answer 2 · 2024-04-08T10:53:05.000Z

Hello,

nn_bpp refers to the rate associated with the different neural networks. As explained in the COOL-CHIC paper, it is not optimized by the training process, unlike rate_latent_bpp, the rate associated with the latent variables, so a constant nn_bpp should not be an issue.
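To illustrate the point, here is a minimal sketch of a rate-distortion objective in which only the latent rate enters the loss. The function names and the `lmbda` parameter are hypothetical and chosen for illustration; this is not the actual Cool-chic code.

```python
def rd_loss(mse, rate_latent_bpp, lmbda):
    """Training objective: distortion plus lambda-weighted latent rate.

    nn_bpp is deliberately absent: it is not optimized during training.
    """
    return mse + lmbda * rate_latent_bpp


def total_rate_bpp(rate_latent_bpp, nn_bpp):
    """Rate reported for the encoded sequence: latents plus networks."""
    return rate_latent_bpp + nn_bpp


# The loss is the same whatever value nn_bpp takes.
loss = rd_loss(mse=0.002, rate_latent_bpp=0.5, lmbda=0.001)
rate_with_zero_nn = total_rate_bpp(rate_latent_bpp=0.5, nn_bpp=0.0)
rate_with_some_nn = total_rate_bpp(rate_latent_bpp=0.5, nn_bpp=0.25)

print(loss)
print(rate_with_zero_nn, rate_with_some_nn)
```

In this sketch a zero nn_bpp changes the reported total rate but leaves the training loss untouched, so it cannot by itself explain a loss that fails to converge.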

Please attach the sequence you'd like to encode and the command line you're using, and I'll try it on my side to see what's wrong.

Théo

Answer 3 · 2024-04-08T11:05:28.000Z

Thank you for your answer. This problem has been solved. Sorry to disturb you during the break.