dyelax/Adversarial_Video_Generation

avg_runner.py stops after few iterations

Opened this issue · 3 comments

I'm trying to use your code, but I get a strange error from avg_runner.py after a few iterations.
It seems to have an issue with the size of some images.
That is very unlikely, however, as I exported all the images from a video with ffmpeg.

Any idea why it breaks?

gino:Code Lorenzo$ python2.7 avg_runner.py -n Test -s 1000 --model_save_freq=1000 --test_freq=1000
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
Init discriminator...
Init generator...
Init variables...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
DiscriminatorModel: step 10 | global loss: 0.603209
Training generator...
GeneratorModel : Step  10
                 Global Loss    :  540.297
                 PSNR Error     :  5.45025
                 Sharpdiff Error:  4.7847
Training discriminator...
Training generator...
Training discriminator...
Training generator...
Training discriminator...
Traceback (most recent call last):
  File "avg_runner.py", line 185, in <module>
    main()
  File "avg_runner.py", line 181, in main
    runner.train()
  File "avg_runner.py", line 70, in train
    self.d_model.train_step(batch, self.g_model)
  File "/Users/Lorenzo/development/1-frame-in-the-future/Code/d_model.py", line 171, in train_step
    feed_dict = self.build_feed_dict(input_frames, gt_output_frames, generator)
  File "/Users/Lorenzo/development/1-frame-in-the-future/Code/d_model.py", line 132, in build_feed_dict
    resized_frame = resize(sknorm_img, [scale_net.height, scale_net.width, 3])
  File "/usr/local/lib/python2.7/site-packages/skimage/transform/_warps.py", line 119, in resize
    preserve_range=preserve_range)
  File "/usr/local/lib/python2.7/site-packages/skimage/transform/_geometric.py", line 1296, in warp
    image = _convert_warp_input(image, preserve_range)
  File "/usr/local/lib/python2.7/site-packages/skimage/transform/_geometric.py", line 1108, in _convert_warp_input
    image = img_as_float(image)
  File "/usr/local/lib/python2.7/site-packages/skimage/util/dtype.py", line 301, in img_as_float
    return convert(image, np.float64, force_copy)
  File "/usr/local/lib/python2.7/site-packages/skimage/util/dtype.py", line 205, in convert
    raise ValueError("Images of type float must be between -1 and 1.")

Hi Lorenzo,

Does this happen consistently on a specific frame? Have you tried looking at the frame values when it breaks?
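One way to check, as a rough sketch: the traceback ends in skimage's "Images of type float must be between -1 and 1" check, so it may help to log the value range of whatever is passed to resize. The helper name `find_out_of_range_frames` and the sample batch below are hypothetical and not part of the repo.

```python
import numpy as np

def find_out_of_range_frames(frames, low=-1.0, high=1.0):
    """Return (index, min, max) for every frame with a value outside [low, high].

    Frames containing NaN or infinite values are reported as well.
    """
    bad = []
    for i, frame in enumerate(frames):
        frame = np.asarray(frame, dtype=np.float64)
        fmin, fmax = frame.min(), frame.max()
        if fmin < low or fmax > high or not np.isfinite(frame).all():
            bad.append((i, float(fmin), float(fmax)))
    return bad

# Hypothetical batch of three 4x4 RGB frames; frame 1 overshoots the range.
batch = np.zeros((3, 4, 4, 3))
batch[1, 0, 0, 0] = 1.2
print(find_out_of_range_frames(batch))
```

Calling this on the normalized images right before the resize call in d_model.py should show whether the failure is tied to particular frames or to values drifting slightly past the limit.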

It actually seems to happen on multiple frames.
There are probably some issues with the image normalization procedure.

I temporarily solved it by getting rid of the +0.5 / -0.5 shift in d_model.py:

# Original code: map img (nominally in [-1, 1]) to [0, 1], resize, then map back.
# sknorm_img = (img / 2) + 0.5
# resized_frame = resize(sknorm_img, [scale_net.height, scale_net.width, 3])
# scaled_gt_output_frames[i] = (resized_frame - 0.5) * 2

# Workaround: only halve the values before resizing, then double them again.
sknorm_img = img / 2
resized_frame = resize(sknorm_img, [scale_net.height, scale_net.width, 3])
scaled_gt_output_frames[i] = resized_frame * 2

This seems to have solved the issue and I managed to train and test the network.
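The workaround above passes skimage's range check only because halving leaves more headroom. An alternative that keeps the original [0, 1] mapping, sketched here under the assumption that the failure comes from values slightly outside [-1, 1], is to clip before normalizing. The function names are hypothetical, the skimage resize call that would sit between the two steps is left out so the snippet depends only on NumPy, and none of this has been tested against the repo.

```python
import numpy as np

def to_unit_range(img):
    """Map a frame from the nominal [-1, 1] range to [0, 1], clipping overshoot."""
    img = np.clip(np.asarray(img, dtype=np.float64), -1.0, 1.0)
    return img / 2 + 0.5

def from_unit_range(img):
    """Inverse mapping: [0, 1] back to [-1, 1]."""
    return (np.asarray(img, dtype=np.float64) - 0.5) * 2

# Hypothetical pixel values; the last one overshoots the expected range.
frame = np.array([-1.0, 0.0, 1.0, 1.3])
normalized = to_unit_range(frame)       # resize(...) would be applied to this
restored = from_unit_range(normalized)
print(normalized, restored)
```

Clipping discards the overshoot instead of preserving it, so it is only appropriate if the out-of-range values are small numerical artifacts and not a sign of a deeper normalization bug.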

However, I'm not sure whether my dataset has been prepared correctly.
I trained the network on a video of me biking through my neighbourhood;
from the video clips I take 3 images per second and use them for training.

This is the result after just 1000 steps:
[image: video prediction]

It seems to me that the generated frames are far closer to the 4th input frame than to the expected target image.
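That impression could be quantified with a quick comparison, sketched below: compute PSNR between the generated frame and the last input frame, and between the generated frame and the target. This is a generic PSNR assuming pixel values in [-1, 1] (dynamic range 2.0); it is not necessarily the same definition the repo uses for its "PSNR Error", and the function names and sample frames are hypothetical.

```python
import numpy as np

def psnr(a, b, max_val=2.0):
    """PSNR in dB between two frames; max_val is the dynamic range (2.0 for [-1, 1])."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))

def closer_to_last_input(generated, last_input, target):
    """True if the generated frame matches the last input better than the target."""
    return bool(psnr(generated, last_input) > psnr(generated, target))

# Hypothetical 8x8 RGB frames with constant values, for illustration only.
target = np.zeros((8, 8, 3))
last_input = np.full((8, 8, 3), 0.5)
generated = np.full((8, 8, 3), 0.4)
print(closer_to_last_input(generated, last_input, target))
```

If this returns True for most test sequences, the generator is mostly copying its last input, which is a common early-training behaviour for next-frame predictors.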

Do you think the frames of the input sequence are too far apart, or should I just let the network train for much longer?

I'd really appreciate your feedback on this.

I want to train the model with my own data, but it didn't succeed. May I ask how you trained this model with your data?