dyelax/Adversarial_Video_Generation

What do the output images represent?



I have been able to run this code successfully; excellent tutorial, by the way. I have a few questions, if I may. I have been saving the output images after every 10 steps. The output image folder contains 8 folders, numbered 0 to 7. I wanted to know what exactly these folders represent. Will the model generate outputs for all of the images given in the training folder? I'm not able to understand what exactly these are.


@AdarshMJ Each of those eight folders holds the images for a single sample (i.e., the four input images plus the generated and ground-truth next frames, at each scale, for those inputs).

But the input samples are around 200 images. So for all 200 images, is it only going to generate 8 output folders?

It just saves each sample in the test batch (which is size 8 in this case) to its own directory. Look at `test_batch` in `g_model.py` and where it is called in `avg_runner.py`.
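To make the layout concrete, here is a rough sketch of that kind of save loop. This is illustrative only, not the repo's exact code: the function name, file names, and array shapes are assumptions, and the per-scale outputs are omitted for brevity.

```python
# Illustrative sketch -- shows how each sample in a test batch of size 8
# could end up in its own numbered directory (scales omitted for brevity).
import os
import numpy as np
from PIL import Image

def save_test_batch(input_frames, gen_frame, gt_frame, save_dir, step):
    # input_frames: [batch_size, num_inputs, H, W, C], values in [0, 1]
    # gen_frame, gt_frame: [batch_size, H, W, C], values in [0, 1]
    batch_size = input_frames.shape[0]
    for i in range(batch_size):  # one directory per sample: 0, 1, ..., 7
        sample_dir = os.path.join(save_dir, 'Step_%d' % step, str(i))
        os.makedirs(sample_dir, exist_ok=True)
        for j in range(input_frames.shape[1]):  # the 4 input frames
            Image.fromarray(np.uint8(input_frames[i, j] * 255)).save(
                os.path.join(sample_dir, 'input_%d.png' % j))
        # the generated next frame and its ground truth
        Image.fromarray(np.uint8(gen_frame[i] * 255)).save(
            os.path.join(sample_dir, 'gen.png'))
        Image.fromarray(np.uint8(gt_frame[i] * 255)).save(
            os.path.join(sample_dir, 'gt.png'))
```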

Thank you so much for your patience and for taking the time to answer my questions. I'm sorry if I keep asking trivial questions.

So say I have 36 sub-folders in my main Test folder, and each sub-folder has over 200 image frames. That means I have 7,200 images in my test set. Does it keep taking 8 images as a subset of those 7,200 images and perform frame prediction on them?

Also, how many frames does it predict? By taking 4 input frames, is it able to predict the next frame, i.e., the fifth frame?


Also, I want to know whether the steps I'm saving correspond to just one image of the test set, or does it take 8 different images every time?

When running the code in test-only mode, the model takes 4 input images and generates 1 next frame. Is there a way to generate more than one next frame?

If you have 36 folders in your Test folder, that means you have 36 videos. The `test_batch` function will pick `test_batch_size` (in this case, 8) of those videos at random and pull one input sample (4 consecutive input frames) from each. It then uses each sample to predict the next (fifth) frame. It writes the inputs and predicted outputs for each sample to one of those 0, 1, ..., 7 directories.
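A minimal sketch of that sampling logic, assuming frames are stored as sorted image files inside one folder per video. The names here are illustrative, not the repo's actual API:

```python
# Sketch of the sampling described above: pick batch_size videos at
# random, then pull one clip (4 consecutive inputs + the 5th frame as
# ground truth) from each.
import os
import random

def sample_test_batch(test_dir, batch_size=8, hist_len=4):
    videos = [os.path.join(test_dir, d) for d in sorted(os.listdir(test_dir))]
    clips = []
    for video in random.sample(videos, batch_size):
        frames = sorted(os.listdir(video))
        # choose a start index so that hist_len inputs + 1 target fit
        start = random.randint(0, len(frames) - (hist_len + 1))
        inputs = frames[start:start + hist_len]   # the 4 input frames
        target = frames[start + hist_len]         # the fifth (ground-truth) frame
        clips.append((inputs, target))
    return clips
```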

You can specify how many next frames to predict using the `--recursions` flag.
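Conceptually, each recursion feeds the newly generated frame back in as part of the input history. A hedged sketch of that loop, with `generator` standing in for the trained model rather than the repo's actual interface:

```python
# Recursive prediction: reuse each generated frame as an input for the
# next prediction. `generator` is any callable mapping 4 frames -> 1 frame.
def predict_recursively(generator, input_frames, num_recursions):
    predictions = []
    history = list(input_frames)        # start with the 4 real input frames
    for _ in range(num_recursions):
        next_frame = generator(history[-4:])  # predict t+1 from frames t-3..t
        predictions.append(next_frame)
        history.append(next_frame)      # the prediction becomes an input
    return predictions
```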

Thank you so much. I have a much clearer understanding now. But I wanted to ask this: how will the model converge? I mean, should it run for some 50,000 steps and then stop, or what is the criterion for convergence?

On the Ms. Pac-Man dataset that I used, it took about 500,000 steps to converge. It looks like you are using real-world video, so I imagine convergence will be different (and you'll probably need different hyperparameters than I used). There's no set criterion for convergence. Just watch the TensorBoard loss graph and the test image outputs, and stop training when it looks like it isn't improving anymore.
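If you'd rather automate that judgment than eyeball the graph, one informal approach (my sketch, not something the repo provides) is to compare moving averages of the loss and stop when the improvement falls below a tolerance:

```python
# Plateau check: has the loss stopped improving over the last window?
def has_converged(losses, window=1000, tolerance=1e-4):
    """Return True when the mean loss over the most recent `window` steps
    is no better than the mean over the window before it."""
    if len(losses) < 2 * window:
        return False
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return previous - recent < tolerance
```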

I wanted to know whether the future frames will be predicted for any given frame, or whether it will depend on the training data. For example, if I have a training set that contains only normal frames, and for testing the network I give it frames containing an anomaly, will the network predict a normal version of the anomalous frames, or is it capable of generating the future frames of the anomaly-ridden frames?

As with all machine learning, performance on new data (your test set) depends entirely on what can be learned from your training data. It's hard to say without knowing what the differences between your normal and anomaly data are. The closer the anomalies are to what the model has seen before, the more accurate the test results will be. The best way to find out is to test it yourself.

I'm going to close this issue, but feel free to email me (matthew_cooper@brown.edu) with any other questions.