This project is fairly simple: the goal is to generate new dance videos using ML techniques, specifically a CNN-based autoencoder and an RNN.
- Dataset used: video
- Download the video (I used the 480p-quality version)
- Extract 3 frames per second from the video using ffmpeg (this yields 14401 frames)
- Crop/resize the frames to 800x480 for simplicity
- Clear the background of each frame, i.e. replace every pixel that is not black with white (you'll understand why when you see the video); use `remove background.py`
- Train the autoencoder on the cleaned frames. The encoder output shape is (128,)
- After training, encode all frames with the encoder, producing an array of shape (14401, 128) in which every frame is reduced to its dense representation
- Train the RNN on the resulting sequence of 14401 128-dimensional vectors
- After training, predict new sequences with the RNN
- Convert each predicted vector back into an image using the decoder
- Stitch the images into a video using ffmpeg
- Check out different results here
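The frame extraction and the final stitching step both use ffmpeg. A minimal sketch of the two commands, built as argument lists (the file names and frame-pattern are assumptions, not taken from the repo):

```python
def ffmpeg_extract_cmd(video="dance.mp4", out_dir="frames", fps=3):
    """ffmpeg command to extract `fps` frames per second from the video."""
    return ["ffmpeg", "-i", video,
            "-vf", f"fps={fps}",             # sample 3 frames per second
            f"{out_dir}/frame_%05d.png"]     # numbered output frames

def ffmpeg_stitch_cmd(in_dir="decoded", framerate=3, out="result.mp4"):
    """ffmpeg command to stitch decoded frames back into a video."""
    return ["ffmpeg", "-framerate", str(framerate),
            "-i", f"{in_dir}/frame_%05d.png",
            "-pix_fmt", "yuv420p",           # widely playable pixel format
            out]
```

Either list can be passed straight to `subprocess.run(...)`.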
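The background-clearing step (replace every non-black pixel with white) can be sketched in a few lines of NumPy. The threshold value is an assumption; the exact logic of `remove background.py` is not shown in this README:

```python
import numpy as np

def clean_frame(frame, threshold=40):
    """Replace every pixel that is not (near-)black with white.

    frame: uint8 array of shape (H, W, 3).
    threshold: assumed tolerance for what counts as "black".
    """
    # A pixel is "black" only if all three channels fall below the threshold.
    is_black = np.all(frame < threshold, axis=-1)
    out = np.full_like(frame, 255)   # start with an all-white frame
    out[is_black] = 0                # keep the black (dancer) pixels black
    return out
```

This reduces every frame to a black-on-white silhouette, which is what makes the autoencoder's job tractable.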
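To train the RNN on the (14401, 128) encoded array, the codes have to be cut into input windows paired with the next code. A sketch of that windowing, assuming a window length of 10 (the README does not state the one actually used):

```python
import numpy as np

def make_sequences(codes, timesteps=10):
    """Turn (N, 128) encoder outputs into RNN training pairs.

    Returns:
      X: (N - timesteps, timesteps, 128) sliding input windows
      y: (N - timesteps, 128) the code that follows each window
    """
    X = np.stack([codes[i:i + timesteps]
                  for i in range(len(codes) - timesteps)])
    y = codes[timesteps:]
    return X, y
```

With `codes.shape == (14401, 128)` and `timesteps=10`, this yields 14391 training pairs.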
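Generating new frames is autoregressive: seed the RNN with a window of real encoded frames, predict the next 128-dim code, slide the window forward, and repeat; the decoder then turns each code back into an image. A sketch of that loop, where `predict_next` stands in for the trained RNN's predict call (an assumed interface, not the repo's actual API):

```python
import numpy as np

def generate(seed, predict_next, n_frames):
    """Autoregressively generate new codes from a trained RNN.

    seed: (timesteps, 128) window of real encoded frames.
    predict_next: callable mapping a (timesteps, 128) window
                  to the next (128,) code.
    """
    window = seed.copy()
    generated = []
    for _ in range(n_frames):
        nxt = predict_next(window)
        generated.append(nxt)
        # Drop the oldest code, append the prediction: the window slides.
        window = np.vstack([window[1:], nxt])
    return np.stack(generated)
```

Each row of the returned array would then go through `decoder.predict` to produce a frame for stitching.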
Final Result
| Model | Epochs | acc | loss | val_acc | val_loss |
|---|---|---|---|---|---|
| Autoencoder | 27 | 0.964 | 0.007 | 0.964 | 0.007 |
| RNN | 853 | 0.800 | 0.048 | 0.799 | 0.049 |
- Autoencoder accuracy
- Autoencoder loss
- RNN accuracy
- RNN loss
Dance Generated Using Deep Learning
Inspired by carykh
Check out his video here: video