vt-vl-lab/iCAN

Demo on video

sophia-wright-blue opened this issue · 11 comments

Hello,

Thank you for releasing the code. On the home page, you have a demo on a video (two people talking, from The Big Bang Theory). In the README, you have instructions for Demo/Test on your own images.

Could you guide me through the process of running the demo on my own video? What are the steps to obtain the results on an mp4 file?

Thank you,

Hi,

In order to test on your own images, you have to follow all steps in the Demo/Test on your own images section. Those are:

1. Clone and set up the tf-faster-rcnn repository.
2. Convert your video to a PNG sequence and put the frames in the demo/ folder.
3. Detect all objects.
4. Detect all HOIs.
Step 3 will save Object_Detection.pkl in the demo folder, and step 4 will save HOI_Detection.pkl there as well. You can then use tools/Demo.ipynb to visualize all the HOI detections.

Hope this helps.

Thank you for the quick reply, @gaochen315. To convert the video to PNG sequences, would I have to use something like ffmpeg?

As for the result, will tools/Demo.ipynb give the output as the HOI interactions overlaid on the video?

I'd like to be able to input an mp4 file and get the HOI interactions on the mp4 file, exactly as you have demonstrated on the home page and on the project page.

thank you

Yes. You need to use ffmpeg for the conversion.
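
For example, something like this should extract the frames (the input file name, the 25 fps rate, and the demo/ output path are assumptions; adjust them to your video and setup):
ffmpeg -i input.mp4 -r 25 demo/pic%03d.png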

tools/Demo.ipynb will plot the images with the interactions annotated. You can save the visualizations instead of plotting them, and then convert the PNGs back to a video.
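
For example, assuming the notebook uses matplotlib for plotting, you could replace the call that displays each figure with something like this (the output path and the frame_idx variable are hypothetical):
plt.savefig('demo/out_%03d.png' % frame_idx, bbox_inches='tight')  # instead of plt.show()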

Ah, thank you @gaochen315, I get it now! So I guess I should use ffmpeg to convert the PNGs back to video?

ffmpeg as well. The following command will do the job:
ffmpeg -r 25 -f image2 -i pic%03d.png -pix_fmt yuv420p HOI.mp4
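
Here, -r 25 reads the image sequence at 25 frames per second, -f image2 tells ffmpeg the input is an image sequence, pic%03d.png matches the zero-padded frame file names, and -pix_fmt yuv420p keeps the resulting HOI.mp4 playable in most players.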

Thank you so much for the super fast reply, @gaochen315! Looking forward to using the repo.

Apologies for reopening the issue, @gaochen315. A related question: if the HOI detection is done only on individual frames, does the temporal aspect of the video get considered? For example, 3D CNNs or CNN+RNN models consider both the spatial and temporal aspects of videos. I hope my question is clear.

Currently, we only care about frame-level interaction detection. However, jointly considering temporal information is definitely a promising direction and worth exploring.

Got it. If it's not too much trouble, could you point me to a paper or two that consider the temporal information as well?

@sophia-wright-blue
Hi, I just read your conversation. Have you successfully gotten video detection working?