For an input video, this project visualizes attention maps over the video and its individual frames.
The videos themselves can't be shown here, so a few GIFs are provided instead.
Video with clip_steps 1
Video with clip_steps 4
Video with clip_steps 16
heatmap
focus map
In some cases the true label of the video/action is not available. In that case, we average all filters (feature-map channels) and visualize the resulting heatmap, as sketched below.
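As a rough illustration of this label-free mode, here is a minimal sketch (not the project's actual code) of averaging a 3D-CNN feature map over channels and time, then rendering it both as a heatmap overlay and as a focus map. The tensor shape (C, T, H, W), the function name, and the blending weights are all assumptions:

```python
import cv2
import numpy as np
import torch

def label_free_heatmap(feature_map, frame):
    """feature_map: torch tensor of shape (C, T, H, W), assumed.
    frame: BGR uint8 image of shape (H0, W0, 3)."""
    # Collapse channels, then time, to a single 2D activation map.
    cam = feature_map.mean(0).mean(0)             # (H, W)
    cam = cam - cam.min()
    cam = cam / (cam.max() + 1e-8)                # normalize to [0, 1]
    cam = cam.detach().cpu().numpy()

    # Upsample to the frame resolution and colorize.
    cam = cv2.resize(cam, (frame.shape[1], frame.shape[0]))
    heat = cv2.applyColorMap(np.uint8(255 * cam), cv2.COLORMAP_JET)

    # Heatmap: blend the colorized activations onto the frame.
    overlay = cv2.addWeighted(frame, 0.6, heat, 0.4, 0)
    # Focus map: dim regions the network does not attend to.
    focus = (frame * cam[..., None]).astype(np.uint8)
    return overlay, focus
```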
- PyTorch 0.4
- OpenCV
- NumPy
- scikit-video (skvideo)
git clone https://github.com/FingerRec/3DNet_Visualization.git
cd 3DNet_Visualization
mkdir pretrained_model
Download the MFNet model pretrained on UCF101 from Google Drive and put it into the pretrained_model directory; these weights come from the MFNet repository.
A pretrained I3D on HMDB51 is also available.
bash demo.sh
The generated images and video will be written to output/imgs and output/video.
Tip: in main.py, setting clip_steps to 1 generates a video of the same length as the original; see the sketch below.
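To make the tip concrete, here is a minimal sketch of how a stride like clip_steps controls how many clips (and hence how much output) are produced; the function and loop are illustrative, not the project's actual implementation:

```python
def clip_starts(total_frames, frames_num=16, clip_steps=16):
    """Return the start indices of the clips fed to the network.
    With clip_steps=1 a clip starts at (almost) every frame, so the
    output video is nearly as long as the original; with
    clip_steps=16 only every 16th frame starts a clip."""
    return list(range(0, total_frames - frames_num + 1, clip_steps))

# e.g. a 64-frame input video:
print(len(clip_starts(64, clip_steps=1)))   # 49 clips -> near-original length
print(len(clip_starts(64, clip_steps=16)))  # 4 clips  -> much shorter output
```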
The details of demo.sh are as follows. Change --video and --label according to your video; refer to resources/classInd.txt for the label information of UCF101 videos.
python main.py --num_classes 101 \
--classes_list resources/classInd.txt \
--model_weights pretrained_model/MFNet3D_UCF-101_Split-1_96.3.pth \
--video test_videos/v_ApplyEyeMakeup_g01_c01.avi \
--frames_num 16 --label 0 --clip_steps 16 \
--output_dir output
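If you are unsure which --label to pass, a small helper like the following (hypothetical, not part of the repo) can look it up from resources/classInd.txt. It assumes the standard UCF101 format of one "1 ApplyEyeMakeup" entry per line with 1-based indices, while the script above takes 0-based labels, hence the -1:

```python
def label_for(class_name, classes_list="resources/classInd.txt"):
    """Map a UCF101 class name to the 0-based --label index."""
    with open(classes_list) as f:
        for line in f:
            idx, name = line.strip().split(" ", 1)
            if name == class_name:
                return int(idx) - 1  # classInd.txt is assumed 1-based
    raise ValueError(f"{class_name} not found in {classes_list}")

print(label_for("ApplyEyeMakeup"))  # 0, matching --label 0 above
```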
Tip: the UCF101 and HMDB51 datasets are supported now. For Kinetics and others, just download a pretrained model and change --classes_list accordingly.
- support I3D, mpi3d
- support multiple FC layers and fully convolutional networks
- support feature-map averaging without a label
- support S3D, SlowFast and C3D
- visualize filters
- Grad-CAM
This project is heavily based on SaliencyTubes, MF-Net, and st-gcn.