CMU Video-Level Semantic Features
This repository contains the inference code and pre-trained models for the following papers:
Po-Yao Huang, Ye Yuan, Zhenzhong Lan, Lu Jiang, and Alexander G. Hauptmann.
"Video Representation Learning and Latent Concept Mining for Large-scale Multi-label Video Classification."
arXiv preprint arXiv:1707.01408 (2017).
Jia Chen, Junwei Liang, Jiang Liu, Shizhe Chen, Chenqiang Gao, Qin Jin, and Alexander Hauptmann.
"Informedia @ TRECVID 2017."
Dependencies
- Python 2.7; TensorFlow >= 1.4.0; tqdm and nltk (for preprocessing)
- Pre-trained models (1.1 GB); extract them into models/
What the code does
Given a list of videos, the pipeline outputs semantic features for each video.
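The .lst files used throughout are assumed to be plain-text lists with one path per line; for example, a videos.lst such as:

```
/data/videos/video0001.mp4
/data/videos/video0002.mp4
```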
Inference
- Extract frames from each video
$ python extractFrames_uniform.py videos.lst frames_path --num_per_video 30
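The uniform sampling is roughly equivalent to the sketch below; OpenCV (cv2) and the JPEG naming scheme are assumptions, not necessarily what extractFrames_uniform.py actually does.

```python
import os
import cv2  # assumption: the real script may use ffmpeg or another decoder

def extract_uniform_frames(video_path, out_dir, num_per_video=30):
    """Save num_per_video uniformly spaced frames of one video as JPEGs."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(total // num_per_video, 1)  # spacing between sampled frames
    for n, idx in enumerate(range(0, total, step)[:num_per_video]):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the sampled frame
        ok, frame = cap.read()
        if ok:
            cv2.imwrite(os.path.join(out_dir, '%06d.jpg' % n), frame)
    cap.release()
```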
- Extract frame-level CNN features
First, set the slimpath variable in img2feat_utils.py to point to your local copy of the TF-Slim model library
$ python img2feat.py frames_path.lst inception_resnet_v2 models/inception_resnet_v2.ckpt frame_feature_path --l2norm --batchSize 30
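For reference, frame features can be pulled from the TF-Slim Inception-ResNet-v2 graph as sketched below. This is only a sketch: the PreLogitsFlatten endpoint, the preprocessing convention, and the load_preprocessed_frames helper are assumptions about what img2feat.py does, and it requires the TF-Slim library (slimpath) on PYTHONPATH.

```python
import numpy as np
import tensorflow as tf
from nets import inception_resnet_v2  # from the TF-Slim library at slimpath

slim = tf.contrib.slim

# Inception-ResNet-v2 expects 299x299 RGB inputs scaled to [-1, 1].
images = tf.placeholder(tf.float32, [None, 299, 299, 3])
with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope()):
    _, end_points = inception_resnet_v2.inception_resnet_v2(
        images, num_classes=1001, is_training=False)
features = end_points['PreLogitsFlatten']  # 1536-d pooled feature per frame

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'models/inception_resnet_v2.ckpt')
    # Hypothetical helper returning an array of shape (batch, 299, 299, 3).
    batch = load_preprocessed_frames()
    feats = sess.run(features, feed_dict={images: batch})
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)  # the --l2norm option
```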
- Average-pool frame features into video-level CNN features
$ python averageFeats.py frame_feature_path.lst video_feature_path --l2norm
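The pooling step itself is simple; a minimal numpy sketch:

```python
import numpy as np

def average_pool(frame_feats, l2norm=True):
    """frame_feats: (num_frames, dim) array of per-frame CNN features."""
    video_feat = frame_feats.mean(axis=0)  # average over the time axis
    if l2norm:  # the --l2norm option
        video_feat /= (np.linalg.norm(video_feat) + 1e-12)
    return video_feat
```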
- Extract semantic features
First, set the modelpath variable in semantics_features.py to point to the extracted models/ directory
$ python semantics_features.py video_feature_path.lst semantic_feat --save_seperate
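Conceptually, this step feeds each video-level feature through the pre-trained semantic models to obtain per-concept scores. The sketch below is a hypothetical illustration only; the checkpoint filename, tensor names, and .npy file layout are placeholders, not the repo's actual graph.

```python
import numpy as np
import tensorflow as tf

with tf.Session() as sess:
    # Restore one of the pre-trained semantic models from models/.
    saver = tf.train.import_meta_graph('models/semantic_model.ckpt.meta')  # assumed filename
    saver.restore(sess, 'models/semantic_model.ckpt')
    graph = tf.get_default_graph()
    feat_in = graph.get_tensor_by_name('input_feat:0')      # placeholder name, assumed
    scores = graph.get_tensor_by_name('concept_scores:0')   # placeholder name, assumed
    video_feat = np.load('video_feature_path/video0001.npy')  # assumed layout: (dim,)
    semantic_feat = sess.run(scores, feed_dict={feat_in: video_feat[None, :]})
```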