
An implementation of the Video Transformer Network (VTN) approach to action recognition in TensorFlow.

Action Recognition

This is a TensorFlow implementation of the Video Transformer Network (VTN) approach to action recognition. It contains complete code for preprocessing, training, and testing. The repository is easy to use and can be developed on both Linux and Windows.

VTN : Kozlov, Alexander, Vadim Andronov, and Yana Gritsenko. "Lightweight Network Architecture for Real-Time Action Recognition." arXiv preprint arXiv:1905.08711 (2019).

Getting Started

1 Prerequisites

  • Python 3.x
  • TensorFlow 1.x
  • opencv-python
  • Pandas

2 Download this repo and unzip it

cd ../VTN/Label_Map
Open label.txt and replace the existing class names (and their corresponding IDs) with your own.

3 Generate directory

cd ../VTN/Code
run python make_dir.py
This creates one subfolder per class in ../VTN/Raw_Data, ../VTN/Data/Train, ../VTN/Data/Test, and ../VTN/Data/Val. The subfolders are named after the classes defined in label.txt.
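
The directory layout described above can be sketched as follows. This is only an illustration and not the repository's make_dir.py: the helper name make_class_dirs is hypothetical, and the class names are passed in as a list rather than read from label.txt, because the file format is not described here.

```python
import os

# Directories named in this README, relative to the VTN root.
TARGET_DIRS = ["Raw_Data", "Data/Train", "Data/Test", "Data/Val"]


def make_class_dirs(root, class_names):
    """Create one subfolder per class under each target directory.

    Hypothetical helper for illustration; not the repository's make_dir.py.
    """
    created = []
    for target in TARGET_DIRS:
        for name in class_names:
            path = os.path.join(root, *target.split("/"), name)
            os.makedirs(path, exist_ok=True)  # no error if it already exists
            created.append(path)
    return created
```

Calling make_class_dirs("../VTN", ["run", "jump"]) would create eight folders, two under each of the four directories. Because exist_ok=True is used, re-running it does not fail on folders that already exist.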

4 Prepare video clips

Copy your raw AVI videos into the subfolders of ../VTN/Raw_Data, placing each video in the folder that matches its class. Optionally, you can use the public HMDB-51 dataset.
cd ../VTN/Code
run python prepare_clips.py
The generated clips are saved in the class subfolders of ../VTN/Data/Train, ../VTN/Data/Test, and ../VTN/Data/Val. These clips are used for training, testing, and validation.
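
This README does not say how prepare_clips.py cuts a video into clips. One common scheme, shown here purely as an assumption, is a sliding window of fixed length over the frame indices. The function name clip_windows and the defaults clip_len=16 and stride=8 are illustrative and are not taken from the repository.

```python
def clip_windows(num_frames, clip_len=16, stride=8):
    """Return (start, end) frame index pairs for fixed-length clips.

    `end` is exclusive. clip_len and stride are illustrative defaults,
    not values taken from the repository.
    """
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    return [
        (start, start + clip_len)
        for start in range(0, num_frames - clip_len + 1, stride)
    ]
```

A video shorter than clip_len yields no windows under this scheme. In practice the frames for each window would be read with OpenCV (for example through cv2.VideoCapture) and written to the matching class subfolder.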

5 Compute the mean image from training clips (optional)

cd ../VTN/Code
run python mean_img.py
The mean image is saved in ../VTN/Data/Train.
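
A mean image is the per-pixel average over the training frames. The sketch below shows one way to compute it with NumPy using a running sum. It assumes all frames have the same shape, and the function name mean_image is hypothetical; mean_img.py may do this differently.

```python
import numpy as np


def mean_image(frames):
    """Average an iterable of equally sized frames into one float32 image.

    Keeps only a running sum, so frames can come from a generator.
    """
    total = None
    count = 0
    for frame in frames:
        frame = np.asarray(frame, dtype=np.float64)
        total = frame.copy() if total is None else total + frame
        count += 1
    if count == 0:
        raise ValueError("no frames given")
    return (total / count).astype(np.float32)
```

The sum is accumulated in float64 to avoid overflow with 8-bit frames, and the result is returned as float32.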

6 Train model

The model, training, and evaluation parameters are all defined in parameters.py.
cd ../VTN/Code
run python train.py PB or python train.py CHECKPOINT
The model is saved in ../VTN/Model. "PB" and "CHECKPOINT" select one of the two ways TensorFlow can save a model.
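
For illustration, a script that accepts exactly one of these two arguments could validate it as below. This is a sketch under assumptions: the function name parse_save_mode and the use of argparse are hypothetical, and train.py may read its argument differently.

```python
import argparse


def parse_save_mode(argv):
    """Parse the single positional argument: PB or CHECKPOINT."""
    parser = argparse.ArgumentParser(description="Train VTN (sketch)")
    parser.add_argument(
        "save_mode",
        choices=["PB", "CHECKPOINT"],
        help="format used to save the trained model",
    )
    return parser.parse_args(argv).save_mode
```

With choices set, argparse rejects any other value (including lowercase "pb") and exits with a usage message, so a mistyped argument is caught before training starts.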

7 Test model (pb)

Test the model using the clips in ../VTN/Data/Test.
cd ../VTN/Code
run python test.py N
N must not exceed the number of clips in the test set. Note that test.py does not use mini-batches, so a large N may cause out-of-memory errors. In that case, you can modify test.py to evaluate in mini-batches.
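
If you do adapt the test to use mini-batches, the change amounts to feeding the clips in fixed-size chunks and accumulating the number of correct predictions. The sketch below shows the idea in plain Python. The names iter_batches, batched_accuracy, and predict_fn are hypothetical; predict_fn stands in for whatever call runs the model on a batch.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def batched_accuracy(clips, labels, predict_fn, batch_size=8):
    """Accuracy over all clips, evaluated one mini-batch at a time.

    predict_fn is a hypothetical callable mapping a list of clips to a
    list of predicted labels (for example a wrapper around session.run).
    """
    correct = 0
    for clip_batch, label_batch in zip(
        iter_batches(clips, batch_size), iter_batches(labels, batch_size)
    ):
        preds = predict_fn(clip_batch)
        correct += sum(int(p == y) for p, y in zip(preds, label_batch))
    return correct / len(clips) if clips else 0.0
```

Only one batch of clips is passed to the model at a time, so peak memory depends on batch_size rather than on N.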

8 Visualize model using TensorBoard

cd ../VTN
run tensorboard --logdir=Model/
Open the URL printed by TensorBoard in a browser to visualize the model.

① Change to the directory ../VTN/Label_Map, open label.txt, and replace the existing class names with your own class names and their corresponding IDs.


① Change to the directory ../VTN/Code and run python make_dir.py. Subfolders named after your classes will be created in ../VTN/Raw_Data, ../VTN/Data/Train, ../VTN/Data/Test, and ../VTN/Data/Val.


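① Copy the raw videos you have collected (AVI format) into the folders in ../VTN/Raw_Data that match their class names.
② Change to the directory ../VTN/Code and run python prepare_clips.py. The clips generated for each class are saved in the subfolders of ../VTN/Data/Train, ../VTN/Data/Test, and ../VTN/Data/Val, and are used for model training, evaluation, and testing.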


① Change to the directory ../VTN/Code and run python mean_img.py. The resulting mean image is saved in ../VTN/Data/Train.


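In parameters.py you can modify the model parameters, the training parameters, the evaluation parameters, and some of the parameters used to generate the training data.
① Change to the directory ../VTN/Code and run python train.py PB or python train.py CHECKPOINT. The arguments "PB" and "CHECKPOINT" correspond to the two ways TensorFlow saves a model. The model is saved in ../VTN/Model.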


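Test the model using the video clips in ../VTN/Data/Test.
② Change to the directory ../VTN/Code and run python test.py N, where N is a positive integer no greater than the number of clips in the test set.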

8 Visualize model using TensorBoard

① Change to the directory ../VTN/ and run tensorboard --logdir=Model/, then copy the displayed link into a browser to view the model structure.

