- 1. Environment setup
- 1.1. Setup docker image for CoVA
- 1.2. Launch and attach to the docker container
- 1.3. Additional setup steps inside the container
- 2. Running the pipeline
- 2.0. Download video file
- 2.1. Naive DNN-only pipeline
- 2.2. CoVA pipeline
- 2.2.1. Getting BlobNet ONNX file
- 2.2.2. Convert frozen model into TensorRT engine
- 2.2.3. Launch CoVA pipeline
- 2.2.4. Parsing CoVA result
- NVIDIA RTX 3090
- Ubuntu 18.04
- CUDA 11.5.1
- Docker 20.10
- NVIDIA Container Toolkit 1.4.0
git clone --recurse-submodules https://github.com/casys-kaist/CoVA
cd CoVA
# or, if you already cloned without submodules:
git submodule update --init --recursive
- Get the nvcr.io/nvidia/deepstream:6.0-devel image from NVIDIA NGC.
- Get the TensorRT 8.2.4.2 DEB package from the NVIDIA webpage and place it inside ./docker.
- Build an image on top of the DeepStream image:
cd docker
# Builds the image for CoVA based on ./docker/Dockerfile
./build.sh
The Docker image itself is not provided due to the terms of the DeepStream license.
./launch.sh CONTAINER_NAME
The container should be launched with the cloned repository mounted at /workspace.
All the following steps should be done inside (attached to) the Docker container.
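For reference, the launch script should be roughly equivalent to a plain docker run invocation like the sketch below; the exact flags and the image tag (cova:latest here) are assumptions, so consult ./launch.sh for the authoritative command.

```sh
# Rough equivalent of ./launch.sh (flags and image tag are assumptions):
# start the CoVA image with GPU access and the cloned repository mounted at /workspace.
docker run -it --gpus all \
    --name CONTAINER_NAME \
    -v "$(pwd)":/workspace \
    -w /workspace \
    cova:latest /bin/bash
```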
cd /workspace
# Download pretrained YOLOv4 weights
pushd third_parties/tensorrt_demos/yolo
./download_yolo.sh
popd
# Build custom Deepstream parser for YOLO
pushd third_parties/DeepStream-Yolo/nvdsinfer_custom_impl_Yolo
CUDA_VER=11.4 make
popd
cd /workspace
# Build modified version of FFmpeg
pushd third_parties/FFmpeg
./configure --enable-shared --disable-static
make -j`nproc` install
popd
# Build GStreamer plugin with modified decoder
pushd third_parties/gst-libav
meson build
ninja -C build install
popd
# Check the plugin is installed correctly.
gst-inspect-1.0 avdec_h264
The entropy decoder is built on top of FFmpeg. Once the patched avdec_h264 is installed, it works as the entropy decoder (partial decoder) in combination with the metapreprocess element.
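As a rough illustration of how the two fit together, the sketch below links the patched decoder and metapreprocess in a gst-launch pipeline. The demuxer/parser chain, the sink, and whether these elements negotiate directly are assumptions here; the actual pipelines are assembled by the launch scripts under experiment/.

```sh
# Conceptual sketch only: push an H.264 file through the patched entropy decoder
# and the metapreprocess element. Element ordering and caps are assumptions.
gst-launch-1.0 filesrc location=/workspace/data/video/amsterdam/day1.mp4 ! \
    qtdemux ! h264parse ! avdec_h264 ! metapreprocess ! fakesink
```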
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
cd /workspace
# Install all required plugins
make install
- metapreprocess: Preprocesses metadata extracted from the entropy decoder
- bboxcc: Transforms the BlobNet mask into bounding boxes using a connected component algorithm
- sorttracker: Tracks the bounding boxes using the SORT algorithm
- cova: Filters frames to decode based on the tracked objects
- [For training] tfrecordsink: Used to pack BlobNet training data into TensorFlow TFRecord format
- gopsplit: Splits the encoded video stream at the GoP boundary
- maskcopy: Copies the BlobNet output mask from GPU memory to CPU memory
- nvdsbbox / tcpprobe: Extracts inference information from nvinfer
# Check that all plugins are installed correctly
gst-inspect-1.0 cova
gst-inspect-1.0 gopsplit
gst-inspect-1.0 maskcopy
gst-inspect-1.0 nvdsbbox
gst-inspect-1.0 tcpprobe
We provide two video streams for demonstration, which are the first two datasets used in our paper.
You can download them from the following Google Drive link.
The provided scripts assume the videos are placed under /workspace/data/video/, so consider organizing them as follows:
- /workspace/data/video/amsterdam/day1.mp4
- /workspace/data/video/amsterdam/day2.mp4
- ...
Otherwise, specify the custom path later on.
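For example, assuming the downloaded files are in the current directory (the file names below are placeholders for whatever the download produces):

```sh
# Create the expected directory layout and move the downloaded videos into it.
mkdir -p /workspace/data/video/amsterdam
mv day1.mp4 day2.mp4 /workspace/data/video/amsterdam/
```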
cd experiment/naive
# e.g., python launch.py /workspace/data/video/archie/day1.mp4 /workspace/baseline/archie/day1
python launch.py INPUT_PATH OUTPUT_DIR
Running the pipeline for the first time will take a while, because the TensorRT engine is built from the ONNX weight file.
Once the conversion is done, move the created engine file to the predefined path so that the engine is loaded directly and this step is skipped next time.
mkdir -p /workspace/model/trt_model/rnn
mv model_b2_gpu0_fp16.engine /workspace/model/trt_model/rnn/yolov4_b2_fp16.engine
The DNN-only pipeline is required for the accuracy comparison against CoVA, but running it on every dataset we used takes a long time, so consider downloading the results from the following Google Drive link.
The provided scripts assume the baseline results are placed under /workspace/data/baseline/ as follows:
- /workspace/data/baseline/amsterdam/day1/dnn.csv
- /workspace/data/baseline/amsterdam/day2/dnn.csv
- ...
Otherwise, specify the custom path later on.
- Download the pretrained model from the following Google Drive link.
- Place the downloaded file under /workspace/model/onnx_model/blobnet/.
- Move on to 2.2.2. Convert frozen model into TensorRT engine.
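For example (blobnet.onnx below is a placeholder; use whatever file name the download provides):

```sh
# Create the expected directory and place the pretrained BlobNet ONNX file there.
mkdir -p /workspace/model/onnx_model/blobnet
mv blobnet.onnx /workspace/model/onnx_model/blobnet/
```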
Alternatively, you can train your own BlobNet. First, cut a training segment from the beginning of the video:
# e.g., ffmpeg -i original.mp4 -to 0:20:00 -c:v copy train.mp4
ffmpeg -i INPUT_VIDEO -to TRAIN_DUR -c:v copy OUTPUT_VIDEO
cd /workspace/utils
# Generate MoG background subtraction based foreground mask from the video
./generate-mog.py VIDEO_PATH MOG_PATH
cd /workspace/utils
# Extracts compressed metadata from video and packs (metadata, MoG label) pairs into TFRecord format dataset
./generate-record.sh VIDEO_PATH MOG_PATH RECORD_PATH
cd /workspace/utils
# Train BlobNet with Tensorflow and save it as frozen model
./train-blobnet.py RECORD_PATH FROZEN_PATH
Place the output frozen model directory under /workspace/model/tf_model/blobnet.
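Putting the training steps together, an end-to-end run might look like the sketch below. Every path here is a hypothetical placeholder chosen for illustration; adapt them to where your data actually lives.

```sh
# Hypothetical end-to-end BlobNet training flow (all paths are placeholders).
cd /workspace/utils
./generate-mog.py /workspace/data/video/amsterdam/train.mp4 /workspace/data/mog/amsterdam/train
./generate-record.sh /workspace/data/video/amsterdam/train.mp4 \
    /workspace/data/mog/amsterdam/train /workspace/data/record/amsterdam/train
./train-blobnet.py /workspace/data/record/amsterdam/train /workspace/data/frozen/amsterdam
# Place the resulting frozen model directory where the conversion step expects it.
cp -r /workspace/data/frozen/amsterdam /workspace/model/tf_model/blobnet/
```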
cd /workspace/model
# The following command will generate onnx file
# From /workspace/model/tf_model to /workspace/model/onnx_model
python -m invoke tf2onnx FROZEN_PATH
cd /workspace/model
# The following command will generate engine file
# From /workspace/model/onnx_model to /workspace/model/trt_model
python -m invoke onnx2trt ONNX_PATH
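If you want to reproduce or debug the conversion manually, TensorRT's trtexec tool can also build an engine from an ONNX file. This is not the project's invoke task, just an equivalent manual step; the file names below are placeholders.

```sh
# Manual alternative using TensorRT's trtexec: build an FP16 engine from the
# exported ONNX file. File names are placeholders.
trtexec --onnx=/workspace/model/onnx_model/blobnet/blobnet.onnx \
        --saveEngine=/workspace/model/trt_model/blobnet/blobnet_fp16.engine \
        --fp16
```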
cd /workspace/experiment/cova
# e.g., python launch.py /workspace/data/video/amsterdam/day1.mp4 /workspace/data/cova/amsterdam/day1 amsterdam
python launch.py INPUT_PATH OUTPUT_DIR DATASET
You can configure the number of entropy decoders, the number of concurrent models, and the model batch size in config.yaml.
The structure of the resulting output directory is as follows:
output_dir/
  track.csv (for debugging): Objects tracked in the compressed domain
  dnn.csv (for debugging): Detections inferred in the pixel domain
  assoc.csv: Final CoVA results for moving objects
  stationary.csv: Final CoVA results for stationary objects
  out.txt: Log of filtering rates and elapsed time
You can use htop and nvidia-smi dmon to confirm the pipeline is running correctly by monitoring CPU, memory, GPU SM, and NVDEC utilization.
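For example, in separate terminals:

```sh
# Monitor CPU and memory usage.
htop
# Monitor GPU utilization, including SM (sm) and NVDEC (dec), one sample per second.
nvidia-smi dmon -s u -d 1
```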
cd /workspace/parse
# e.g., python accuracy.py amsterdam /workspace/data/parsed/amsterdam
python accuracy.py DATASET OUTPUT_DIR
As a result, two files will be created in the OUTPUT_DIR, each containing the result of a binary-predicate query for the target object. You can check the video at the returned timestamps (shown in nanoseconds) to confirm that the object appears.
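For instance, to jump to a returned timestamp of 1234500000000 ns (a made-up value), convert it to seconds and seek with any player, e.g. ffplay if it is available:

```sh
# Convert a nanosecond timestamp (placeholder value) to seconds and seek to it.
TS_NS=1234500000000
ffplay -ss $(echo "$TS_NS / 1000000000" | bc) /workspace/data/video/amsterdam/day1.mp4
```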
The main results for elapsed time (Figure 8), filtering rate (Table 3), and the accuracy metric (Table 4) are printed to stdout.
- Pixel Domain FG Mask Extraction: MoG-based object detection
- Compressed Domain Mask Extraction: BlobNet-based object detection
If you have any issues while running the scripts, please file an issue on the GitHub page or let us know by email (contact: jwhwang@casys.kaist.ac.kr), and we will investigate and fix the issue.