DIVA IO Package

Version 0.3

Author: Lijun Yu

Email: lijun@lj-y.com

IO interfaces for the DIVA project.

Version History

0.3
- Optimized random access and fixed missing-frame handling.
- Robustness improvement.
- Speed test.
0.2 (Deprecated)
- Real random access in video loader.
- Add annotation converter.
- Warning control option.
0.1
- Initial release of video loader.

Installation

Integration

To use as a submodule in your git project, run

git submodule add https://github.com/Lijun-Yu/diva_io.git

Requirements

Environment requirements are listed in environment.yml. For the av package, I recommend installing it via conda with

conda install av -c conda-forge

because building it from pip requires many additional dependencies.

Video Loader

A robust video loader that deals with missing frames in the MEVA dataset.

This video loader is built on the PyAV package. The pims package was also a good reference, despite its compatibility issues with the current PyAV.

For videos in the MEVA dataset, using cv2.VideoCapture results in wrong frame ids because it never counts the missing frames. If you are using MEVA, I suggest you switch to this video loader as soon as possible.
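The frame id drift can be illustrated with a toy simulation that uses no video library. The function names, the list of decoded frame positions, and the policy of filling a gap by repeating the previous frame are all assumptions made for this sketch; they are not taken from the package's implementation.

```python
def sequential_ids(decoded_positions):
    """Ids a plain counter would assign: 0, 1, 2, ... in decode order."""
    return list(range(len(decoded_positions)))


def fill_missing(decoded_positions):
    """Return (frame_id, source_position) pairs with gaps filled in.

    A missing id reuses the most recently decoded frame. This filling
    policy is an assumption for illustration only.
    """
    filled = []
    previous = None
    expected = 0
    for position in decoded_positions:
        while expected < position:
            # This id is absent from the stream: repeat the previous frame.
            filled.append((expected, previous))
            expected += 1
        filled.append((position, position))
        previous = position
        expected = position + 1
    return filled


# Hypothetical stream in which frames 3 and 4 are missing.
decoded = [0, 1, 2, 5, 6]
print(sequential_ids(decoded))  # the frame at true position 5 is labelled 3
print(fill_missing(decoded))    # ids 3 and 4 are present, backed by frame 2
```

With a plain counter, every frame after the first gap carries an id that is too small, so annotations indexed by true frame id no longer line up.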

Replace `cv2.VideoCapture`

In my tests, this video loader returns exactly the same frames as cv2.VideoCapture unless a frame is missing or a decoding error occurs. To replace cv2.VideoCapture objects in legacy code, simply change

import cv2
cap = cv2.VideoCapture(video_path)

to

from diva_io.video import VideoReader
cap = VideoReader(video_path)

VideoReader.read follows the schema of cv2.VideoCapture.read but automatically inserts the missing frames while reading the video.
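Because the read call has the same shape in both classes, a loop written against that shape should work with either. The sketch below assumes that read() returns an (ok, frame) pair and that ok becomes False once the video is exhausted, as with cv2.VideoCapture.read. FakeCapture is a hypothetical stand-in so that the example runs without a video file; whether VideoReader matches this behavior in every edge case is not confirmed here.

```python
def count_frames(cap):
    """Count frames by calling cap.read() until it reports failure."""
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        count += 1
    return count


class FakeCapture:
    """Hypothetical stand-in exposing only a cv2-style read() method."""

    def __init__(self, frames):
        self._frames = iter(frames)

    def read(self):
        frame = next(self._frames, None)
        return (frame is not None), frame


print(count_frames(FakeCapture(["f0", "f1", "f2"])))  # 3
```

In real code, the object passed in would be either cv2.VideoCapture(video_path) or VideoReader(video_path).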

Iterator Interface

video = VideoReader(video_path)
for frame in video:
    # frame is a diva_io.video.frame.Frame object
    image = frame.numpy()
    # image is a uint8 array of shape (height, width, channel), with channels in BGR order
    # ... Do something with the image

Random Access

Random access to a frame requires decoding from the nearest preceding key frame (approximately every 60 frames for MEVA). On average, this adds an overhead of about 0.1 seconds, which is much faster than iterating from the beginning.

start_frame_id = 1500
length = 100
video.seek(start_frame_id)
for frame in video.get_iter(length):
    image = frame.numpy()
    # ... Do something with the image
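To give a rough feel for the seek cost described above, the following sketch counts how many frames must be decoded to reach a target frame. It assumes key frames sit at exact multiples of a fixed interval, which is a simplification; the text only says the interval is approximately 60 for MEVA, and real streams may place key frames irregularly. The function names are hypothetical.

```python
def frames_decoded_with_seek(target, keyframe_interval=60):
    """Frames decoded when starting from the key frame at or before target."""
    key_frame = (target // keyframe_interval) * keyframe_interval
    return target - key_frame + 1


def frames_decoded_from_start(target):
    """Frames decoded when iterating from frame 0 up to target."""
    return target + 1


for target in (1500, 1559):
    print(target,
          frames_decoded_with_seek(target),
          frames_decoded_from_start(target))
```

Under this assumption a seek never decodes more than one key frame interval, regardless of how deep into the video the target lies.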

Video Properties

video.width # cap.get(cv2.CAP_PROP_FRAME_WIDTH)
video.height # cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
video.fps # cap.get(cv2.CAP_PROP_FPS)
video.length # cap.get(cv2.CAP_PROP_FRAME_COUNT)
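These properties are enough to convert between timestamps and frame ids. The helpers below are a sketch, not part of the package: duration_seconds and frame_id_at are hypothetical names, and the SimpleNamespace object with made-up values stands in for a real VideoReader so that the example runs without a video file.

```python
from types import SimpleNamespace


def duration_seconds(video):
    """Video duration derived from the frame count and frame rate."""
    return video.length / video.fps


def frame_id_at(video, seconds):
    """Frame id for a timestamp, clamped to the valid id range."""
    frame_id = int(seconds * video.fps)
    return min(max(frame_id, 0), video.length - 1)


# Stand-in with illustrative values only; not read from any real video.
video = SimpleNamespace(width=1920, height=1080, fps=30.0, length=9000)
print(duration_seconds(video))
print(frame_id_at(video, 50))
```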

Other Interfaces

For other usages, please see the comments in video/reader.py.

Speed

See speed.md.

Annotation

An annotation loader and converter for Kitware YML format in meva-data-repo.

Clone the meva-data-repo and set

annotation_dir = 'path/to/meva-data-repo/annotation/DIVA-phase-2/MEVA/meva-annotations'

Convert Annotation

This converts annotations from the Kitware YML format to the ActEV Scorer JSON format. Run the following command in a shell from outside the repo's directory:

python -m diva_io.annotation.converter <annotation_dir> <output_dir>
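The same command can be launched from a Python script. The sketch below only builds the argument list, using the module path and the two positional arguments shown above. The helper name build_converter_command and the example output path are hypothetical, and the subprocess call is left commented out because it needs the package and the annotation repo to be present.

```python
import sys


def build_converter_command(annotation_dir, output_dir):
    """Argument list equivalent to the shell command shown above."""
    return [sys.executable, "-m", "diva_io.annotation.converter",
            str(annotation_dir), str(output_dir)]


command = build_converter_command(
    "path/to/meva-data-repo/annotation/DIVA-phase-2/MEVA/meva-annotations",
    "path/to/output",  # hypothetical output directory
)
print(command[1:])

# To actually run it, launch from the directory that contains diva_io:
# import subprocess
# subprocess.run(command, check=True, cwd="path/to/parent/of/diva_io")
```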

Read Annotation

from diva_io.annotation import KitwareAnnotation
video_name = '2018-03-11.11-15-04.11-20-04.school.G300'
annotation = KitwareAnnotation(video_name, annotation_dir)
# deal with annotation.raw_data
