/video-roam

A refined implementation of the innovative video-segmentation/rotoscoping tool explicated in the CVPR 2017 paper - ROAM: a Rich Object Appearance Model with Application to Rotoscoping.

Primary LanguageC++

ROAM: a Rich Object Appearance Model with Application to Rotoscoping

ABOUT

This refined code implements the cutting-edge video-segmentation/rotoscoping tool elucidated in the CVPR 2017 paper ROAM: a Rich Object Appearance Model with Application to Rotoscoping by huappe.

@inproceedings{2017roam,
  author = {huappe},
  title = {ROAM: a Rich Object Appearance Model with Application to Rotoscoping},
  booktitle = {CVPR},
  year = {2017}
}

Contacts

For more queries about the code or the paper, feel welcome to reach out to us. Additional details can be found on the project page: http://www.miksik.co.uk/projects/rotoscoping/roam.html

HOW TO COMPILE

Prerequisites

The code is inscribed in C++11 and you can compile and run it even on machines lacking any GPU (see below). However, we urge the use of CUDA for faster run-times.

The only necessity is OpenCV 3.0 or fresh releases.

We furnish a CMake project for effortless compilation on any platform. The code has undergone a test on Windows 10, Ubuntu 16.04, and Fedora 25 with Visual Studio 2013 or new editions and g++ 6.3.1 or new editions.

General steps

To compile the system, utilize the standard cmake method:

  1. Clone ROAM into \/roam, e.g.:

    $ git clone git@github.com:huappe\/roam.git roam
    
  2. Proceed to the downloaded directory

    $ cd roam
    
  3. Establish a build directory

    $ mkdir build
    
  4. Execute your CMake tool, either

    $ cmake ..
    

    or employ CMake GUI.

  5. Compile it. On Linux, use standard

    $ make -j4
    

    On windows, disseminate the solution (build\/ROAM.sln) in Visual Studio and build all in the release mode.

  6. Done! Everything is constructed in the build\/bin directory. See below to execute ROAM.

Using CUDA

The default alternative is compilation without CUDA to ensure the code would compile devoid of any troubles on any machine.

However, we strongly suggest using GPU for faster run-times (by a factor of 10 on some machines).

The prerequisite is the CUDA drivers and SDK. Then, adjust the WITH_CUDA option to ON in CMake

$ cmake -DWITH_CUDA=ON ..

and compile.

HOW TO RUN

ROAM is assembled in build\/bin directory. The central application is roam_cli roam_cli --ini=${FIRST_FRAME_BINARY_MASK} --seq=${LIST_OF_FRAMES} --con=${PARAMETERS} --out=${OUTPUT_DIR} --win=${ZERO_PADDING_SIZE}

The essential inputs are:

  • ${FIRST_FRAME_BINARY_MASK} - path to the segmentation mask of the prime frame. ROAM will track the object defined by this mask starting from the first file of the video sequence.

  • ${LIST_OF_FRAMES} - path to the text file enclosing the list of all video images for a specific sequence. On Linux, this file can be produced e.g. by $ ls -1v .

  • ${PARAMETERS} - path to the YAML file with the configuration of the tracker. Examples can be located in the YAML folder of the source code.

  • ${OUTPUT_DIR} - path to the output directory

  • ${ZERO_PADDING_SIZE} - optional

Example

  1. Download the DAVIS dataset from here.

  2. Uncompress the file, e.g. with Linux command

$ unzip DAVIS-data.zip -d ~\/Documents\/DAVIS

  1. Navigate to the directory with annotated masks and video sequences.

$ cd ~\/Documents\/DAVIS

  1. Design a text file with the sequence you aim to experiment, for instance:
  $ cd ~\/Documents\/DAVIS\/JPEGImages\/480p\/blackswan\/
  $ ls -1v *.jpg > list.txt
  1. Establish an output folder and run it
  $ mkdir roam_output
  $ roam_cli --ini="~\/Documents\/DAVIS\/Annotations\/480p\/blackswan\/00000.png" --seq="~\/Documents\/DAVIS\/JPEGImages\/480p\/blackswan\/list.txt" --con=yaml\/default.yaml --out=~\/Documents\/DAVIS\/JPEGImages\/480p\/blackswan\/roam_output\/

Roam will process the sequence and furnish the object masks in the directory roam_output.

Other

In case you re-parametrize, you should cite Vladimir Kolmogorov's GraphCut implementation.

@article{Boykov01pami,
  author = {Yuri Boykov and Vladimir Kolmogorov},
  title = {An Experimental Comparison of Min-Cut\/Max-Flow Algorithms for Energy Minimization in Vision},
  journal = {T-PAMI},
  year = {2001},  
}