/4dm

This is the repository of the paper "Multi-view data capture for dynamic object reconstruction using handheld augmented reality mobiles"

Primary language: C#. License: MIT.

Multi-view data capture for dynamic object reconstruction using handheld augmented reality mobiles

This repository contains a system to capture nearly synchronous frame streams from multiple moving handheld mobiles, suitable for 3D reconstruction of dynamic objects. Each mobile executes Simultaneous Localisation and Mapping (SLAM) on-board to estimate its pose, and uses a wireless communication channel to send or receive synchronisation triggers. We use the SLAM algorithm integrated in Android ARCore. Our system can harvest frames and mobile poses in real time using a decentralised triggering strategy and a data-relay architecture that can be deployed either at the Edge or in the Cloud. We show the effectiveness of our system by employing it for 3D skeleton and volumetric reconstructions. Our triggering strategy achieves performance equal to that of an NTP-based synchronisation approach, but offers higher flexibility, as it can be adjusted online based on application needs.

Paper (pdf)

Modules

This project is divided into two software blocks, the capturing system and the reconstruction software, which in turn are composed of several modules. Specifically:

  • app: Android ARCore-based mobile application to capture the frames
  • data-manager: server to process the captured frames
  • mlapi-server: server to manage synchronisation and enrolment of the mobiles
  • reconstruction_sw: Python scripts to perform 3D pose and volumetric reconstructions using the captured data

Getting started

Please check the documentation

The 4DM dataset

The 4DM dataset involves six people using their mobiles to record a person acting out table tennis in an outdoor setting. It is characterised by cluttered backgrounds, cast shadows and people appearing in each other's view, all of which are likely distractors for object detection and human pose estimation.

4DM is composed of three sequences:

  • 4DM-Easy: all mobiles are stably held by people during capture
  • 4DM-Medium: three out of six mobiles are stably held, the others undergo motion
  • 4DM-Hard: all mobiles undergo motion

The host mobile generates triggers at 10 Hz. Frames have a resolution of 640x480 and an average size of about 160 KB. The latency between mobiles and the Relay Server was about 5 ms.
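
As a rough guide to the data rate these figures imply, the short Python sketch below multiplies the trigger rate by the average frame size. It assumes that every mobile captures exactly one frame per trigger and that 1 MB = 1000 KB; neither assumption is stated above, and the function name is purely illustrative and not part of this repository. Pose data and protocol overhead are ignored.

```python
def stream_rate_mb_per_s(trigger_hz: float, frame_kb: float, num_mobiles: int = 1) -> float:
    """Approximate frame payload rate in MB/s.

    Assumes one frame per trigger per mobile and 1 MB = 1000 KB.
    """
    return trigger_hz * frame_kb * num_mobiles / 1000.0


# Figures reported for the 4DM recordings: 10 Hz triggers, ~160 KB frames, six mobiles.
per_mobile = stream_rate_mb_per_s(10, 160)
all_six = stream_rate_mb_per_s(10, 160, num_mobiles=6)

print(f"per mobile:  {per_mobile:.1f} MB/s")
print(f"six mobiles: {all_six:.1f} MB/s")
```

Under these assumptions each mobile sends roughly 1.6 MB/s of frame data, and the six mobiles together send roughly 9.6 MB/s to the server that receives the frames.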

Download (zip)

Citing our work

Please cite the following paper if you use our code or our dataset:

@article{Bortolon2021,
    title = {Multi-view data capture for dynamic object reconstruction using handheld augmented reality mobiles},
    author = {Bortolon, Matteo and Bazzanella, Luca and Poiesi, Fabio},
    journal = {Journal of Real-Time Image Processing},
    volume = {18},
    pages = {345--355},
    month = {Mar},
    year = {2021}
}

Acknowledgements

This research has received funding from the Fondazione CARITRO - Ricerca e Sviluppo programme 2018-2020.