HISNav - Habitat-based Instance segmentation, Slam and Navigation Dataset

Primary language: Python. License: MIT.

The dataset is available here.

Models

For your convenience, we provide the following models trained on HISNav.

| Model | Inference time per image | mAP (IoU=0.5) | Link |
|---|---|---|---|
| BlendMask | 20 ms | 0.3648 | download |
| SOLOv2 | 18 ms | 0.3778 | download |
| YOLACT++ | 26 ms | 0.2739 | download |
| Mask R-CNN (det2) | 31 ms | 0.3483 | download |
| Mask R-CNN (mmdet) | 34 ms | 0.3593 | download |
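
The mAP column above uses an IoU threshold of 0.5. As a minimal sketch of what that threshold means for instance masks (not the evaluation code used for these models), the snippet below computes the intersection over union of two boolean masks with NumPy. The function name `mask_iou` and the toy masks are illustrative assumptions.

```python
import numpy as np


def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union of two instance masks of equal shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0  # both masks are empty: treat as no overlap
    return float(np.logical_and(pred, gt).sum() / union)


# Toy 4x4 masks (hypothetical data, not taken from HISNav).
gt = np.zeros((4, 4), dtype=bool)
gt[0:2, 0:4] = True    # ground-truth instance covers 8 pixels
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 1:4] = True  # prediction covers 6 pixels, all inside gt

iou = mask_iou(pred, gt)  # 6 / 8 = 0.75
is_match = iou >= 0.5     # counts as a match at the IoU=0.5 threshold
print(iou, is_match)
```

A predicted instance is typically counted as a true positive only when its IoU with a ground-truth instance of the same class reaches the threshold.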

Paper

Real-Time Object Navigation With Deep Neural Networks and Hierarchical Reinforcement Learning, Staroverov, A., Yudin, D. A., Belkin, I., Adeshkin, V., Solomentsev, Y. K., & Panov, A. I. IEEE Access, 8, 195608-195621.

Data

HISNav is a dataset of robot movement tracks recorded in the Habitat virtual environment. The tracks were built on 49 unique scenes from Matterport3D that show rooms in different styles. Each scene has no more than 5 trajectories, with 3 different levels of noise in camera images and in actions.

Our goal is to study the robustness of the developed framework to noise. We use three levels of noise in images: no noise, light Gaussian noise, and strong Gaussian noise. Examples of images from the dataset are shown here:

SOLOv2 predictions:

Each RGB image has a resolution of 640x320, and the depth map has the same resolution. Each depth pixel contains a distance value in meters (from 0 to 100 m). Each image has corresponding ground-truth instance labels for 40 classes (wall, floor, chair, door, table, sofa, etc.).
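
To make the three image-noise levels described above concrete, here is a minimal sketch that adds zero-mean Gaussian noise to a 640x320 RGB image with NumPy. The sigma values in `NOISE_LEVELS` are hypothetical, since the actual noise parameters used for HISNav are not stated here, and the array shape assumes that 640x320 means width x height.

```python
import numpy as np


def add_gaussian_noise(image: np.ndarray, sigma: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Add zero-mean Gaussian noise to a uint8 image, clipped to [0, 255]."""
    if sigma <= 0:
        return image.copy()  # "no noise" level: return the image unchanged
    noise = rng.normal(0.0, sigma, size=image.shape)
    noisy = image.astype(np.float32) + noise
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


# Hypothetical standard deviations (in 8-bit intensity units); assumptions only.
NOISE_LEVELS = {"none": 0.0, "light": 5.0, "strong": 25.0}

rng = np.random.default_rng(0)
# Uniform gray test image; shape is (height, width, channels).
image = np.full((320, 640, 3), 128, dtype=np.uint8)

noisy_images = {
    name: add_gaussian_noise(image, sigma, rng)
    for name, sigma in NOISE_LEVELS.items()
}
```

Each entry of `noisy_images` keeps the original shape and `uint8` type, so the three variants can be fed to the same model for a robustness comparison.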

The whole dataset includes 135962 images and is split into three parts: train, val, and test. Information about the splits can be found in the table below. The split was designed to keep the training, validation, and test samples diverse and balanced.

| | train | val | test | total |
|---|---|---|---|---|
| Number of images | 72626 | 27952 | 35384 | 135962 |
| Number of unique scenes | 49 | 35 | 43 | |
| Number of tracks | 88 | 35 | 43 | |
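
As a quick check on the image counts in the table above, the following sketch recomputes the total and the share of each split. The numbers are copied from the table and the variable names are illustrative.

```python
# Image counts per split, copied from the table above.
splits = {"train": 72626, "val": 27952, "test": 35384}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

for name, frac in fractions.items():
    print(f"{name}: {splits[name]} images ({frac:.1%})")
print(f"total: {total} images")
```

The counts add up to the stated total of 135962 images, with roughly 53% in train, 21% in val, and 26% in test.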

The dataset is distributed in HDF5 file format. To extract the data in TUM format for evaluating visual SLAM methods, we provide the notebook tools/HISNav_to_TUM.ipynb.
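
For orientation, the TUM RGB-D benchmark stores a trajectory as a text file with one pose per line: `timestamp tx ty tz qx qy qz qw`. The sketch below only illustrates that line format and does not reproduce the conversion notebook. The helper `to_tum_lines` and the example poses are hypothetical, and how poses are laid out inside the HISNav HDF5 files is not described here.

```python
from typing import Iterable, List, Tuple

# (timestamp in seconds, translation (tx, ty, tz), unit quaternion (qx, qy, qz, qw))
Pose = Tuple[float, Tuple[float, float, float], Tuple[float, float, float, float]]


def to_tum_lines(poses: Iterable[Pose]) -> List[str]:
    """Format poses as TUM trajectory lines: timestamp tx ty tz qx qy qz qw."""
    lines = []
    for timestamp, (tx, ty, tz), (qx, qy, qz, qw) in poses:
        lines.append(
            f"{timestamp:.6f} {tx:.6f} {ty:.6f} {tz:.6f} "
            f"{qx:.6f} {qy:.6f} {qz:.6f} {qw:.6f}"
        )
    return lines


# Hypothetical two-pose track: identity rotation, moving 0.25 m along x.
poses = [
    (0.0, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0)),
    (0.1, (0.25, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0)),
]
tum_lines = to_tum_lines(poses)
print("\n".join(tum_lines))
```

Writing these lines to a text file gives a trajectory that TUM-style evaluation tools can read, provided the timestamps match those of the exported images.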

Visualize the results

python tools/vis_pred_json.py