This dataset provides synthetic and real outdoor sequences recorded specifically for place recognition applications using both flying and hand-held setups. It was made publicly available with the paper "Real-time Wide-baseline Place Recognition using Depth Completion" by Fabiola Maffra, Lucas Teixeira, Zetao Chen and Margarita Chli, published in IEEE Robotics and Automation Letters (RA-L), 2019 [paper].
If you use this dataset, please cite the following publication:
@article{maffra2019real,
title={Real-time Wide-baseline Place Recognition using Depth Completion},
author={Maffra, Fabiola and Teixeira, Lucas and Chen, Zetao and Chli, Margarita},
journal={IEEE Robotics and Automation Letters},
year={2019},
publisher={IEEE}
}
The L'Agout synthetic dataset was produced using aerial pictures of “Maisons sur l’Agout”, depicting medieval houses with balconies over the river Agout. We produced 4 sequences of 100 meters each with a laterally moving drone carrying a camera facing the houses at 0° (i.e. pointing forwards), 15°, 30°, and 45° from the horizon. The position of the drone was chosen so that the buildings completely fill the camera frustum, which guarantees that the sequences differ only in viewpoint, without any change in scale. Combining sequences recorded at different angles simulates large viewpoint changes and yields a very challenging place recognition dataset.
L'Agout Sequence 0° - Bagfile - Youtube
L'Agout Sequence 15° - Bagfile - Youtube
L'Agout Sequence 30° - Bagfile - Youtube
L'Agout Sequence 45° - Bagfile - Youtube
The ground truth for each dataset was automatically annotated using 2 different sequences, a query sequence and a reference sequence. The sequences used as reference and query for the available ground truth are:
- L'Agout Sequence 0° & Sequence 15°:
  - Reference Dataset: L'Agout Sequence 0°
  - Query Dataset: L'Agout Sequence 15°
- L'Agout Sequence 0° & Sequence 30°:
  - Reference Dataset: L'Agout Sequence 0°
  - Query Dataset: L'Agout Sequence 30°
- L'Agout Sequence 0° & Sequence 45°:
  - Reference Dataset: L'Agout Sequence 0°
  - Query Dataset: L'Agout Sequence 45°
More details about the ground truth are available at the links below.
L'Agout Sequence 0° & Sequence 15° - Ground truth
L'Agout Sequence 0° & Sequence 30° - Ground truth
L'Agout Sequence 0° & Sequence 45° - Ground truth
The visual-inertial data reproduces the Skybotix VI-Sensor at the same resolution as the real datasets. The calibration parameters used in the data generation are listed below. Note that T_SC is the transformation from the camera frame to the sensor (IMU) frame.
- {T_SC:
[1.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0],
image_dimension: [752, 480],
distortion_coefficients: [0.0, 0.0, -0.0, 0.0],
focal_length: [455.0, 455.0],
principal_point: [376.5, 240.5]}
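Since the synthetic camera has zero distortion and an identity T_SC, the parameters above reduce to a plain pinhole model. The following is a minimal sketch of how they could be used to project a 3D point into the image; the function names are illustrative and not part of the dataset tooling, and the camera frame is assumed to have z pointing forwards.

```python
# Intrinsics copied from the calibration block above (synthetic sequences).
FOCAL_LENGTH = (455.0, 455.0)      # fx, fy in pixels
PRINCIPAL_POINT = (376.5, 240.5)   # cx, cy in pixels
IMAGE_DIMENSION = (752, 480)       # width, height in pixels


def camera_matrix(focal_length, principal_point):
    """Return the 3x3 pinhole matrix K as a list of rows."""
    fx, fy = focal_length
    cx, cy = principal_point
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def project(point_c, focal_length=FOCAL_LENGTH, principal_point=PRINCIPAL_POINT):
    """Project a 3D point given in the camera frame (assumed z forward) to pixels."""
    x, y, z = point_c
    if z <= 0.0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    fx, fy = focal_length
    cx, cy = principal_point
    return (fx * x / z + cx, fy * y / z + cy)


def in_image(pixel, image_dimension=IMAGE_DIMENSION):
    """Check whether a pixel coordinate falls inside the image bounds."""
    u, v = pixel
    width, height = image_dimension
    return 0.0 <= u < width and 0.0 <= v < height
```

With these values, a point on the optical axis such as `(0.0, 0.0, 10.0)` projects to the principal point `(376.5, 240.5)`.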
The Corvin dataset was produced using aerial footage of Corvin Castle. We produced 3 sequences at 0°, 30°, and 45° while flying a 300-meter circular trajectory around the castle. These sequences capture a scene with a large range of depths. Combining sequences recorded at different angles simulates large viewpoint changes and yields a very challenging place recognition dataset.
Corvin Sequence 0° - Bagfile - Youtube
Corvin Sequence 30° - Bagfile - Youtube
Corvin Sequence 45° - Bagfile - Youtube
The ground truth for each dataset was automatically annotated using 2 different sequences, a query sequence and a reference sequence. The sequences used as reference and query for the available ground truth are:
- Corvin Sequence 0° & Sequence 30°:
  - Reference Dataset: Corvin Sequence 0°
  - Query Dataset: Corvin Sequence 30°
- Corvin Sequence 0° & Sequence 45°:
  - Reference Dataset: Corvin Sequence 0°
  - Query Dataset: Corvin Sequence 45°
More details about the ground truth are available at the links below.
Corvin Sequence 0° & Sequence 30° - Ground truth
Corvin Sequence 0° & Sequence 45° - Ground truth
The visual-inertial data reproduces the Skybotix VI-Sensor at the same resolution as the real datasets. The calibration parameters used in the data generation are listed below. Note that T_SC is the transformation from the camera frame to the sensor (IMU) frame.
- {T_SC:
[1.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0],
image_dimension: [752, 480],
distortion_coefficients: [0.0, 0.0, -0.0, 0.0],
focal_length: [455.0, 455.0],
principal_point: [376.5, 240.5]}
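For an undistorted pinhole camera, the focal length and image size determine the field of view. The sketch below estimates it from the parameters above, assuming the principal point sits at (approximately) the image center, which holds for these values; the helper name is illustrative.

```python
import math


def field_of_view_deg(size_px, focal_px):
    """Full field of view in degrees for a centered, undistorted pinhole camera."""
    return math.degrees(2.0 * math.atan(size_px / (2.0 * focal_px)))


# image_dimension and focal_length from the calibration block above
hfov = field_of_view_deg(752, 455.0)   # horizontal field of view
vfov = field_of_view_deg(480, 455.0)   # vertical field of view

print(f"horizontal FOV: {hfov:.1f} deg, vertical FOV: {vfov:.1f} deg")
```

This comes out at roughly 79° horizontally and 56° vertically, which gives a sense of how much of the scene each synthetic frame covers.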
Two sequences were recorded at the end of the day in the old city of Zurich. They exhibit not only small but also challenging viewpoint variations, since the narrow passages in this area provide a wide range of viewpoints of the same places. The dataset comprises two traverses along the same route, each covering approximately 230 meters. In total, 10 minutes of data were recorded.
Old City Sequence 1 - Bagfile - Youtube
Old City Sequence 2 - Bagfile - Youtube
The ground truth for each dataset was automatically annotated using 2 different sequences, a query sequence and a reference sequence. The sequences used as reference and query for the available ground truth are:
- Old City Sequence 1 & 2:
  - Reference Dataset: Old City Sequence 1
  - Query Dataset: Old City Sequence 2
More details about the ground truth are available at the link below.
Old City Sequence 1 & 2 - Ground truth
The images were captured using a VI-Sensor and calibrated using ETHZ ASL Kalibr. The calibration parameters are listed below. Note that T_SC is the transformation from the camera frame to the sensor (IMU) frame. The intrinsic values correspond to camera0.
- {T_SC:
[0.9997754002442455, -0.021161371053313276, -0.0011599317232830833, -0.03742703361193287,
0.021167568626536626, 0.999760145165724, 0.00562015806284527, 0.0059817697251900205,
0.0010407232579056848, -0.005643448711071745, 0.9999835340553093, 0.0005720705107382817,
0.0, 0.0, 0.0, 1.0],
image_dimension: [752, 480],
distortion_coefficients: [0.00953484510402244, -0.017544574626951994, 0.0196682925507835, -0.006035463565306639],
distortion_type: equidistant,
focal_length: [470.9855284431703, 470.85690553639535],
principal_point: [376.3839029041999, 247.67863814640694]}
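The 16 values of T_SC can be read as a 4x4 homogeneous transform. The following sketch assumes row-major ordering (suggested by the final row `0, 0, 0, 1`) and shows how to map points from the camera frame to the sensor frame and how to obtain the inverse transform; the function names are illustrative.

```python
# T_SC copied from the calibration block above, assumed to be in row-major order.
T_SC_FLAT = [
    0.9997754002442455, -0.021161371053313276, -0.0011599317232830833, -0.03742703361193287,
    0.021167568626536626, 0.999760145165724, 0.00562015806284527, 0.0059817697251900205,
    0.0010407232579056848, -0.005643448711071745, 0.9999835340553093, 0.0005720705107382817,
    0.0, 0.0, 0.0, 1.0,
]


def to_matrix(flat):
    """Reshape 16 row-major values into a 4x4 list of rows."""
    if len(flat) != 16:
        raise ValueError("expected 16 values for a 4x4 transform")
    return [list(flat[4 * r:4 * r + 4]) for r in range(4)]


def invert_rigid(T):
    """Invert a rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    R = [row[:3] for row in T[:3]]
    t = [row[3] for row in T[:3]]
    Rt = [[R[c][r] for c in range(3)] for r in range(3)]
    t_inv = [-sum(Rt[r][c] * t[c] for c in range(3)) for r in range(3)]
    return [Rt[r] + [t_inv[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform(T, p):
    """Apply a 4x4 rigid transform to a 3D point."""
    return tuple(sum(T[r][c] * p[c] for c in range(3)) + T[r][3] for r in range(3))


T_SC = to_matrix(T_SC_FLAT)   # camera frame -> sensor (IMU) frame
T_CS = invert_rigid(T_SC)     # sensor (IMU) frame -> camera frame

# The camera origin expressed in the sensor frame is the translation column of T_SC.
camera_origin_in_sensor = transform(T_SC, (0.0, 0.0, 0.0))
```

Using the transpose-based inverse rather than a general matrix inverse keeps the result a valid rigid transform, provided the rotation block is orthonormal.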
The Clausius Street dataset consists of an air sequence and a hand-held sequence recorded on the same day along a residential street, with the camera facing the buildings on one side of the street. Combined, these sequences present extreme viewpoint changes, with images captured from a very wide baseline (air to ground), as well as challenging illumination changes.
Clausius Street Air Sequence - Bagfile - Youtube
Clausius Street Ground Sequence - Bagfile - Youtube
The images were captured using a VI-Sensor and calibrated using ETHZ ASL Kalibr. The calibration parameters are listed below. Note that T_SC is the transformation from the camera frame to the sensor (IMU) frame. The intrinsic values correspond to camera0.
- {T_SC:
[ 0.9999921569165363, 0.003945890103835121, 0.0003406709575200133, -0.030976405894694664,
-0.003948017768440125, 0.9999711543561547, 0.0064887295612456805, 0.003944069243840622,
-0.00031505731688472255, -0.0064900236445916415, 0.9999788899431723, -0.016723945219020563,
0.0, 0.0, 0.0, 1.0],
image_dimension: [752, 480],
distortion_coefficients: [0.0038403216668672986, 0.025065957244781098, -0.05227986912373674, 0.03635919730588422],
distortion_type: equidistant,
focal_length: [464.2604856754006, 463.0164764480498],
principal_point: [372.2582270417875, 235.05442086962864]}
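The real sequences use the equidistant distortion type, so a plain pinhole projection is not sufficient. The sketch below assumes the four coefficients follow the Kannala-Brandt style equidistant model (the formulation commonly associated with Kalibr's `equidistant` type and with fisheye camera models in general); treat it as an illustration to be checked against your own tooling, not as the authors' implementation. The function name is hypothetical.

```python
import math

# Values copied from the calibration block above (Clausius Street, camera0).
FOCAL_LENGTH = (464.2604856754006, 463.0164764480498)
PRINCIPAL_POINT = (372.2582270417875, 235.05442086962864)
DISTORTION = (0.0038403216668672986, 0.025065957244781098,
              -0.05227986912373674, 0.03635919730588422)


def project_equidistant(point_c, focal_length=FOCAL_LENGTH,
                        principal_point=PRINCIPAL_POINT, distortion=DISTORTION):
    """Project a camera-frame 3D point using an assumed equidistant model."""
    x, y, z = point_c
    if z <= 0.0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    fx, fy = focal_length
    cx, cy = principal_point
    k1, k2, k3, k4 = distortion

    a, b = x / z, y / z            # normalized image coordinates
    r = math.hypot(a, b)           # radial distance from the optical axis
    if r < 1e-12:
        scale = 1.0                # on the optical axis: no distortion
    else:
        theta = math.atan(r)       # angle between the ray and the optical axis
        th2 = theta * theta
        theta_d = theta * (1.0 + k1 * th2 + k2 * th2 ** 2
                           + k3 * th2 ** 3 + k4 * th2 ** 4)
        scale = theta_d / r
    return (fx * scale * a + cx, fy * scale * b + cy)
```

Under this model a point on the optical axis still maps to the principal point, while off-axis points are pulled slightly towards the image center compared with an undistorted pinhole projection.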
For any questions or bug reports, please create an issue or contact me at fmaffra@mavt.ethz.ch.