Comprehensive list of Autonomous Vehicles Datasets (papers and dataset download links) with multiple sensor modalities (LiDAR, RADAR, Stereo Camera, Thermal Camera etc.).
We Need You!
Please help grow this list by submitting a pull request.
A wide variety of sensors are used in autonomous vehicles, and this diversity of sensing modalities helps perception systems cope with different weather and lighting conditions. The following is a list of popular autonomous driving datasets published to date.
Name | Published Year | Sensor Type(s) | Recording Area(s) | Description | Dataset Download and Paper(s) |
---|---|---|---|---|---|
Argoverse 2 | 2023 | Camera, LiDAR | United States | Argoverse 2 (AV2) is a collection of three datasets for self-driving research. The Sensor Dataset contains 1,000 sequences of multimodal data, including high-resolution imagery from seven ring cameras, two stereo cameras, lidar point clouds, and 6-DOF map-aligned pose, with annotations for 26 object categories. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose, and the Motion Forecasting Dataset contains 250,000 scenarios for predicting future motion of "scored actors" in complex interactions with an autonomous vehicle. All datasets include HD Maps with 3D lane and crosswalk geometry from six cities. | Dataset, Paper |
ONCE Dataset (One millioN sCenEs) - Huawei Corp. | 2021 | Camera, LiDAR | China | The ONCE (One millioN sCenEs) dataset can be used for 3D object detection in autonomous driving scenarios. It consists of 1 million LiDAR scenes and 7 million corresponding camera images. The data is selected from 144 driving hours, which is 20x longer than other large 3D autonomous driving datasets (e.g., nuScenes and Waymo), and it is collected across a range of different areas, periods, and weather conditions. It has 15k fully annotated scenes with 5 classes (Car, Bus, Truck, Pedestrian, Cyclist). Every labeled and unlabeled scene is tagged with one of 3 weather conditions (sunny, cloudy, rainy) and one of 4 time periods (morning, noon, afternoon, night). The per-scene metadata (weather, period, timestamp, pose, calibration, and annotations) is stored in a single JSON file; a minimal loading sketch is given after the table. | Dataset , Paper |
All-In-One Drive (AIODrive) | 2020 | Camera, LiDAR, Radar | NA (synthetic) | A Large-Scale Comprehensive Perception Dataset with High-Density Long-Range Point Clouds. It is a large-scale synthetic dataset that provides comprehensive sensors, annotations, and environmental variations. | Dataset, Paper |
Ford Multi-AV Seasonal dataset | 2020 | Camera, LiDAR | United States (Michigan) | The multi-agent seasonal dataset was collected by a fleet of Ford autonomous vehicles on different days and times during 2017–18. The vehicles were manually driven on a route in Michigan covering a mix of driving scenarios, including the Detroit Airport, freeways, city centers, a university campus, and suburban neighborhoods. The dataset captures the seasonal variation in weather, lighting, construction, and traffic conditions experienced in dynamic urban environments. | Dataset, Paper |
Dense Depth for Autonomous Driving (DDAD) — Toyota Research Institute | 2020 | Camera, LiDAR | United States (San Francisco, Bay Area, Cambridge, Detroit, Ann Arbor) and Japan (Tokyo, Odaiba) | DDAD is a new autonomous driving benchmark from TRI (Toyota Research Institute) for long range (up to 250m) and dense depth estimation in challenging and diverse urban conditions. It contains monocular videos and accurate ground-truth depth (across a full 360 degree field of view) generated from high-density LiDARs mounted on a fleet of self-driving cars operating in a cross-continental setting. | Dataset , Paper |
PandaSet | 2020 | Camera, LiDAR | United States (San Francisco, El Camino Real from Palo Alto to San Mateo) | PandaSet combines Hesai’s best-in-class LiDAR sensors with Scale AI’s high-quality data annotation. PandaSet features data collected using a forward-facing LiDAR with image-like resolution (PandarGT) as well as a mechanical spinning LiDAR (Pandar64). The collected data was annotated with a combination of cuboid and segmentation annotation (Scale 3D Sensor Fusion Segmentation). | Dataset |
Canadian Adverse Driving Conditions (CADC) | 2020 | Camera, LiDAR | Canada (Waterloo) | The Canadian Adverse Driving Conditions (CADC) dataset was collected with the Autonomoose autonomous vehicle platform, based on a modified Lincoln MKZ. The dataset, collected during winter within the region of Waterloo, Canada, is the first autonomous vehicle dataset that focuses on adverse driving conditions specifically. It contains 7,000 frames of annotated data collected through a variety of winter weather conditions from 8 cameras (Ximea MQ013CG-E2), a LiDAR (VLP-32C), and a GNSS+INS system (Novatel OEM638). The sensors are time-synchronized and calibrated, with the intrinsic and extrinsic calibrations included in the dataset. LiDAR frame annotations that represent ground truth for 3D object detection and tracking have been provided by Scale AI. | Dataset, Paper |
A2D2: Audi Autonomous Driving Dataset | 2020 | Camera, LiDAR, Bus data | Germany (Gaimersheim, Munich, and Ingolstadt) | The dataset consists of simultaneously recorded images and 3D point clouds, together with 3D bounding boxes, semantic segmentation, instance segmentation, and data extracted from the automotive bus. The sensor suite consists of six cameras and five LiDAR units, providing full 360-degree coverage. The recorded data is time-synchronized and mutually registered. The dataset features 2D semantic segmentation, 3D point clouds, 3D bounding boxes, and vehicle bus data. All sensor signals are timestamped in UTC format. | Dataset, Paper |
A*3D Dataset | 2019 | Camera, LiDAR | Singapore | The A*3D dataset consists of RGB images and LiDAR data with significant diversity of scene, time, and weather. The dataset features high-density images (≈10 times more than the pioneering KITTI dataset), heavy occlusions, and a large number of night-time frames (≈3 times the nuScenes dataset), addressing the gaps in existing datasets and pushing autonomous driving research toward more challenging, highly diverse environments. The data collection covers the whole of Singapore, including highways, neighborhood roads, tunnels, urban, suburban, and industrial areas, HDB car parks, coastline, etc. | Dataset, Paper |
EuroCity Persons (ECP) | 2019 | Camera, LiDAR | Europe (31 cities in 12 countries) | The EuroCity Persons dataset provides a large number of highly diverse, accurate, and detailed annotations of pedestrians, cyclists, and other riders in urban traffic scenes. The images for this dataset were collected on-board a moving vehicle in 31 cities of 12 European countries. With over 238,200 person instances manually labeled in over 47,300 images, EuroCity Persons is nearly one order of magnitude larger than person datasets used previously for benchmarking. The dataset furthermore contains a large number of person orientation annotations (over 211,200). | Dataset, Paper |
Oxford RobotCar Dataset | 2019 and 2016 | 2019: Camera, Radar, LiDAR. 2016: Camera, LiDAR | UK (Oxford) | 2019: The Oxford Radar RobotCar dataset can be used for researching scene understanding using millimetre-wave FMCW scanning radar data. The target application is autonomous vehicles, where this modality is robust to environmental conditions such as fog, rain, snow, or lens flare, which typically challenge other sensor modalities such as vision and LiDAR. The data were gathered in January 2019 over thirty-two traversals of a central Oxford route spanning a total of 280 km of urban driving, encompassing a variety of weather, traffic, and lighting conditions. 2016: The original dataset release consisted of over 20 TB of vehicle-mounted monocular and stereo imagery, 2D and 3D LiDAR, as well as inertial and GPS data collected over a year of driving in Oxford, UK. More than 100 traversals of a 10 km route were performed over this period to capture scene variation over a range of timescales, from the 24-hour day/night illumination cycle to long-term seasonal variations. | 2019 dataset, 2019 Paper, 2016 dataset, 2016 Paper |
Waymo Open Dataset | 2021 and 2019 | Camera, LiDAR | United States (San Francisco, Mountain View, Los Angeles, Detroit, Seattle, Phoenix) | The Waymo Open Dataset was first launched in August 2019 with a perception dataset comprising high-resolution sensor data and labels for 1,950 segments. In March 2021, it was expanded to also include a motion dataset comprising object trajectories and corresponding 3D maps for 103,354 segments. The Waymo Open Dataset covers a wide variety of environments, objects, and weather conditions (downtown, suburban, daylight, night time, pedestrians, cyclists, construction, diverse weather). | Dataset, 2021 Paper, 2019 Paper |
Lyft Level 5 Dataset | 2019 | Camera, LiDAR, Radar | United States (Palo Alto) | The dataset can be used for motion prediction and provides over 1,000 hours of data, collected by a fleet of 20 autonomous vehicles along a fixed route in Palo Alto, California, over a four-month period. It consists of 170,000 scenes, where each scene is 25 seconds long and captures the perception output of the self-driving system, which encodes the precise positions and motions of nearby vehicles, cyclists, and pedestrians over time. On top of this, the dataset contains a high-definition semantic map with 15,242 labeled elements and a high-definition aerial view of the area. The semantic map provides context about traffic agents and their motion, with over 4,000 manually annotated semantic elements, including lane segments, pedestrian crosswalks, stop signs, parking zones, speed bumps, and speed humps. The dataset includes elements from real-world scenarios, including vehicles, pedestrians, intersections, and multi-lane traffic. | Dataset, Paper |
Argoverse | 2019 | Camera, LiDAR | United States (Pittsburgh, Miami) | Argoverse was collected by a fleet of autonomous vehicles in Pittsburgh and Miami. The Argoverse 3D Tracking dataset includes 360-degree images from 7 cameras with overlapping fields of view, 3D point clouds from long-range LiDAR, 6-DOF pose, and 3D track annotations. It provides forward-facing stereo imagery. The Argoverse Motion Forecasting dataset includes more than 300,000 5-second tracked scenarios with a particular vehicle identified for trajectory forecasting. Argoverse is the first autonomous vehicle dataset to include “HD maps” with 290 km of mapped lanes with geometric and semantic metadata. It provides rich semantic information about road infrastructure and traffic rules. | Dataset, Paper |
nuScenes dataset | 2019 | Camera, LiDAR, Radar | United States (Boston), Singapore | nuTonomy scenes (nuScenes) carries the full autonomous vehicle sensor suite: 6 cameras, 5 radars, and 1 LiDAR, all with a full 360-degree field of view. nuScenes comprises 1,000 scenes, each 20 seconds long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. The data is from Boston (Seaport and South Boston) and Singapore (One North, Holland Village, and Queenstown), two cities that are known for their dense traffic and highly challenging driving situations. There is diversity across locations in terms of vegetation, buildings, vehicles, road markings, and right- versus left-hand traffic. | Dataset, Paper |
BLVD: Building A Large-scale 5D Semantics Benchmark for Autonomous Driving | 2019 | Camera, 3D LiDAR | China (Changshu) | BLVD, a large-scale 5D semantics benchmark, aims to provide a platform for the tasks of dynamic 4D (3D+temporal) tracking, 5D (4D+interactive) interactive event recognition, and intention prediction. The BLVD dataset contains 654 high-resolution video clips comprising 120k frames extracted from Changshu, Jiangsu Province, China, where the Intelligent Vehicle Proving Center of China (IVPCC) is located. The frame rate is 10 fps for both the RGB data and the 3D point clouds. All frames are fully annotated, yielding 249,129 3D annotations, 4,902 independent individuals for tracking with an overall length of 214,922 points, 6,004 valid fragments for 5D interactive event recognition, and 4,900 individuals for 5D intention prediction. These tasks span four kinds of scenarios depending on object density (low and high) and light conditions (daytime and nighttime). | Dataset, Paper |
H3D - Honda 3D Dataset | 2019 | Camera, LiDAR | United States (San Francisco) | The Honda Research Institute 3D Dataset (H3D) is a large-scale full-surround 3D multi-object detection and tracking dataset collected using a 3D LiDAR scanner. H3D comprises 160 crowded and highly interactive traffic scenes with a total of 1 million labeled instances in 27,721 frames. With its unique dataset size, rich annotations, and complex scenes, H3D is gathered to stimulate research on full-surround 3D multi-object detection and tracking. It is gathered from the HDD dataset, a large-scale naturalistic driving dataset collected in the San Francisco Bay Area. H3D consists of: 1) a full 360-degree LiDAR dataset (dense point clouds from a Velodyne-64); 2) 160 crowded and highly interactive traffic scenes; 3) 1,071,302 3D bounding box labels; 4) 8 common classes of traffic participants (manually annotated at 2 Hz and linearly propagated for 10 Hz data); 5) benchmarks of state-of-the-art algorithms for 3D-only detection and tracking. | Dataset, Paper |
ApolloScape | 2019 | Camera, LiDAR | China | ApolloScape contains much larger and richer labeling, including holistic semantic dense point clouds for each site, stereo imagery, per-pixel semantic labeling, lane-mark labeling, instance segmentation, 3D car instances, and highly accurate location for every frame in various driving videos from multiple sites, cities, and times of day. The dataset contains 140K+ annotated images with lane annotations. For 3D object detection, it annotates 3D bounding boxes of objects in 6K+ point clouds. It consists of data from 4 regions in China in various weather conditions. | Dataset, Paper |
DBNet | 2018 | Camera, LiDAR | China | DBNet is a large-scale dataset for driving behavior research. It includes aligned video, point clouds, GPS, and driver behavior (vehicle speed and steering wheel angle), capturing 1,000 km of real-world driving data. The LiDAR-Video dataset provides large-scale, high-quality point clouds scanned by a Velodyne laser, videos recorded by a dashboard camera, and standard drivers' behaviors. | Dataset, Paper |
KAIST multispectral dataset (2018) and KAIST Multispectral Pedestrian dataset (2015) | 2018 and 2015 | 2018 - Camera (Visual and Thermal), LiDAR. 2015 - Camera (Visual and Thermal) | South Korea (Seoul) | 2018: The dataset provides different perspectives of the world captured in coarse time slots (day and night), in addition to fine time slots (sunrise, morning, afternoon, sunset, night, and dawn). For all-day perception of autonomous systems, a thermal imaging camera can be used. Toward this goal, the authors developed a multi-sensor platform supporting a co-aligned RGB/thermal camera, RGB stereo, 3D LiDAR, and inertial sensors (GPS/IMU), together with a related calibration technique. 2015: The multispectral pedestrian dataset provides well-aligned color-thermal image pairs captured by beam-splitter-based special hardware. The color-thermal dataset is as large as previous color-based datasets and provides dense annotations, including temporal correspondences. With this dataset, the team introduced multispectral ACF, an extension of aggregated channel features (ACF) that simultaneously handles color-thermal image pairs; multispectral ACF reduces the average miss rate of ACF by 15%. | 2018 Dataset, 2018 paper, 2018 paper download option-2, 2015 dataset, 2015 paper |
FLIR ADAS Dataset | 2018 | Camera (Visual and Thermal) | United States (Santa Barbara) | The dataset features a compilation of more than 10,000 annotated thermal images of people, cars, other vehicles, bicycles, and dogs in daytime and nighttime scenarios. It was primarily captured on streets and highways in Santa Barbara, California, USA, with clear-sky conditions both day and night. Annotations exist for the thermal images based on the COCO annotation scheme; however, no annotations exist for the corresponding visible images. | Dataset |
KITTI | 2015, 2013, and 2012 | Camera, LiDAR | Germany (Karlsruhe) | For the tasks of stereo, optical flow, visual odometry, 3D object detection, and 3D tracking, the authors equipped a standard station wagon with two high-resolution color and grayscale video cameras. Accurate ground truth is provided by a Velodyne laser scanner and a GPS localization system. The data was captured by driving around the mid-size city of Karlsruhe, in rural areas, and on highways. Up to 15 cars and 30 pedestrians are visible per image. Besides providing all data in raw format, benchmarks are extracted for each task. | Dataset, 2012 Paper, 2012 Paper download option-2, 2013 Paper-1, 2013 paper-1 download option-2, 2013 Paper-2, 2013 Paper-2 download option-2, 2015 Paper, 2015 Paper download option-2 |
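
The ONCE entry above notes that each scene's metadata (weather, period, timestamp, pose, calibration, annotations) lives in a single JSON file per scene. The snippet below is a minimal sketch of how such a file could be inspected with standard Python; the directory layout and the field names `weather`, `period`, `timestamp`, and `annos`/`annotations` are illustrative assumptions and should be adapted to the schema documented in the official devkit.

```python
import json
from pathlib import Path


def load_scene_metadata(scene_json: Path) -> dict:
    """Load a per-scene metadata JSON file (one file per ONCE-style scene)."""
    with scene_json.open("r", encoding="utf-8") as f:
        return json.load(f)


def summarize_scene(meta: dict) -> None:
    """Print a rough summary of the scene metadata.

    The key names checked here are assumptions, not the official ONCE
    schema -- adjust them to whatever fields are actually present.
    """
    print("top-level keys:", sorted(meta.keys()))
    for key in ("weather", "period", "timestamp"):
        if key in meta:
            print(f"{key}: {meta[key]}")
    # Annotations, if present, are typically a list of labeled objects.
    annos = meta.get("annos") or meta.get("annotations")
    if annos is not None:
        print("annotation entries:", len(annos))


if __name__ == "__main__":
    # Hypothetical path; replace with a real scene file from the download.
    meta = load_scene_metadata(Path("data/once/000027/000027.json"))
    summarize_scene(meta)
```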