This product identifies and labels 3D objects, such as cars, trees, bikes, and pedestrians, in images of everyday settings.

It uses a UNet, a convolutional neural network, to identify objects from voxel data. The product first takes point cloud data from the SemanticKITTI dataset and converts it to voxels. Put simply, a voxel is a 3D pixel: we visualize voxels as cubes, each carrying spatial information in three dimensions.
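The point-cloud-to-voxel conversion can be sketched roughly as follows. This is a minimal NumPy illustration, not the project's actual pipeline; the grid size and coordinate bounds are made-up parameters:

```python
import numpy as np

def voxelize(points: np.ndarray, voxel_dim: int = 4,
             lo: float = -1.0, hi: float = 1.0) -> np.ndarray:
    """Convert an (N, 3) point cloud into a boolean occupancy grid.

    Each axis of the cube [lo, hi)^3 is split into voxel_dim bins;
    a voxel is 'on' if at least one point falls inside it.
    """
    grid = np.zeros((voxel_dim,) * 3, dtype=bool)
    # Map coordinates to integer bin indices, clipping to stay in the grid.
    idx = np.floor((points - lo) / (hi - lo) * voxel_dim).astype(int)
    idx = np.clip(idx, 0, voxel_dim - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Two points in opposite corners of the cube occupy two voxels.
cloud = np.array([[-0.9, -0.9, -0.9], [0.9, 0.9, 0.9]])
occupancy = voxelize(cloud)
```

The real pipeline additionally carries per-voxel label information from SemanticKITTI; this sketch only shows the occupancy idea.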
Link to the Colab notebook that trains, tests, and visualizes the output of the model: https://colab.research.google.com/drive/1N3HXZgfRkDfz55EOIG3y0QOh7ADUBjaO?usp=sharing#scrollTo=pNvlnPUrTqyi

Link to YouTube video: https://youtu.be/M1d2wuesKYY
- `src/visualization/voxel_grid.py`: displays the model's predictions as a voxel grid.
- `src/cvtToPCDFunction.py`: converts raw data to point clouds.
- `src/dataGenerator.py`: data generator.
- `src/readLabels.py`: processes labels from the SemanticKITTI dataset.
- `src/dispPCDFile.py`: visualizes color-coded point cloud data.
- `src/test_datagen.py`: tests the data generator.
- `src/getLabels.py`: fetches object labels from the SemanticKITTI dataset.
- `src/generator/__init__.py`: imports DataGenerator.
- `src/generator/index_data.py`: indexes data by reformatting paths from the velodyne file.
- `src/generator/unpack_data/labels.py`: processes raw label data from SemanticKITTI.
- `src/generator/unpack_data/__init__.py`: imports for labels.py.
- `src/generator/unpack_data/velodyne.py`: converts raw velodyne data to point clouds.
- `src/generator/data_generator.py`: creates a usable voxel dataset from the original SemanticKITTI and Velodyne data.
- `src/generator/preprocess_data/__init__.py`: builds a voxel grid based on object labels.
- `src/getAllFiles.py`: lists all files in a directory.
- `src/model/__init__.py`: initializes the model and prints information about its status.
- `src/model/unet_model.py`: UNet convolutional neural network.
- `src/model/conv_blocks/down_conv.py`: down-convolutional layer.
- `src/model/conv_blocks/__init__.py`: imports the down_conv and trans_conv layers.
- `src/model/conv_blocks/trans_conv.py`: transposed convolutional layer.
- `src/__main__.py`: master script; handles model training and demo, as well as data download and environment setup.
- `src/visualization/__init__.py`: registers `visualization` as a module.
- Start by downloading the dataset with `python3 -m src dl-data`.
- Next, set up your environment with `python3 -m src setup-env`. Make sure to run `source venv/bin/activate` afterwards!
- You can now start training the model. Run `python3 -m src train-model <data dir> <checkpoint file> <epochs> <voxel_dim>`.
  - For instance, you can run `python3 -m src train-model data/dataset checkpoint.hdf5 10 256`. However, do note that a 3D grid of size `<voxel_dim>` cubed will be generated, so make sure you have the RAM.
- A new checkpoint file will be created at the end of each epoch. Furthermore, if training crashes, you can re-run the same command, and if `<checkpoint file>` exists, training will continue from it.
- Finally, you can try out your trained model with `python3 -m src demo-model <data dir> <checkpoint file> <voxel_dim>`. Make sure you use the same voxel dim as the one in the checkpoint file.
  - For instance, you can run `python3 -m src demo-model data/dataset checkpoint.hdf5 256`.
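As a rough guide to the RAM warning above, a dense grid's memory footprint grows cubically with `<voxel_dim>`. A back-of-the-envelope sketch (the 4-byte-float assumption and single-channel grid are illustrative; the project's actual dtype and channel count may differ):

```python
def voxel_grid_bytes(voxel_dim: int, bytes_per_voxel: int = 4) -> int:
    """Memory for one dense voxel_dim**3 grid, assuming 4-byte floats."""
    return voxel_dim ** 3 * bytes_per_voxel

# 256**3 float32 voxels is 64 MiB per grid; doubling voxel_dim to 512
# multiplies that by 8, to 512 MiB per grid.
mib = voxel_grid_bytes(256) / 2**20
```

This is why halving `<voxel_dim>` is usually the first thing to try if training runs out of memory.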
J. Behley, A. Milioto, C. Stachniss, M. Garbade, J. Gall, J. Quenzel, and S. Behnke, "SemanticKITTI Dataset Overview," SemanticKITTI - A Dataset for LiDAR-based Semantic Scene Understanding, 2020. [Online]. Available: http://www.semantic-kitti.org/dataset.html. [Accessed: 08-Dec-2021].