This product identifies and labels 3D objects, such as cars, trees, bikes, and pedestrians, in images of everyday settings.

It uses a UNet, a convolutional neural network, to identify objects from voxel data. The product first takes point cloud data from the SemanticKITTI dataset and converts it to voxels. Put simply, a voxel is a 3D pixel: we visualize voxels as cubes, each carrying spatial information in three dimensions.
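The point-cloud-to-voxel conversion can be sketched roughly as follows. This is a minimal NumPy illustration, not the project's actual pipeline; the grid size and coordinate bounds are made-up parameters:

```python
import numpy as np

def voxelize(points: np.ndarray, voxel_dim: int = 4,
             lo: float = -1.0, hi: float = 1.0) -> np.ndarray:
    """Convert an (N, 3) point cloud into a boolean occupancy grid.

    Each axis of the cube [lo, hi)^3 is split into voxel_dim bins;
    a voxel is 'on' if at least one point falls inside it.
    """
    grid = np.zeros((voxel_dim,) * 3, dtype=bool)
    # Map coordinates to integer bin indices, clipping to stay in the grid.
    idx = np.floor((points - lo) / (hi - lo) * voxel_dim).astype(int)
    idx = np.clip(idx, 0, voxel_dim - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Two points in opposite corners of the cube occupy two voxels.
cloud = np.array([[-0.9, -0.9, -0.9], [0.9, 0.9, 0.9]])
occupancy = voxelize(cloud)
```

The real pipeline additionally carries per-voxel label information from SemanticKITTI; this sketch only shows the occupancy idea.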
Link to the Colab notebook that trains, tests, and visualizes the output of the model: https://colab.research.google.com/drive/1N3HXZgfRkDfz55EOIG3y0QOh7ADUBjaO?usp=sharing#scrollTo=pNvlnPUrTqyi

Link to YouTube video: https://youtu.be/M1d2wuesKYY
- `src/visualization/voxel_grid.py`: displays the model's predictions as a voxel grid.
- `src/cvtToPCDFunction.py`: converts raw data to point clouds.
- `src/dataGenerator.py`: data generator.
- `src/readLabels.py`: processes labels from the SemanticKITTI dataset.
- `src/dispPCDFile.py`: visualizes color-coded point cloud data.
- `src/test_datagen.py`: tests the data generator.
- `src/getLabels.py`: fetches object labels from the SemanticKITTI dataset.
- `src/generator/__init__.py`: imports DataGenerator.
- `src/generator/index_data.py`: indexes data by reformatting paths from the velodyne file.
- `src/generator/unpack_data/labels.py`: processes raw label data from SemanticKITTI.
- `src/generator/unpack_data/__init__.py`: imports for labels.py.
- `src/generator/unpack_data/velodyne.py`: converts raw velodyne data to point clouds.
- `src/generator/data_generator.py`: creates a usable voxel dataset from the original SemanticKITTI and Velodyne data.
- `src/generator/preprocess_data/__init__.py`: builds a voxel grid based on object labels.
- `src/getAllFiles.py`: lists all files in a directory.
- `src/model/__init__.py`: initializes the model and prints information about its status.
- `src/model/unet_model.py`: UNet convolutional neural network.
- `src/model/conv_blocks/down_conv.py`: down-convolutional layer.
- `src/model/conv_blocks/__init__.py`: imports the down_conv and trans_conv layers.
- `src/model/conv_blocks/trans_conv.py`: transposed convolutional layer.
- `src/__main__.py`: master script; handles model training and demo, as well as data download and environment setup.
- `src/visualization/__init__.py`: registers `visualization` as a module.
- Start by downloading the dataset with `python3 -m src dl-data`.
- Next, set up your environment with `python3 -m src setup-env`. Make sure to run `source venv/bin/activate` afterwards!
- You can now start training the model. Run `python3 -m src train-model <data dir> <checkpoint file> <epochs> <voxel_dim>`.
  - For instance, you can run `python3 -m src train-model data/dataset checkpoint.hdf5 10 256`. However, do note that a 3D grid of size `<voxel_dim>` cubed will be generated, so make sure you have the RAM.
- A new checkpoint file will be created at the end of each epoch. Furthermore, if training crashes, you can re-run the same command, and if `<checkpoint file>` exists, training will continue from it.
- Finally, you can try out your trained model with `python3 -m src demo-model <data dir> <checkpoint file> <voxel_dim>`. Make sure you use the same voxel dim as the one in the checkpoint file.
  - For instance, you can run `python3 -m src demo-model data/dataset checkpoint.hdf5 256`.
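As a rough guide to the RAM warning above, a dense grid's memory footprint grows cubically with `<voxel_dim>`. A back-of-the-envelope sketch (the 4-byte-float assumption and single-channel grid are illustrative; the project's actual dtype and channel count may differ):

```python
def voxel_grid_bytes(voxel_dim: int, bytes_per_voxel: int = 4) -> int:
    """Memory for one dense voxel_dim**3 grid, assuming 4-byte floats."""
    return voxel_dim ** 3 * bytes_per_voxel

# 256**3 float32 voxels is 64 MiB per grid; doubling voxel_dim to 512
# multiplies that by 8, to 512 MiB per grid.
mib = voxel_grid_bytes(256) / 2**20
```

This is why halving `<voxel_dim>` is usually the first thing to try if training runs out of memory.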
J. Behley, A. Milioto, C. Stachniss, M. Garbade, J. Gall, J. Quenzel, and S. Behnke, "SemanticKITTI Dataset Overview," SemanticKITTI - A Dataset for LiDAR-based Semantic Scene Understanding, 2020. [Online]. Available: http://www.semantic-kitti.org/dataset.html. [Accessed: 08-Dec-2021].