Codebase for deep autonomous driving perception tasks

pytorch-auto-drive is a pure Python codebase that includes semantic segmentation and lane detection models, built on PyTorch with mixed precision training. Being pure Python means, for example, that you do not need MATLAB to test on CULane.

This repository is under active development, but the results of the uploaded models are stable.

Highlights

Various methods tested on a wide range of backbones; modular, easy-to-understand code; image/keypoint loading, transformations and visualizations; mixed precision training; and TensorBoard logging.

Models from this repo are faster to train (trainable on a single card) and often perform better than other implementations. See the wiki for the reasons and for technical specifications of the models.
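As a rough illustration of the mixed precision training mentioned above, here is a minimal sketch of one training step that assumes PyTorch's native AMP utilities (`torch.autocast` and `torch.cuda.amp.GradScaler`). The `train_step` function, the toy one-layer model and the random data are hypothetical and are not taken from this repository, whose actual training loop may differ.

```python
import torch
import torch.nn as nn


def train_step(model, images, labels, optimizer, scaler, criterion, device):
    """Run one optimization step, using autocast only when on a CUDA device."""
    use_amp = device.type == "cuda"
    optimizer.zero_grad()
    # Inside autocast, eligible ops run in reduced precision on CUDA.
    with torch.autocast(device_type=device.type, enabled=use_amp):
        logits = model(images)
        loss = criterion(logits, labels)
    # The scaler multiplies the loss to avoid fp16 gradient underflow;
    # when disabled (CPU), these calls fall back to plain backward/step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Toy per-pixel classifier with 19 output classes (hypothetical stand-in).
model = nn.Conv2d(3, 19, kernel_size=3, padding=1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=device.type == "cuda")
criterion = nn.CrossEntropyLoss(ignore_index=255)

images = torch.randn(2, 3, 32, 64, device=device)
labels = torch.randint(0, 19, (2, 32, 64), device=device)
loss_value = train_step(model, images, labels, optimizer, scaler, criterion, device)
```

On a machine without a GPU the same code runs in full precision, since both autocast and the gradient scaler are disabled in that case.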

Supported datasets:

Task	Dataset
semantic segmentation	PASCAL VOC 2012
semantic segmentation	Cityscapes
semantic segmentation	GTAV*
semantic segmentation	SYNTHIA*
lane detection	CULane
lane detection	TuSimple
lane detection	BDD100K (In progress)
lane detection	LLAMAS (In progress)

* The UDA baseline setup, with Cityscapes val set as validation.

Supported models:

Task	Backbone	Model/Method
semantic segmentation	ResNet-101	FCN
semantic segmentation	ResNet-101	DeeplabV2
semantic segmentation	ResNet-101	DeeplabV3
semantic segmentation	-	ENet
semantic segmentation	-	ERFNet
lane detection	ENet, ERFNet, VGG16, ResNets (18, 34, 50, 101)	Baseline
lane detection	ERFNet, VGG16, ResNets (18, 34, 50, 101)	SCNN
lane detection	VGG16, ResNets (18, 34, 50, 101)	RESA (In progress)
lane detection	ERFNet, ENet	SAD (In progress)
lane detection	ERFNet	PRNet (In progress)
lane detection	ERFNet, ResNet18-reduced	LSTR (In progress)

The VGG16 backbone corresponds to DeepLab-LargeFOV in SCNN.

The ResNet backbone corresponds to DeepLabV2 (without ASPP) with output channels reduced to 128, as in RESA.

We keep calling them VGG16/ResNet for consistency with common practice.

Model Zoo

MODEL_ZOO.md provides results (average/best/detailed), training times, shell scripts and trained models available for download.

Installation

Please prepare the environment and code by following INSTALL.md. Then follow the instructions in DATASET.md to set up datasets.

Getting Started

Get started with LANEDETECTION.md for lane detection.

Get started with SEGMENTATION.md for semantic segmentation.

Visualization Tools

Refer to VISUALIZATION.md for a visualization tutorial.

Contributing

We welcome Pull Requests that fix bugs, update docs or implement new features. We also welcome Issues that report problems and needs, or ask questions (your question might be more common and helpful to the community than you presume). If you are interested, check out our roadmap.

This repository implements (or plans to implement) the following papers in a unified PyTorch codebase:

Fully Convolutional Networks for Semantic Segmentation CVPR 2015

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs TPAMI 2017

Rethinking Atrous Convolution for Semantic Image Segmentation ArXiv preprint 2017

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation ArXiv preprint 2016

ERFNet: Efficient Residual Factorized ConvNet for Real-Time Semantic Segmentation ITS 2017

Spatial As Deep: Spatial CNN for Traffic Scene Understanding AAAI 2018

RESA: Recurrent Feature-Shift Aggregator for Lane Detection AAAI 2021

Learning Lightweight Lane Detection CNNs by Self Attention Distillation ICCV 2019

Polynomial Regression Network for Variable-Number Lane Detection ECCV 2020

End-to-end Lane Shape Prediction with Transformers WACV 2021

You are also welcome to add to this paper list, or to open-source your related work here.

Notes:

The Cityscapes dataset is down-sampled by 2 when training at 256 x 512. To specify different sizes, modify them in configs.yaml; similar changes can be made for other experiments.
Training times are measured on a single RTX 2080Ti, and include online validation time for segmentation and test time for lane detection.
All reported segmentation results are from a single model, without CRF and without multi-scale testing.
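The Cityscapes note above can be read in more than one way. The sketch below shows one plausible reading, assuming that full-resolution 1024 x 2048 images are halved to 512 x 1024 and that 256 x 512 crops are then sampled for training. The helper names `downsample_pair` and `random_crop_pair` are hypothetical and do not come from this repository. Labels are resized with nearest-neighbor interpolation so that no new class IDs are invented.

```python
import torch
import torch.nn.functional as F


def downsample_pair(image, label, factor=2):
    """Shrink a (3, H, W) float image and its (H, W) integer label map."""
    h, w = label.shape
    size = (h // factor, w // factor)
    image_small = F.interpolate(
        image.unsqueeze(0), size=size, mode="bilinear", align_corners=False
    )[0]
    # Nearest-neighbor keeps every label equal to an existing class ID.
    label_small = F.interpolate(
        label[None, None].float(), size=size, mode="nearest"
    )[0, 0].long()
    return image_small, label_small


def random_crop_pair(image, label, crop_h=256, crop_w=512):
    """Take the same random crop from the image and the label map."""
    _, h, w = image.shape
    top = torch.randint(0, h - crop_h + 1, (1,)).item()
    left = torch.randint(0, w - crop_w + 1, (1,)).item()
    image_crop = image[:, top:top + crop_h, left:left + crop_w]
    label_crop = label[top:top + crop_h, left:left + crop_w]
    return image_crop, label_crop


# Random stand-ins for one full-resolution Cityscapes image and label map.
image = torch.rand(3, 1024, 2048)
label = torch.randint(0, 19, (1024, 2048))

image_small, label_small = downsample_pair(image, label)
image_crop, label_crop = random_crop_pair(image_small, label_small)
```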
