
Pyramid Scene Parsing Network in 3D: improving semantic segmentation of point clouds with multi-scale contextual information


Pyramid Scene Parsing Network in 3D: improving semantic segmentation of point clouds with multi-scale contextual information. ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 154, 2019, Fang H., Lafarge F.

We propose a 3D pyramid module that enriches pointwise features with multi-scale contextual information. The goal of our work is not to achieve state-of-the-art performance on the datasets, but to propose a generic module that can be attached to any 3D neural network to infer richer pointwise features.



The architecture of our 3d-PSPNet is inspired by the success of PSPNet on 2D images.

Architecture of our model


Please follow the PointNet++ instructions to install Python 3.6 and TensorFlow 1.4.0, and to compile the user-defined operators in tf_ops.

Semantic Segmentation


Please follow the pipeline used in PointNet.


First, please train the baseline model from scratch.

python train.py --log_dir log6 --test_area 6

Because our 3d-PSPNet is designed to incorporate multi-scale contextual information for each point, we adopt a fine-tuning strategy to obtain enriched pointwise features: the pyramid model is initialized from the baseline checkpoint.

python train_pyramid.py --log_dir log6_pyramid --test_area 6 --model_path log6/model.ckpt

Note that the implementation of 3d-PSPNet in train_pyramid.py is built from a composition of standard TensorFlow operators. For efficiency, we also implement two CUDA-based TensorFlow operators, grid_pooling and grid_upsampling. Users can train with them by running

python train_pyramid_cuda.py --log_dir log6_pyramid_cuda --test_area 6 --model_path log6/model.ckpt
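For readers who want an intuition for what grid_pooling and grid_upsampling might compute, here is a minimal NumPy sketch. It is not the code from this repository: it assumes that pooling means a max over the features of all points falling in the same cell of a regular 3D grid, and that upsampling copies each cell's pooled feature back to its points. The function signatures, the `pyramid_features` helper, and the cell sizes are hypothetical, and any learned layers applied between pooling and upsampling are omitted.

```python
import numpy as np


def grid_pooling(xyz, feats, cell_size):
    """Max-pool float point features inside each cell of a regular 3D grid.

    xyz:   (N, 3) point coordinates
    feats: (N, C) pointwise features (floating point)
    Returns (pooled, cell_of_point): pooled is (M, C) with one row per
    occupied cell, and cell_of_point is (N,) mapping each point to its row.
    """
    # Integer cell coordinates relative to the bounding-box corner.
    cells = np.floor((xyz - xyz.min(axis=0)) / cell_size).astype(np.int64)
    # Collapse identical cell coordinates into a compact cell index per point.
    _, cell_of_point = np.unique(cells, axis=0, return_inverse=True)
    cell_of_point = cell_of_point.reshape(-1)
    num_cells = cell_of_point.max() + 1
    pooled = np.full((num_cells, feats.shape[1]), -np.inf, dtype=feats.dtype)
    # Unbuffered in-place maximum, so points sharing a cell are all considered.
    np.maximum.at(pooled, cell_of_point, feats)
    return pooled, cell_of_point


def grid_upsampling(pooled, cell_of_point):
    """Copy each cell's pooled feature back to every point in that cell."""
    return pooled[cell_of_point]


def pyramid_features(xyz, feats, cell_sizes=(0.5, 1.0, 2.0)):
    """Concatenate the input features with one context feature per scale.

    The cell sizes are illustrative values, not the ones used in the paper.
    """
    outputs = [feats]
    for size in cell_sizes:
        pooled, cell_of_point = grid_pooling(xyz, feats, size)
        outputs.append(grid_upsampling(pooled, cell_of_point))
    return np.concatenate(outputs, axis=1)
```

With N points, C input channels and S scales, `pyramid_features` returns an (N, C * (S + 1)) array, so every point carries its own feature plus one context feature per grid resolution.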


Users can evaluate the trained model by running

python batch_inference_pyramid.py --model_path log6_pyramid/model.ckpt --dump_dir log6_pyramid/dump_new --output_filelist log6_pyramid/output_filelist.txt --room_data_filelist meta/area6_data_label.txt --visu

or by

python batch_inference_pyramid_cuda.py --model_path log6_pyramid_cuda/model.ckpt --dump_dir log6_pyramid_cuda/dump_new --output_filelist log6_pyramid_cuda/output_filelist.txt --room_data_filelist meta/area6_data_label.txt --visu

Finally, evaluate the overall segmentation accuracy by running

python eval_iou_accuracy.py
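As a reference for how segmentation scores of this kind are commonly defined, the sketch below computes overall accuracy, per-class IoU and mean IoU from two label arrays. It does not reproduce eval_iou_accuracy.py; the function name and arguments are hypothetical, and the script in this repository may read its inputs and handle empty classes differently.

```python
import numpy as np


def segmentation_scores(pred, gt, num_classes):
    """Return (overall accuracy, per-class IoU, mean IoU) from integer labels."""
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    # confusion[i, j] counts points of true class i predicted as class j.
    confusion = np.bincount(
        gt * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    true_pos = np.diag(confusion).astype(np.float64)
    # Union = points predicted as the class + points labeled as it - overlap.
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - true_pos
    # Classes absent from both arrays get NaN and are skipped in the mean.
    iou = np.where(union > 0, true_pos / np.maximum(union, 1), np.nan)
    overall_acc = true_pos.sum() / confusion.sum()
    return overall_acc, iou, np.nanmean(iou)
```

For example, `segmentation_scores([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], 3)` gives an overall accuracy of 0.8 and per-class IoUs of 1/2, 2/3 and 1.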


If you find our work useful for your research, please cite our paper:

author = {Fang, Hao and Lafarge, Florent},
title = {{Pyramid scene parsing network in 3D: Improving semantic segmentation of point clouds with multi-scale contextual information}},
journal = {ISPRS Journal of Photogrammetry and Remote Sensing},
volume = {154},
year = {2019},	


MIT License


The main structure of our code is based on PointNet. The CUDA implementation of our tf_ops is inspired by RSNet.