UCL CGVI

Primary language: C++. License: MIT.

SensatUrban_albert

UCL CGVI Abstract

The goal of this project is to systematically study 3D semantic segmentation and improve network performance on the urban-scale point cloud dataset "SensatUrban". Ideally, the result is a robust network that can fuse and understand noisy urban data from multiple sources and produce a semantically labeled 3D model at urban scale.

In this project we studied and evaluated three state-of-the-art network structures that can be used for semantic segmentation: RandLA-Net, a multi-head attention layer, and Point Transformer.

Various data augmentation techniques based on KPConv and PointNeXt are also applied to the network. Compared to the baseline, we ultimately achieved an 8.0% increase in mIoU on the test set.
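As a rough illustration of point cloud augmentation, the sketch below applies three transforms commonly used in KPConv-style training pipelines: a random rotation about the vertical axis, a random uniform scale, and small Gaussian jitter. The function name `augment_points` and the parameter values are assumptions for illustration only; the exact augmentations and settings used in this project are described in the thesis, not here.

```python
import math
import random


def augment_points(points, rng, scale_range=(0.8, 1.2), noise_std=0.001):
    """Return an augmented copy of a list of (x, y, z) points.

    Hypothetical helper: the transforms and default values are
    illustrative assumptions, not the project's actual settings.
    """
    # One random rotation angle and one scale factor per point cloud.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    scale = rng.uniform(*scale_range)

    augmented = []
    for x, y, z in points:
        # Rotate around the vertical (z) axis so "up" stays up.
        xr = cos_t * x - sin_t * y
        yr = sin_t * x + cos_t * y
        # Apply the uniform scale, then add independent Gaussian jitter.
        augmented.append((
            scale * xr + rng.gauss(0.0, noise_std),
            scale * yr + rng.gauss(0.0, noise_std),
            scale * z + rng.gauss(0.0, noise_std),
        ))
    return augmented


# Toy point cloud, made up for the example.
cloud = [(1.0, 0.0, 0.5), (0.0, 2.0, 1.5), (-1.0, -1.0, 0.0)]
augmented_cloud = augment_points(cloud, random.Random(0))
```

Rotating only about the vertical axis is a typical choice for urban scenes, since buildings and ground keep a consistent orientation with respect to gravity.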

image

Thesis: https://drive.google.com/file/d/1BsRbLxOYe0Xi1sOM2a7SibiqO4zHw2oH/view?usp=sharing

RandLA-Net 100-epoch model and log file (baseline): https://drive.google.com/file/d/1wJlDjykVdnZBe4RXw6J6l01bmBAM2XNr/view?usp=sharing

Point Transformer 50-epoch model and log file (final model): https://drive.google.com/file/d/1b9M7IMOTrEX5qt80tZK8f4R23upv6f_i/view?usp=share_link

Overall Structure Of The Network:

image

Result:

pointtrans

Details:

image

For unique architecture such as the large stadium, RandLA-Net (the baseline) misclassifies many points into other classes, whereas the final model makes fewer mistakes.

The final model also classifies more points correctly than the baseline for classes whose shape varies widely, such as water. In the baseline, many Water points are misclassified as FootPath (pink).
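One way to quantify this kind of confusion is to tally, for a single ground-truth class, which labels the model predicted. The sketch below is a minimal example with made-up labels; `predictions_for_class` is a hypothetical helper and the data does not come from the project's results.

```python
from collections import Counter


def predictions_for_class(true_labels, pred_labels, target):
    """Count predicted labels over all points whose true label is `target`."""
    return Counter(
        pred for true, pred in zip(true_labels, pred_labels) if true == target
    )


# Toy per-point labels, invented for the example.
true_labels = ["Water", "Water", "Water", "Water", "FootPath", "FootPath"]
pred_labels = ["Water", "FootPath", "FootPath", "Water", "FootPath", "FootPath"]

water_breakdown = predictions_for_class(true_labels, pred_labels, "Water")
print(water_breakdown.most_common())
```

Running the same tally on baseline and final-model predictions would show how many Water points each model assigns to FootPath.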

Model Performance Comparison

Test set scores:

image
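For reference, mIoU (mean intersection over union) is usually computed from a confusion matrix as the average of per-class IoU = TP / (TP + FP + FN). The sketch below assumes that standard definition; the function name `mean_iou`, the toy matrix, and the choice to skip classes that never appear are illustrative assumptions and may differ from the project's evaluation script.

```python
def mean_iou(confusion):
    """Compute mIoU from a square confusion matrix.

    confusion[i][j] is the number of points whose true class is i
    and whose predicted class is j.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # true class c, predicted as another class
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        # Assumption: classes absent from both labels and predictions are skipped.
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 3-class confusion matrix, invented for the example.
toy_confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
toy_miou = mean_iou(toy_confusion)
```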

Baseline and final model comparison:

image