EfficientLPS Demo

Top-Down Deep Convolutional Neural Network Approach for LiDAR Panoptic Segmentation

This demo shows the LiDAR panoptic segmentation performance of our EfficientLPS model trained on the SemanticKITTI and NuScenes datasets. EfficientLPS is currently ranked #1 for LiDAR panoptic segmentation on the SemanticKITTI leaderboard. To learn more about LiDAR panoptic segmentation and the approach employed, please see the Technical Approach section. To view the demo, select a dataset from the drop-down box below and click on a LiDAR scan in the carousel to see live results.

Please Select a Model:

Selected Dataset:

SemanticKITTI

Technical Approach

What is Panoptic Segmentation?


Autonomous vehicles are required to operate in challenging urban environments that consist of a wide variety of agents and objects, making comprehensive perception a critical task for robust and safe navigation. Typically, perception tasks focus on independently reasoning about the semantics of the environment and recognizing object instances. Panoptic segmentation is a scene understanding problem that aims to provide a holistic solution by unifying the semantic and instance segmentation tasks. It simultaneously segments the scene into ‘stuff’ classes, which comprise background objects or amorphous regions such as road, vegetation, and buildings, and ‘thing’ classes, which represent distinct foreground objects such as cars, cyclists, and pedestrians.
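To make the ‘stuff’ versus ‘thing’ distinction concrete, the following minimal sketch packs a per-point semantic class and instance ID into a single panoptic label. It follows the convention of SemanticKITTI label files, where the lower 16 bits hold the semantic class and the upper 16 bits hold the instance ID. The class IDs and sample points are hypothetical and chosen only for illustration; this is not the EfficientLPS implementation.

```python
# Hypothetical class IDs for illustration only (not a real dataset label map).
STUFF = {0: "road", 1: "vegetation", 2: "building"}
THING = {10: "car", 11: "cyclist", 12: "pedestrian"}


def encode_panoptic(semantic_id, instance_id):
    """Pack a semantic class and an instance ID into one integer label."""
    if semantic_id in STUFF:
        # 'Stuff' regions are amorphous, so they are never split into instances.
        instance_id = 0
    # Upper 16 bits: instance ID, lower 16 bits: semantic class.
    return (instance_id << 16) | semantic_id


def decode_panoptic(label):
    """Recover (semantic_id, instance_id) from a packed panoptic label."""
    return label & 0xFFFF, label >> 16


# Per-point predictions for seven hypothetical LiDAR points.
semantic = [0, 0, 10, 10, 10, 12, 1]
instance = [0, 0, 1, 1, 2, 1, 5]
panoptic = [encode_panoptic(s, i) for s, i in zip(semantic, instance)]
```

Two points of the same ‘thing’ class receive different panoptic labels when they belong to different instances, whereas all points of a ‘stuff’ class share one label regardless of any instance ID predicted for them.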

Panoptic segmentation has been extensively studied in the image domain, where the ordered structure of images is well supported by thoroughly researched convolutional networks. However, only a handful of methods have been proposed for panoptic segmentation of LiDAR point clouds. The typically unordered, sparse, and irregular structure of point clouds poses several challenges such as distance-dependent sparsity, severe occlusions, large scale variations, and re-projection errors. To address these issues, we propose EfficientLPS, a top-down architecture that incorporates our proposed range-enforced components and a fusion module supervised by our panoptic periphery loss function.


EfficientLPS Architecture

EfficientLPS consists of a shared backbone comprising our novel Proximity Convolution Module (PCM), an encoder, the proposed Range-aware FPN (RFPN), and the Range Encoder Network (REN). We build the encoder and the REN on the EfficientNet family. EfficientLPS also includes a novel distance-dependent semantic segmentation head and an instance segmentation head, followed by a fusion module that produces the panoptic segmentation output. Our network makes several new contributions to address the problems that persist in LiDAR cylindrical projections.


Network architecture
Figure: Illustration of our proposed EfficientLPS architecture for LiDAR panoptic segmentation. The point cloud is first projected into the 2D domain using scan unfolding and fed as input to our Proximity Convolution Module (PCM). Subsequently, we employ the shared backbone, consisting of the EfficientNet encoder with the 2-way FPN, and the Range Encoder Network (REN) in parallel. The outputs of these two modules are fused and fed as input to the semantic and instance heads. The logits from both heads are then combined in the panoptic fusion module, which is supervised by the panoptic periphery loss function. Finally, the output of the panoptic fusion module is projected back to the 3D domain using a kNN algorithm.
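The first and last steps of this pipeline, projecting the point cloud to 2D and mapping predictions back to 3D, can be sketched in plain Python. Note the assumptions: the sketch uses a standard spherical projection rather than the scan unfolding used by EfficientLPS, and a brute-force 3D nearest-neighbor vote rather than whatever kNN variant the released code uses. The image size, field-of-view values, and function names are hypothetical.

```python
import math
from collections import Counter


def spherical_project(points, width=2048, height=64,
                      fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) points onto a range image.

    Returns the range image (-1.0 marks empty pixels) and, for every
    input point, its (row, col) pixel or None for a point at the origin.
    """
    fov_down = math.radians(fov_down_deg)
    fov = math.radians(fov_up_deg) - fov_down
    image = [[-1.0] * width for _ in range(height)]
    pixels = []
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            pixels.append(None)  # direction is undefined at the sensor origin
            continue
        yaw = math.atan2(y, x)    # horizontal angle
        pitch = math.asin(z / r)  # vertical angle
        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((1.0 - (pitch - fov_down) / fov) * height)
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        # When several points fall into one pixel, keep the closest range.
        if image[row][col] < 0 or r < image[row][col]:
            image[row][col] = r
        pixels.append((row, col))
    return image, pixels


def knn_vote(query, labeled_points, k=3):
    """Label a 3D point by majority vote among its k nearest labeled points."""
    nearest = sorted(labeled_points,
                     key=lambda item: math.dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

Because several 3D points can land on the same pixel, the projection is lossy. This is why a neighbor-based vote of this kind is useful when transferring the 2D prediction back to every 3D point.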

Our contributions are fivefold. First, the proximity convolution module boosts the transformation modeling capacity of the shared backbone by leveraging range proximity between neighboring points. Second, the novel range-aware feature pyramid network reinforces bidirectionally aggregated, semantically rich multi-scale features with spatial awareness. Third, a new semantic head captures scale-invariant characteristic and contextual features using our range-guided depth-wise separable atrous convolutions. Fourth, a novel panoptic periphery loss function refines the segmentation of ‘thing’ instances by maximizing the range separation between foreground boundary pixels and neighboring background pixels. Finally, a new framework improves panoptic segmentation of LiDAR point clouds by exploiting large unlabelled datasets via regularized pseudo-label generation.
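As background for the semantic head, the short calculation below shows why depth-wise separable atrous convolutions are an economical building block: they need far fewer weights than a standard convolution while covering a larger receptive field. It covers only the generic operation, not the range-guided variant proposed here. The kernel size, channel counts, and dilation rate are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depth-wise convolution plus a 1 x 1 point-wise one."""
    return k * k * c_in + c_in * c_out


def atrous_effective_kernel(k, dilation):
    """Side length of the window covered by a dilated k x k kernel."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer: 3 x 3 kernel, 256 input and output channels, dilation 6.
dense = standard_conv_params(3, 256, 256)
separable = depthwise_separable_params(3, 256, 256)
window = atrous_effective_kernel(3, 6)
```

Dilation enlarges the covered window without adding weights, which is what lets a head of this kind gather context at several scales cheaply.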


NuScenes LiDAR Panoptic Segmentation Dataset


We introduce the NuScenes LiDAR panoptic segmentation dataset for autonomous driving that provides panoptic annotations for NuScenes. The dataset consists of a total of 850 scans, out of which 700 are used for the training set and 150 are used for the validation set. We provide annotations for 6 ‘stuff’ classes and 10 ‘thing’ classes.

License Agreement

The data is provided for non-commercial use only. By downloading the data, you accept the license agreement which can be downloaded here. If you report results based on the NuScenes LiDAR Panoptic Segmentation dataset, please consider citing the paper mentioned in the Publications section.

Videos

Code

A PyTorch-based software implementation of this project is available in our GitHub repository for academic use and is released under the GPLv3 license.

Publications

Kshitij Sirohi, Rohit Mohan, Daniel Büscher, Wolfram Burgard, Abhinav Valada, "EfficientLPS: Efficient LiDAR Panoptic Segmentation",
arXiv preprint arXiv:2102.08009, 2021.

(Pdf) (Bibtex)


People