LSeg Per-Pixel Feature Extraction

This repo contains a modified implementation of LSeg (ICLR'22) that lets you extract per-pixel features for any image. It also shows how the multi-view LSeg feature fusion in OpenScene (CVPR'23) is done.


Follow the official installation instructions to set up the environment.

Failed to install LSeg? You are not alone. I have always found it hard to install LSeg by following the official instructions, so below is how I installed it successfully (under GCC=0.9.3, CUDA=11.3).

conda create -n lseg python=3.8
conda activate lseg

pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f
pip install pytorch_lightning==1.4.9
pip install git+  # this step takes >5 minutes

pip install git+
pip install timm==0.5.4
pip install torchmetrics==0.6.0
pip install setuptools==59.5.0
pip install imageio matplotlib pandas six

Next, download the official checkpoint and save it as checkpoints/demo_e200.ckpt.

Note: You should also follow the instructions here. There should be a ../datasets/ folder one level above the repo, containing the corresponding ADE20K data, even though we don't really need it.
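Before running anything, it can help to confirm that the layout described above is in place. The snippet below is a minimal sketch, not part of the repo: check_layout is a hypothetical helper, and it only checks the two paths named in the text (checkpoints/demo_e200.ckpt and ../datasets/), not the contents of the ADE20K folder.

```python
from pathlib import Path


def check_layout(repo_root="."):
    """Return the expected paths that are missing (hypothetical helper)."""
    # resolve() so that ".." really points at the parent of the repo folder
    root = Path(repo_root).resolve()
    expected = [
        root / "checkpoints" / "demo_e200.ckpt",  # checkpoint path from the text
        root.parent / "datasets",                 # the ../datasets/ folder
    ]
    return [str(p) for p in expected if not p.exists()]
```

Calling check_layout() from the repo root returns an empty list when both paths exist, and otherwise lists what still needs to be created or downloaded.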

Extract the Per-Pixel LSeg Feature for Images

If you want to extract LSeg per-pixel features and save them locally, please check

python --data_dir data/example/ --output_dir data/example_output/ --img_long_side 320


  • data_dir is the folder that contains the RGB images
  • output_dir is the folder where the corresponding LSeg features are saved
  • img_long_side is the length of the long side of your image. For example, for an image with a resolution of [640, 480], img_long_side is 640.
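As a small illustration of the img_long_side definition, the sketch below computes the long side from an image size. Both function names are hypothetical. The second helper shows what an aspect-preserving resize to a given long side would produce; whether the extraction script itself resizes images this way is an assumption, not something stated here.

```python
def long_side(width, height):
    """Length of the longer image side, e.g. 640 for a 640x480 image."""
    return max(width, height)


def resized_shape(width, height, img_long_side):
    """Size after scaling so the long side equals img_long_side.

    Assumption: an aspect-preserving resize; the script may behave differently.
    """
    scale = img_long_side / max(width, height)
    return round(width * scale), round(height * scale)
```

For a 640x480 image, long_side(640, 480) gives 640, and resized_shape(640, 480, 320) gives (320, 240).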

Multi-View LSeg Feature Fusion

Here we provide the code showing how the multi-view fusion described in Section 3.1 of OpenScene is done with LSeg features. We provide fusion code for several datasets: ScanNet, Matterport3D, and nuScenes.


Follow the instructions to obtain the processed 2D and 3D data for the corresponding dataset.


Taking ScanNet as an example, to perform multi-view LSeg feature fusion you can run:

python --data_dir PATH/TO/scannet_processed  --output_dir PATH/TO/OUTPUT_DIR --process_id_range 0,100 --split train


  • data_dir: path to the pre-processed 2D&3D data
  • output_dir: output directory to save your fused features
  • openseg_model: path to the OpenSeg model
  • process_id_range: only process scenes within the range
  • split: choose from train/val/test data to process
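To make the process_id_range argument concrete, here is a minimal sketch of how a value such as 0,100 could select a subset of scenes. select_scenes is a hypothetical helper, and treating the range as a half-open [start, end) slice over the scene list is an assumption; check the fusion script for the exact convention.

```python
def select_scenes(scene_ids, process_id_range=None):
    """Keep only the scenes whose index falls inside 'start,end'.

    Assumption: the range is half-open, [start, end), over the scene list.
    """
    scene_ids = list(scene_ids)
    if process_id_range is None:
        return scene_ids
    start, end = (int(x) for x in process_id_range.split(","))
    return scene_ids[start:end]
```

Splitting the scene list this way makes it easy to run several fusion jobs in parallel, each on its own range.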

This multi-view fusion corresponds to the matching part of the official OpenScene repo.
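The general idea of multi-view fusion can be sketched as follows: project every 3D point into each view, look up the per-pixel feature at the projected pixel, and average over the views in which the point is visible. This is a simplified illustration, not the repo's code. The function name and argument layout are hypothetical, a pinhole camera is assumed, and no depth-based occlusion test is performed.

```python
import numpy as np


def fuse_multiview(points, feats, intrinsics, world_to_cam):
    """Average per-pixel features over all views that see each 3D point.

    points:       (N, 3) world coordinates
    feats:        list of (H, W, C) per-pixel feature maps, one per view
    intrinsics:   list of (3, 3) pinhole camera matrices
    world_to_cam: list of (4, 4) extrinsic matrices
    Returns (fused, seen): (N, C) features and an (N,) visibility mask.
    """
    n = points.shape[0]
    c = feats[0].shape[2]
    summed = np.zeros((n, c), dtype=np.float64)
    counts = np.zeros(n, dtype=np.int64)
    homo = np.concatenate([points, np.ones((n, 1))], axis=1)  # (N, 4)

    for feat, K, T in zip(feats, intrinsics, world_to_cam):
        h, w, _ = feat.shape
        cam = (T @ homo.T).T[:, :3]           # points in the camera frame
        z = cam[:, 2]
        in_front = z > 1e-6                   # drop points behind the camera
        safe_z = np.where(in_front, z, 1.0)   # avoid division by zero
        uvw = (K @ cam.T).T
        u = np.round(uvw[:, 0] / safe_z).astype(int)
        v = np.round(uvw[:, 1] / safe_z).astype(int)
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        summed[visible] += feat[v[visible], u[visible]]
        counts[visible] += 1

    fused = np.zeros_like(summed)
    seen = counts > 0
    fused[seen] = summed[seen] / counts[seen, None]
    return fused, seen
```

Points that are not visible in any view keep a zero feature and are flagged False in the returned mask, so they can be skipped downstream.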

Key Modifications over the Original LSeg

  • Support outputting per-pixel 512-dim features. See here and here.
  • When extracting per-pixel features, no multi-scale features are considered, only a single scale. See here.
  • Change the crop_size and base_size according to the length of the long side of the image. See here.


