This is the implementation of our CVPR'19 paper "HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation" (project page).
News, June 15 - Critical bug fix for general layout (`dataset.py`, `inference.py` and `misc/post_proc.py`).
News, Aug 19 - Report results on Structured3D dataset [:bar_chart: See st3d report].
This repo is a pure python implementation that you can:
- Inference on your images to get cuboid or general shaped room layout
- 3D layout viewer
- Correct pose for your panorama images
- Pano Stretch Augmentation: copy and paste to apply to your own task (see the sketch after this list)
- Quantitative evaluation (3D IoU, Corner Error, Pixel Error)
    - cuboid shape
    - general shape
- Your own dataset preparation and training
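As mentioned in the feature list, Pano Stretch Augmentation is meant to be easy to reuse. Below is a minimal from-scratch sketch of the idea (not the repo's own implementation, which also transforms the layout corners): the scene behind an equirectangular panorama is stretched along the horizontal x/z axes by factors kx/kz and re-rendered by inverse warping. The equirectangular coordinate convention used here is an assumption.

```python
import numpy as np
from PIL import Image

def pano_stretch(img, kx, kz):
    """Stretch the scene behind an equirectangular panorama along the
    x/z axes by kx/kz and re-render it via inverse warping."""
    H, W = img.shape[:2]
    # Longitude/latitude of every output pixel (assumed convention:
    # u in [-pi, pi), v in [-pi/2, pi/2)).
    u = (np.arange(W) + 0.5) / W * 2 * np.pi - np.pi
    v = (np.arange(H) + 0.5) / H * np.pi - np.pi / 2
    uu, vv = np.meshgrid(u, v)
    # Unit direction of each output pixel.
    x = np.cos(vv) * np.sin(uu)
    y = np.sin(vv)
    z = np.cos(vv) * np.cos(uu)
    # Inverse warp: a point seen in the stretched scene comes from the
    # original scene at (x / kx, y, z / kz).
    xs, ys, zs = x / kx, y, z / kz
    us = np.arctan2(xs, zs)
    vs = np.arctan2(ys, np.sqrt(xs ** 2 + zs ** 2))
    # Back to source pixel coordinates, nearest-neighbour sampling.
    px = ((us + np.pi) / (2 * np.pi) * W - 0.5).round().astype(int) % W
    py = np.clip(((vs + np.pi / 2) / np.pi * H - 0.5).round().astype(int), 0, H - 1)
    return img[py, px]

img = np.array(Image.open('assets/demo.png').convert('RGB'))
out = pano_stretch(img, kx=1.5, kz=0.8)
Image.fromarray(out).save('assets/demo_stretched.png')
```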
Requirements:
- Python 3
- pytorch>=1.0.0
- numpy
- scipy
- sklearn
- Pillow
- tqdm
- tensorboardX
- opencv-python>=3.1 (for pre-processing)
- open3d>=0.7 (for layout 3D viewer)
- PanoContext/Stanford2D3D Dataset
    - Download preprocessed pano/s2d3d for training/validation/testing.
    - Put all of them under the `data` directory so you get:
        ```
        HorizonNet/
        ├── data/
        │   ├── layoutnet_dataset/
        │   │   ├── finetune_general/
        │   │   ├── test/
        │   │   ├── train/
        │   │   └── valid/
        ```
    - `test`, `train`, `valid` are processed from LayoutNet's cuboid dataset.
    - `finetune_general` is re-annotated by us from `train` and `valid`. It contains 65 general shaped rooms.
- Structured3D Dataset
    - Please contact Structured3D to get the data.
    - Follow this to prepare training/validation/testing for HorizonNet.
Pretrained models:
- resnet50_rnn__panos2d3d.pth
    - Trained on PanoContext/Stanford2d3d 817 pano images.
    - Trained for 300 epochs.
- resnet50_rnn__st3d.pth
    - Trained on Structured3D 18362 pano images with the original furniture and lighting setting.
    - Trained for 50 epochs.
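If you want to peek at what a downloaded checkpoint contains before running inference, a quick inspection with `torch.load` looks like this (a minimal sketch; it makes no assumption about which keys are stored inside the file):

```python
import torch

# Load the checkpoint on CPU just to look at its structure.
ckpt = torch.load('ckpt/resnet50_rnn__st3d.pth', map_location='cpu')
print(type(ckpt))
if isinstance(ckpt, dict):
    for k, v in ckpt.items():
        print(k, type(v))
```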
In the explanation below, I will use `assets/demo.png` as the example.
- Execution: Pre-process the above `assets/demo.png` by running the command below.
    ```
    python preprocess.py --img_glob assets/demo.png --output_dir assets/preprocessed/
    ```
    - `--img_glob` is the path to your 360 room image(s).
        - Shell-style wildcards are supported if quoted (e.g. `"my_fasinated_img_dir/*png"`).
    - `--output_dir` is the path to the directory for dumping the results.
    - See `python preprocess.py -h` for more detailed script usage help.
- Outputs: Under the given `--output_dir`, you will get results like below, prefixed with the source image basename.
    - The aligned rgb image `[SOURCE BASENAME]_aligned_rgb.png` and line segment image `[SOURCE BASENAME]_aligned_line.png`
    - The detected vanishing points `[SOURCE BASENAME]_VP.txt` (here `demo_VP.txt`):
        ```
        -0.002278 -0.500449 0.865763
        0.000895 0.865764 0.500452
        0.999999 -0.001137 0.000178
        ```
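If you need the detected vanishing points in your own code, the text file can be parsed directly. A minimal sketch, assuming each row of the file is the x/y/z direction of one vanishing point as suggested by the output above:

```python
import numpy as np

# Parse demo_VP.txt into a (3, 3) array: one vanishing point per row.
vp = np.loadtxt('assets/preprocessed/demo_VP.txt').reshape(3, 3)
for i, v in enumerate(vp):
    print(f'VP{i}: direction = {v}, norm = {np.linalg.norm(v):.6f}')
```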
- Execution: Predict the layout from the above aligned image and line segments by running the command below.
    ```
    python inference.py --pth ckpt/resnet50_rnn__st3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize --relax_cuboid
    ```
    - `--pth` path to the trained model.
    - `--img_glob` path to the preprocessed image.
    - `--output_dir` path to the directory to dump results.
    - `--visualize` optional; visualize the model's raw outputs.
    - `--relax_cuboid`
        - If the model was trained on cuboid only 👉 do NOT add `--relax_cuboid`, so the output is forced to be a cuboid.
        - If the model was trained on general shaped layouts 👉 ALWAYS add `--relax_cuboid`.
- Outputs: You will get results like below, prefixed with the source image basename.
    - The 1d representation is visualized under the file name `[SOURCE BASENAME].raw.png`.
    - The extracted corners of the layout `[SOURCE BASENAME].json`:
        ```
        {"z0": 50.0, "z1": -53.993988037109375, "uv": [[0.0146484375, 0.3008330762386322], [0.0146484375, 0.7089354991912842], [0.007335239555686712, 0.38581281900405884], [0.007335239555686712, 0.6204522848129272], [0.0517578125, 0.3912762403488159], [0.0517578125, 0.6146637797355652], [0.4485706090927124, 0.3936861753463745], [0.4485706090927124, 0.6121071577072144], [0.5978592038154602, 0.4077087640762329], [0.5978592038154602, 0.597193717956543], [0.8074917793273926, 0.35766440629959106], [0.8074917793273926, 0.6501006484031677], [0.8803366422653198, 0.2525349259376526], [0.8803366422653198, 0.7577382922172546], [0.925480306148529, 0.3167843818664551], [0.925480306148529, 0.6925708055496216]]}
        ```
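The JSON above can be consumed directly in downstream code. Below is a minimal sketch of reading it back and mapping the normalized `uv` corner coordinates onto pixels of the aligned panorama; interpreting `uv` as fractions of image width and height is an assumption drawn from the value ranges shown above.

```python
import json
import numpy as np
from PIL import Image

with open('assets/inferenced/demo_aligned_rgb.json') as f:
    layout = json.load(f)

img = Image.open('assets/preprocessed/demo_aligned_rgb.png')
W, H = img.size

print('z0, z1:', layout['z0'], layout['z1'])
uv = np.array(layout['uv'])        # (N, 2) normalized corner positions
xy = uv * np.array([W, H])         # corner positions in pixels
print(xy.round(1))
```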
- Execution: Visualize the predicted layout in 3D as a point cloud by running the command below.
    ```
    python layout_viewer.py --img assets/preprocessed/demo_aligned_rgb.png --layout assets/inferenced/demo_aligned_rgb.json --ignore_ceiling
    ```
    - `--img` path to the preprocessed image.
    - `--layout` path to the json output from `inference.py`.
    - `--ignore_ceiling` prevent showing the ceiling.
    - See `python layout_viewer.py -h` for usage help.
- Outputs: In the window, you can use the mouse and scroll wheel to change the viewport.
See the tutorial on how to prepare it.
To train on a dataset, see `python train.py -h` for a detailed explanation of the options.
Example:
```
python train.py --id resnet50_rnn
```
- Important arguments:
    - `--id` required. Experiment id to name checkpoints and logs.
    - `--ckpt` folder to output checkpoints (default: ./ckpt).
    - `--logs` folder for logging (default: ./logs).
    - `--pth` finetune mode if given; path to the saved checkpoint to load.
    - `--backbone` backbone of the network (default: resnet50).
        - other options: `{resnet18,resnet34,resnet50,resnet101,resnet152,resnext50_32x4d,resnext101_32x8d,densenet121,densenet169,densenet161,densenet201}`
    - `--no_rnn` whether to remove the rnn (default: False).
    - `--train_root_dir` root directory of the training dataset (default: `data/layoutnet_dataset/train`).
    - `--valid_root_dir` root directory of the validation dataset (default: `data/layoutnet_dataset/valid/`).
    - `--batch_size_train` training mini-batch size (default: 8).
    - `--epochs` epochs to train (default: 300).
    - `--lr` learning rate (default: 0.0001).
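Before launching training on your own data, it can help to confirm that every image under `--train_root_dir` has a matching label. The sketch below assumes the train split mirrors the test split layout used in the evaluation section below (an `img/` folder plus a `label_cor/` folder with matching basenames):

```python
import os
from glob import glob

root = 'data/layoutnet_dataset/train'

# Collect basenames (without extension) of images and corner labels.
imgs = {os.path.splitext(os.path.basename(p))[0]
        for p in glob(os.path.join(root, 'img', '*'))}
lbls = {os.path.splitext(os.path.basename(p))[0]
        for p in glob(os.path.join(root, 'label_cor', '*txt'))}

print('images without a label:', sorted(imgs - lbls))
print('labels without an image:', sorted(lbls - imgs))
```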
To evaluate on the PanoContext/Stanford2d3d dataset, first run the cuboid-trained model on all testing images:
```
python inference.py --pth ckpt/resnet50_rnn__panos2d3d.pth --img_glob "data/layoutnet_dataset/test/img/*" --output_dir tmp
```
- `--img_glob` shell-style wildcards matching all testing images.
- `--output_dir` path to the directory to dump results.
To get the quantitative result:
```
python eval_cuboid.py --dt_glob "tmp/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/*txt"
```
- `--dt_glob` shell-style wildcards matching all the model estimations.
- `--gt_glob` shell-style wildcards matching all the ground truth.
- Replace `"tmp/*json"`:
    - with `"tmp/pano*json"` to evaluate on PanoContext only
    - with `"tmp/camera*json"` to evaluate on Stanford2D3D only
If you want to:
- just evaluate PanoContext:
    ```
    python eval_cuboid.py --dt_glob "tmp/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/pano*txt"
    ```
- just evaluate Stanford2d3d:
    ```
    python eval_cuboid.py --dt_glob "tmp/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/camera*txt"
    ```
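For reference, corner error is essentially the mean distance between matched predicted and ground-truth corners normalized by the image diagonal. The sketch below illustrates that definition under the simplifying assumption that the two corner lists are already in matching order; `eval_cuboid.py` remains the authoritative implementation.

```python
import numpy as np

def corner_error(dt_cor, gt_cor, img_w=1024, img_h=512):
    """Mean L2 distance between corresponding predicted and ground-truth
    corners (in pixels), normalized by the image diagonal, in percent."""
    dt = np.asarray(dt_cor, dtype=float)
    gt = np.asarray(gt_cor, dtype=float)
    diag = np.sqrt(img_w ** 2 + img_h ** 2)
    return np.mean(np.linalg.norm(dt - gt, axis=1)) / diag * 100

# Toy example with two corners, each off by a few pixels.
print(corner_error([[100, 200], [300, 400]], [[102, 199], [298, 405]]))
```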
📋 The quantitative results for the pretrained model are shown below:

| Testing Dataset | 3D IoU (%) | Corner error (%) | Pixel error (%) |
|---|---|---|---|
| PanoContext  | 83.39 | 0.76 | 2.13 |
| Stanford2D3D | 84.09 | 0.63 | 2.06 |
| All          | 83.87 | 0.67 | 2.08 |
[:bar_chart: See st3d report] for more detail.
- Faster pre-processing script (top-front alignment) (maybe cython implementation or fernandez2018layouts)
- Credit of this repo is shared with ChiWeiHsiao.
- Thanks to limchaos for the suggestion about the potential boost from fixing the unexpected behaviour of the PyTorch dataloader. (See Issue#4)
Please cite our paper for any purpose of usage.
```
@inproceedings{sun2019horizonnet,
  title={Horizonnet: Learning room layout with 1d representation and pano stretch data augmentation},
  author={Sun, Cheng and Hsiao, Chi-Wei and Sun, Min and Chen, Hwann-Tzong},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={1047--1056},
  year={2019}
}
```