By Chen Liu, Jiajun Wu, Pushmeet Kohli, and Yasutaka Furukawa
This paper addresses the problem of converting a rasterized floorplan image into a vector-graphics representation. Our algorithm significantly outperforms existing methods and achieves around 90% precision and recall, getting to the range of production-ready performance. To learn more, please refer to our ICCV 2017 paper or visit our project website.
This code implements the algorithm described in our paper in Torch7.
[12/21/2018] A PyTorch version is now available under folder pytorch/. It is much easier to compile and try. Please see the README file under that folder for details. Note that we haven't evaluated its performance yet. We also provide a free IP solver (not relying on Gurobi) at pytorch/IP.py.
[7/1/2018] For the annotator code, please see here.
[4/15/2018] We have a follow-up project which reconstructs floorplans from 3D scans. You can find it here.
- Please install the latest Torch.
- Please install Python 2.7.
- We used an NVIDIA Titan GPU with CUDA 8.0 installed.
- nn
- cunn
- cudnn
- image
- ffi
- csvigo
- penlight
- opencv (probably needs to be compiled from source)
- lunatic-python
To use our trained model, please first download it from Google Drive and put it under folder "checkpoint/" (or specify its path via option -loadModel="path to the downloaded model").
Our model is fine-tuned from the pose estimation network introduced in the paper "Human pose estimation via Convolutional Part Heatmap Regression". You can download their model here (the MPII one) and put it under folder "PoseEstimation/" (or specify its path via option -loadPoseEstimationModel).
We don't have permission to share the rasterized images, which are from the LIFULL dataset. Here we share only our vector-graphics annotations, which might be helpful for other tasks.
Our vector-graphics annotations are under the "data/floorplan_representation" folder.
Each row in vector graphics annotations contains (x_min, y_min, x_max, y_max, category, dump_1, dump_2). Category can be either a wall, a door (opening in the paper), a specific icon type, or a specific room type. For walls and doors, two points, (x_min, y_min) and (x_max, y_max), form a line. For icons, x_min, y_min, x_max, and y_max specify a rectangle. For rooms, however, x_min, y_min, x_max, and y_max are unfortunately not for the bounding box of the room, as a room can be of arbitrary shape instead of a rectangle. So, x_min, y_min, x_max, and y_max just denote an arbitrary region which falls inside the room. Please refer to the data loader code to see how to process such annotations.
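The row layout described above can be read with a few lines of Python. This is a minimal sketch written for Python 3 (independent of the Python 2.7 the repository itself requires), not the repository's data loader. It assumes that fields are whitespace-separated, that category names contain no spaces, and that walls and doors appear literally as "wall" and "door"; the names `Annotation`, `parse_annotation_row`, and `parse_annotations` are hypothetical. Please check the data loader code for the exact conventions.

```python
from dataclasses import dataclass
from typing import List

# Assumption: these two category strings mark line-like primitives.
LINE_CATEGORIES = {"wall", "door"}


@dataclass
class Annotation:
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    category: str

    @property
    def is_line(self) -> bool:
        """True when the two points should be read as a line segment."""
        return self.category in LINE_CATEGORIES


def parse_annotation_row(row: str) -> Annotation:
    """Parse one row: x_min, y_min, x_max, y_max, category, dump_1, dump_2."""
    fields = row.split()  # assumption: whitespace-separated fields
    if len(fields) < 5:
        raise ValueError("expected at least 5 fields, got %d" % len(fields))
    x_min, y_min, x_max, y_max = (float(v) for v in fields[:4])
    # The two trailing fields are ignored here.
    return Annotation(x_min, y_min, x_max, y_max, fields[4])


def parse_annotations(text: str) -> List[Annotation]:
    """Parse every non-empty line of an annotation file."""
    return [parse_annotation_row(line) for line in text.splitlines() if line.strip()]
```

For categories that are neither walls nor doors, the four coordinates are either an icon rectangle or an arbitrary region inside a room, so callers still need the category list from the data loader to tell those two cases apart.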
Here is the link to 100,000+ vector-graphics representations generated by our algorithm. You might want to get 3D popup models from the text files using the popup code below or draw 2D rendering images using the function fp_ut.drawRepresentationImage(floorplan, representation) (please see predict.lua for an example). Note that, since we cannot share the image data, you might need to change the code for either generating 3D popup models or rendering the 2D images.
The code for the annotator is available under folder annotator/. You can find a similar annotator written using Python here.
To train the network from the pretrained pose estimation network, simply run
th main.lua -loadPoseEstimationModel "path to the downloaded pose estimation model"
To load our trained model and resume training, please run
th main.lua -loadModel "path to the downloaded pretrained model"
Here are some useful options for the main script:
- -batchSize specifies the batch size
- -LR specifies the learning rate
- -nEpochs specifies the number of epochs
- -checkpointEpochInterval specifies the number of training epochs between two checkpoints (useful if you want to save fewer checkpoints instead of one checkpoint for every epoch)
- -useCheckpoint specifies how the training resumes
  - -1: start from the beginning even when previously trained checkpoints are found
  - 0 (default): resume from checkpoints if found
  - n (n > 0): resume from the nth checkpoint
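The three resume modes can be summarized as a small decision function. This is an illustrative Python 3 sketch of the rule as described above, not the logic in main.lua. The function name `resolve_resume_checkpoint` is hypothetical, and two behaviors are assumptions: that mode 0 picks the most recent checkpoint, and that asking for a missing checkpoint is an error.

```python
from typing import Optional, Sequence


def resolve_resume_checkpoint(use_checkpoint: int,
                              available: Sequence[int]) -> Optional[int]:
    """Return the checkpoint index to resume from, or None to train from scratch."""
    if use_checkpoint < -1:
        raise ValueError("useCheckpoint must be -1, 0, or a positive integer")
    if use_checkpoint == -1:
        # Ignore any existing checkpoints.
        return None
    if use_checkpoint == 0:
        # Assumption: "resume if found" means the latest checkpoint.
        return max(available) if available else None
    if use_checkpoint in available:
        return use_checkpoint
    # Assumption: a missing checkpoint is treated as an error.
    raise ValueError("checkpoint %d not found" % use_checkpoint)
```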
To make a prediction on a floorplan image, run
th predict.lua -loadModel "model path" -floorplanFilename "path to the floorplan image" -outputFilename "output filename"
Note that the above script will produce the vectorization result (saved in ".txt" file), the rendering image (saved in ".png" file), and a text file which could be used for generating 3D models (saved in "_popup.txt").
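If you have a folder of floorplan images, one way to run the prediction script over all of them is a small wrapper like the one below. This is a sketch in Python 3 that uses only the three command-line options shown above. The helper names `build_predict_command` and `predict_folder` are hypothetical, and the wrapper assumes that the inputs are ".png" files and that -outputFilename takes a name without an extension; please adjust both if predict.lua expects something different.

```python
import subprocess
from pathlib import Path
from typing import List


def build_predict_command(model_path: str, image_path: str,
                          output_path: str) -> List[str]:
    """Build the argument list for a single predict.lua call."""
    return [
        "th", "predict.lua",
        "-loadModel", model_path,
        "-floorplanFilename", image_path,
        "-outputFilename", output_path,
    ]


def predict_folder(model_path: str, image_dir: str, output_dir: str,
                   dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) one command per image in image_dir."""
    commands = []
    # Assumption: input floorplans are stored as .png files.
    for image in sorted(Path(image_dir).glob("*.png")):
        # Assumption: the output name is passed without an extension.
        output = Path(output_dir) / image.stem
        cmd = build_predict_command(model_path, str(image), str(output))
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

With dry_run=True (the default) the function only returns the commands, which is useful for checking paths before starting a long run.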
To evaluate performance on the benchmark, run
th evaluate.lua -loadModel "model path" -resultPath "path to save results"
Automatic 3D model generation based on our vectorization results is implemented in both C++ (under folder popup/) and Python (under folder rendering/).
For the C++ code, run the following:
cd popup/code/
cmake .
make
./popup_cli ../data/floorplan_1.txt
The data file (e.g., popup/data/floorplan_1.txt), which could be generated by predict.lua (*_popup.txt), has the following format:
width height
the number of walls
(Wall descriptions)
x_1, y_1, x_2, y_2, room type on the left, room type on the right
...
(Opening descriptions)
x_1, y_1, x_2, y_2, 'door', dummy, dummy
(Icon descriptions)
x_1, y_1, x_2, y_2, icon type, dummy, dummy
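If you want to inspect such a file outside the C++ code, the layout above can be read with a short Python 3 sketch like the following. It is not part of the repository, and it rests on several assumptions: the parenthesized lines such as "(Wall descriptions)" are labels in this README rather than lines in the file, fields may be separated by commas or whitespace, every line after the walls is an opening when its fifth field is 'door' and an icon otherwise, and room and icon types are kept as plain strings. The name `parse_popup_file` is hypothetical.

```python
import re
from typing import Dict, List


def _split(line: str) -> List[str]:
    """Split on commas and/or whitespace and strip surrounding quotes."""
    return [tok.strip("'\"") for tok in re.split(r"[,\s]+", line.strip()) if tok]


def parse_popup_file(text: str) -> Dict[str, object]:
    """Parse the popup text format into walls, openings, and icons."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    width, height = (int(float(v)) for v in _split(lines[0])[:2])
    num_walls = int(_split(lines[1])[0])

    walls, openings, icons = [], [], []
    for line in lines[2:2 + num_walls]:
        f = _split(line)
        walls.append({
            "points": tuple(float(v) for v in f[:4]),
            "left_room": f[4],
            "right_room": f[5],
        })
    # Assumption: everything after the walls is an opening or an icon.
    for line in lines[2 + num_walls:]:
        f = _split(line)
        entry = {"points": tuple(float(v) for v in f[:4]), "type": f[4]}
        (openings if f[4] == "door" else icons).append(entry)

    return {"width": width, "height": height,
            "walls": walls, "openings": openings, "icons": icons}
```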
You can optionally use the corresponding input raster image or the final vector-graphics rendering as the texture image for the floor. To do so, please put the image under the data folder and give it the same name as the data file, with the suffix ".png" (e.g., floorplan_1.png).
The Python code is based on Panda3D. First enter folder rendering/, and then either run:
python viewer.py
to view a 3D model, or run:
python rendering.py
to render one view of the 3D model given a camera pose. Please check the code to see how to specify the model to view and how to render different views.
If you have any questions, please contact me at chenliu@wustl.edu.