/3D_Computer_Vision

my computer vision study repository

Primary LanguagePython

3D Object Reconstruction with Multi-View RGB-D Images

With RGB-D cameras, we can get multiple RGB and Depth images and convert them to point clouds easily. Leveraging this, we can reconstruct single object with multi-view RGB and Depth images. To acheive this, point clouds from multi-view need to be registered. This task is also known as point registration", whose goal is to find transfomration matrix between source and target point clouds. The alignment consists of two sub-alignment, Initial alignment and Alignment refinement.

RGB-D Camera Spec

  • Model: Intel realsense D415
  • Intrinsic parameters in 640x480 RGB, Depth image.
K = [[597.522, 0.0, 312.885],
     [0.0, 597.522, 239.870],
     [0.0, 0.0, 1.0]]

Requirements

  • Open3D
  • Pyrealsense2
  • OpenCV
  • Numpy
  • Kornia

Align RGB and Depth Image & Depth Filtering

Due to the different position of RGB and Depth lens, aligning them should be done to get exact point clouds. This project used alignment function offered by pyrealsense2 package. Raw depth data are so noisy that depth filtering should be needed. Pyrealsense2 library, developed by Intel, offers filtering methods of depth images. In this project, spatial-filter was used that smoothes noise and preserves edge components in depth images.

capture_aligned_images.py : Capture RGB and Depth Image, align them Set image path for saving RGB and Depth images. Press 'S' to capture scene.

Pre-Process Point Clouds

Single object might be a part of the scanned data. In order to get points of interested objects, pre-processing should be implemented. Plane-removal, outlier-removal, DBSCAN clustering were executed to extract object. Open3D offers useful functions to filter points.

preprocess_pcd.py : RGB-D Images to Object's Point clouds. Plane-removal, points outlier-filtering, DBSCAN clustering were applied.

Feature based Registration (Local Registration)

Initial alignment can be acheived through finding transformation matrix between feature points. The position of 3D points can be estimated with back-projection rays and depth values from depth image. Transformation matrix can be estimated with 3D corresponding feature points from souce and target point clouds, with RANSAC procedure. In order to find robust correspondeces in object area, extracted object 3D points were reprojected and the bounding area were obtained to filter outliers.

SIFT.py: Find 3d transformation matrix with SIFT feature points
ORB.py: Find 3d transformation matrix with ORB feature points
LoFTR.py: Find 3d transformation matrix with LoFTR feature points

Reprjection Constraints

  • Before
    image
  • After
    image

ICP based Registration (Global Registration)

With ICP algorithm implemented in Open3D, refine initial transformation matrix. In this project, Point-to-Plane ICP method was used.

Pose Graph Optimization

Pose graph optimization is a non-linear optimization of poses, frequently used in SLAM. Node represents poses of camera, edge represents relative pose between two cameras(nodes). In loop closure, the error occurs between predicted and measurement in loop nodes due to inaccurate camera poses. The goal of pose graph optimization is to minimize error between loop nodes(X_i, X_j), shown below. Levenberg-marquardt(LM) method was used for non-linear optimization



Effect of Pose graph optimzation


You can find the difference of registration quality between unoptimized(left) and optimized(right) in reconstruction result. In unoptimized result, error exists due to accumulated pose error.

Results

The object was reconstructed with multiple different view of RGB-D Images.

The reconstructed point clouds is below.

Object 3D Reconstruction (Feature based)

Object 3D Reconstruction (ICP based)

Human shape 3D Reconstruction