RobotLocomotion/LabelFusion

Can I use Kinect v2 for acquiring the data?

antoniocmdias95 opened this issue · 3 comments

If yes, how so?

Thank you.

Hello,@patmarion #93
I figure out how to label my own object with Azure Kinect and output the 8 vertices of 3d bounding box with good accuracy.
0000000006_rgb

0000000006_color_labels
0000000025_rgb
0000000025_color_labels

1. Firstly

you need to use lcm logger record your camera's rgbd stream.
you need to check the resolution of lcm-republisher. for me 1280 * 720 -> 640 * 480 and pre-calculate the intrinsics of the new image.

in lcm_republisher.cpp. crop and resize if you are not 640 * 480

     else if (rgb_msg->encoding == sensor_msgs::image_encodings::BGRA8)
    {


      cv::cvtColor(rgb, rgb, CV_RGBA2BGR);
      //center_crop here
      cv::Rect rect(160, 0,960,720);
      cv::Mat roi= rgb(rect);
      cv::resize(roi,rgb,cv::Size(640,480));
    }
    else
    {
      ROS_ERROR("Unexpected image encoding %s in input RGB image, only bgr8 and rgb8 and rgba8 encodings are supported.", rgb_msg->encoding.c_str());
      ros::shutdown();
    }

    //$ check depth encoding
    if (depth_msg->encoding == sensor_msgs::image_encodings::TYPE_16UC1)
    {
      if (debug_print_statements_)
        ROS_INFO("Depth image format is 16UC1");
    }
    else if (depth_msg->encoding == sensor_msgs::image_encodings::TYPE_32FC1)
    {
      if (debug_print_statements_)
        ROS_INFO("Converting depth from 32FC1 to 16UC1");

      //$ convert to 16UC1
      depth.convertTo(depth, CV_16UC1, 1000.0); 
      //center_crop
      cv::Rect rect(160, 0,960,720);
      cv::Mat roi= depth(rect);
      cv::resize(roi,depth,cv::Size(640,480));

    }
    else
    {
      ROS_ERROR("Unexpected image encoding %s in input depth image, only 16UC1 and 32FC1 encodings are supported.", depth_msg->encoding.c_str());
      ros::shutdown();
    }

    publishLCM(timestamp, rgb, depth);
  }

2. Secondly

change the intrinsics in bot_frames.cfg ( I'm not sure for the necessary) like this. using the pre-calculate intrinsics

 cameras {

  OPENNI_FRAME_LEFT {
    lcm_channel = "OPENNI_FRAME_LEFT";
    coord_frame = "OPENNI_FRAME_LEFT";
    intrinsic_cal {
      width = 640;
      height= 480;
      # width = 1280;
      # height= 720;
      distortion_model = "plumb-bob";
      distortion_k = [0,0,0];
      distortion_p = [0,0];
      # pinhole = [ 528.0, 528.0, 0, 320.0, 240.0 ]; # fx fy skew cx cy
      pinhole = [ 406.340666, 406.388667,0,321.667,242.333 ]; # fx fy skew cx cy
	  # example calibrated values
      # pinhole = [ 528.01442863461716, 528.01442863461716, 0, 321.06398107, 242.97676897 ];
    }
  }
}

3.Thirdly

you need to write a camera.cfg using your new intrinsics. make sure elasticfuion run with the following param -cal ***/camera.cfg

# call ElasticFusion
cameraConfig_arg = ""
if os.path.isfile(path+"/camera.cfg"):
   cameraConfig_arg += " -cal  "+ path+"/camera.cfg"
print 'cameraConfig_arg', cameraConfig_arg
os.system(path_to_ElasticFusion_executable + " -l ./" + lcmlog_filename + cameraConfig_arg)

4. Fourthly

just run the pipeline as the author's routine.

if you want to label the 3d bounding box like me like the following picture.
you may need to calculate the projection 2d point as the green point.
image

In rendertrainingimages.py implement the projection operation.

   def getProjectionPixel(self,verterx3d):
       vertex3dList=[]
       for i in range(verterx3d.GetNumberOfPoints()):
           point = verterx3d.GetPoint(i)
           vertex3dList.append(point)
       vertex3dArray=np.array(vertex3dList)
       vertex3dArray=vertex3dArray.transpose()
       projectPixels=np.matmul(self.projectMatrix,vertex3dArray)
       projectPixels=projectPixels/projectPixels[-1,:]
       return projectPixels[:-1,:]


def setProjectMatrix():
   principalX = 321.543
   principalY = 242.333
   fx=406.340666
   fy=406.3886
   return (np.array([[fx,0,principalX],[0,fy,principalY],[0,0,1]]))

Just leaving my commets for getting results with a ZED camera (HD720: 1280x720 or VGA: 672x376).

########################
# run_create_data
########################

cd ~/labelfusion/modules/labelfusion/
vim rendertrainingimages.py

view.setFixedSize(672, 376)

def setCameraInstrinsicsAsus(view):
	principalX = 326.785
	principalY = 168.432
	focalLength = 338.546630859375 # fx = fy = focalLength
	setCameraIntrinsics(view, principalX, principalY, focalLength)

########################
# camera config
########################

cd ~/labelfusion/config/
vim bot_frames.cfg

width = 672;
height= 376;
pinhole = [338.546630859375, 338.546630859375, 0, 326.785, 168.432]; # fx fy skew cx cy

########################
# ElasticFusion
########################

# for cuda

cd ~/ElasticFusion/Core/src/
vim CMakeLists.txt

# delete the following line:
# set(CUDA_ARCH_BIN "30 35 50 52 61" CACHE STRING "Specify 'real' GPU arch to build binaries for, BIN(PTX) format is supported. Example: 1.3 2.1(1.3) or 13 21(13)")

cd ~/ElasticFusion/Core/
rm -rf build && mkdir build && cd build && cmake ../src/ && make -j20

# for camera params

cd ~/ElasticFusion/GUI/src/
vim MainController.cpp

Resolution::getInstance(672, 376);
Intrinsics::getInstance(338.546630859375, 338.546630859375, 326.785, 168.432);

cd ~/ElasticFusion/GUI/
rm -rf build && mkdir build && cd build && cmake ../src/ && make -j20

########################
# pipeline
########################

cd ~/labelfusion/data/logs/006_mustard_bottle_01/
run_trim
run_prep
run_alignment_tool

gr.rotateReconstructionToStandardOrientation()
gr.segmentTable()
gr.saveAboveTablePolyData()
gr.launchObjectAlignment("006_mustard_bottle")
gr.saveRegistrationResults()

########################
# create data
########################

run_create_data -d

LabelFusion

I have gone through the steps of making LabelFusion work with Kinect v2 (aka Kinect One) using ianre657's nvidia-docker2 fork on Ubuntu 20.04 LTS.

By default, OpenNI2, which is used to capture the data in LabelFusion, does not support Kinect v2. However, if you build and install libfreenect2, it will create drivers for OpenNI2 that will support Kinect v2. This means that you need to install libfreenect2 and set the udev rules inside the docker. Here is someone who has already done most of the work for you. Copy their dockerfile and paste the relevant parts inside the LabelFusion dockerfile. If you have trouble with CUDA in building libfreenect2, add -DENABLE-CUDA=OFF to the cmake parameters to disable it. I also had to correct the cp path for the udev rules since libfreenect is downloaded and built in the home (/root) directory (confusingly named "root") instead of the root (/) directory.

Assuming your system is the same as mine, just that was enough to run the sample Protonect from freenect2 but running the NiViewer2 from OpenNI2 failed to find the Kinect v2 device. What I had to do was find the OpenNI.ini file (/etc/openni2/OpenNI.ini) and specify the driver directory as the ones provided by freenect2 (Repository=~/freenect2/lib/OpenNI2/Drivers). (For whatever reason, this was not a required step when installing it natively on my Ubuntu 20.04 system). You can probably add this line after building everything else in the dockerfile RUN echo "Repository=~/freenect2/lib/OpenNI2/Drivers" >> /etc/openni2/OpenNI.ini.

Finally, rebuild the LabelFusion image using docker_build.sh

I am attaching my dockerfile for reference
labelfusion.dockerfile.txt

Adjust the camera intrinsics and resolution similar to instructed above.

Run your docker container with the rebuilt image, mounting your data folder:

mkdir ~/LabelFusion/data
cd ~/LabelFusion/docker
./docker_run.sh ~/LabelFusion/data

Inside the docker, run openni2-camera-lcm, and it should connect to the Kinect v2. To log the data, open a separate terminal tab, execute docker exec -it <docker-id> bash, and source the startup file: source ~/labelfusion/docker/docker_startup.sh && /bin/bash. In this tab, execute lcm-logger. When done, press ctrl+c.