spencer-project/spencer_people_tracking

Using upper-body detector with Kinect v2?

Closed this issue · 34 comments

37 commented

Hey there,
I'm trying to get the spencer framework working with a kinectv2 as my RGB-D camera, using just the rgbd_front_top detection and tracking capabilities (sadly I cannot afford a laser sensor).

So far I've got my Kinect v2 working nicely with the iai_kinect2 bridge (on top of libfreenect2), and I have my spencer launcher running the iai bridge successfully by tweaking the OpenNI2 section in /launch/spencer_people_tracking_launch/launch/tracking_single_rgbd_sensor.launch to this:

    <arg name="load_kinect2" default="true"/>  <!-- set to false if you want to run OpenNi2 -->

    <!-- Run iai_kinect2 driver -->
    <group ns="spencer/sensors" if="$(arg load_kinect2)">
        <include file="$(find kinect2_bridge)/launch/kinect2_bridge.launch">
          <arg name="base_name" value="rgbd_front_top" />
          <arg name="publish_tf" value="$(arg publish_frame)" />
        </include>
    </group>

When I run

roslaunch spencer_people_tracking_launch tracking_single_rgbd_sensor.launch height_above_ground:=1.6

the launcher will start rviz and I can select the RGB-D sensor and set the topic to

/spencer/rgbd_front_top/sd/points

and my cloud will display. By default though it's trying to load:

/spencer/rgbd_front_top/depth_registered/points

Which shows up empty in rviz. I'm also having trouble getting any of the detection or tracking features working, even when switching to "/spencer/rgbd_front_top/sd/points" as my RGB-D topic. Do you have any tips on how I might be able to fix this?

Any help would be greatly appreciated, also I love your framework, thanks and great work!

Edit: I've got rviz running with the kinect /rgbd_front_top/sd/points topic, though I'm still having issues getting the detectors to run! I'm not sure if it might have something to do with my camera orientation? See:

screenshot from 2016-03-13 22 58 48

Cross-linking to strands-project/strands_perception_people#183 which is about the same thing, where there were a few answers but no final solution.

Remapping topics is not sufficient. The source depth image is in the wrong format.

You need to down-sample it to 640x480 (the image needs to be cropped too, to prevent stretching), and you need to convert the image encoding from 16UC1 to 32FC1.
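Roughly, the whole adapter boils down to something like this (just an untested Python/cv_bridge sketch; the topic names are placeholders, not the real SPENCER ones). Note that if you crop/resize the depth image, the corresponding camera_info intrinsics have to be scaled and shifted accordingly as well.

    #!/usr/bin/env python
    # Sketch: crop the kinect2 depth image to 4:3, resize to 640x480 and
    # convert the encoding from 16UC1 (millimeters) to 32FC1 (meters).
    import rospy
    import cv2
    import numpy as np
    from sensor_msgs.msg import Image
    from cv_bridge import CvBridge

    bridge = CvBridge()

    def callback(msg):
        depth = bridge.imgmsg_to_cv2(msg)      # 16UC1 -> uint16 array in mm
        h, w = depth.shape
        if w / float(h) > 4.0 / 3.0:           # too wide: crop columns
            new_w = int(h * 4.0 / 3.0)
            x0 = (w - new_w) // 2
            depth = depth[:, x0:x0 + new_w]
        else:                                  # too tall: crop rows
            new_h = int(w * 3.0 / 4.0)
            y0 = (h - new_h) // 2
            depth = depth[y0:y0 + new_h, :]
        depth = cv2.resize(depth, (640, 480), interpolation=cv2.INTER_NEAREST)
        meters = depth.astype(np.float32) / 1000.0
        out = bridge.cv2_to_imgmsg(meters, encoding='32FC1')
        out.header = msg.header                # keep stamp and frame_id
        pub.publish(out)

    rospy.init_node('kinect2_depth_adapter')
    pub = rospy.Publisher('depth_out', Image, queue_size=1)       # placeholder topic
    rospy.Subscriber('depth_in', Image, callback, queue_size=1)   # placeholder topic
    rospy.spin()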

With a bit of luck I can provide a working solution in the next week.

That sounds right. Note that the format conversion is probably not just typecasting, but also converting millimeters to meters, though I'm not 100% sure. A PR would be nice!

Oh yeah, forgot about that. You need to convert from 16UC1 (millimeters) to 32FC1 (meters).

I have a working version here that plugs right in the middle between iai_kinect2 and spencer (without changing any of the two). But that needs to be approved for public release by my employer (some legal stuff we have to figure out ourselves first ^^).

I'll notify you anyway.

Cool, thanks for considering giving back!

Hi there,
I already tested the SPENCER framework and everything worked very well with the Kinect v1 and with an RPLIDAR.
Now I really want to test this framework using only the upper-body detector with a SICK 3vistor-T. This depth camera is similar to the Kinect's depth sensor, but with a higher range (max 7 m) and lower resolution (176x144). In my case I intend to up-sample it to 320x240 (the resolution of the Kinect's depth sensor) to test the upper-body detector. Does anyone know how to do that, or have any suggestions?

By the way, for the guys with the Kinect v2, I don't know if it helps, but I found this package that allows you to down-sample an image. I already tested it with the RGB camera (Kinect) and it seems to work well, but it needs some modifications to work with the depth image.

Thanks

My guess would be that you should up-sample by repeating pixels, i.e. use nearest-neighbour upsampling, since any interpolation of depth values around edges doesn't make sense and just introduces spurious depth measurements. Though probably both still work in practice.
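In OpenCV terms that's basically a single call, something like this (untested sketch; `depth` stands in for your 176x144 depth array):

    import cv2
    import numpy as np

    # placeholder 176x144 depth image in millimeters (uint16)
    depth = np.zeros((144, 176), dtype=np.uint16)

    # nearest-neighbour up-sampling to 320x240: pixels are simply repeated,
    # so no interpolated (spurious) depth values appear around object edges
    depth_up = cv2.resize(depth, (320, 240), interpolation=cv2.INTER_NEAREST)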

Hi,

Thank you @lucasb-eyer for your tip. I will try that.

Cheers

Hi again,

I already up-sampled it to the resolution of the Kinect v1 (320x240), and I'm still not getting any result. With the Kinect it works very well, but with the SICK 3vistor-T it still doesn't detect anything. I am only using the upper-body detector, so I am only providing the depth image to the SPENCER framework.

I don't think it is a tf problem, because I gave my SICK camera the same frame name and launched the Kinect frames. When I run tracking_single_rgbd_sensor.launch I can add a new camera and see the image published by the camera, with the odom frame.

I compared the image from the SICK camera with the image from the Kinect at different distances, starting at 1 m and stopping at 5 m. The intensity of each pixel at the same distance is different for the two cameras.

Any advice or suggestion?

Thanks guys

Screenshots comparing the pixel intensities of the two cameras at 1 m, 2 m, 3 m, 4 m and 5 m distance (images attached).

Hmm, hard to say what more could be missing.

  • I believe it's important for the depth values to be 32FC1 (meters), as noted by @docmop above, so you should check that (see the snippet after this list), since as you say the raw values differ.
  • The detector is quite sensitive to the angle between the camera and the ground-plane being correct. Play with the sensor's orientation a little, or check out the detector's config file. (I don't have experience playing with the config myself, but colleagues were able to get a lot of improvement that way.)
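For the first point, a quick way to check the encoding and resolution is something like this (just a sketch; replace the topic with whatever your depth image topic actually is):

    import rospy
    from sensor_msgs.msg import Image

    rospy.init_node('check_depth_encoding')
    # placeholder topic name -- use your camera's depth image topic here
    msg = rospy.wait_for_message('/spencer/sensors/rgbd_front_top/depth/image', Image)
    print(msg.encoding, msg.width, msg.height)   # the detector expects 32FC1, 640x480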

Other than that, I'm at a loss of ideas without sitting in front of the robot myself, sorry!

Thank you for your help @lucasb-eyer. Everything you said was very helpful.

I will check those parameters and report back when I have more news.

Cheers

Hi,

I already converted the 16UC1 depth in mm to 32FC1 in m, and resized the image to the same resolution as the Kinect. It works like a middleware between the SPENCER framework and the camera. I tested the converter with the Kinect's image_raw (because it is 16UC1) and the SPENCER framework detected people very well. @lucasb-eyer I will do some tests and then I will share my code with all of you.

But I'm still not getting any result with my SICK camera. I thought my tf was OK, but when I tested with the image from the Kinect and the image from my SICK camera, the RViz grid in the "odom" frame is in a different position. So does anyone think I am having a tf problem?

The frame of my SICK camera is defined as "rgbd_front_top_depth_optical_frame". In both cases I used the tf from the SPENCER launch file; publish_tf is true.

<!-- Load reasonable defaults for the relative pose between cameras -->
  <include if="$(arg publish_tf)"
           file="$(find rgbd_launch)/launch/kinect_frames.launch">
    <arg name="camera" value="rgbd_front_top" />
  </include>

I attached both images from the two cameras to help.

Attached: the Kinect image_raw converted to 32FC1 (kinect_frame) and the image from the SICK camera (sick_frame).

Hi,

@miguelcrf450, I am using an Intel RealSense ZR300 and would like to do the same thing as you have done: convert the depth image from 16UC1 in mm to 32FC1 in m and resize it to the same resolution as the Kinect.
Just wondering how you went about doing this?
I have tried using OpenCV's convertTo, but this doesn't seem to be working. I am relatively inexperienced with OpenCV, so some pointers on how to do this would be much appreciated.

Hi,

I need to test a few more things, but I already have some results with my camera. For the conversion I am using this package:

https://github.com/ros-perception/image_pipeline

And to resize the image I am using an OpenCV function.

Next week I will have more time to do some tests and maybe I can help you.

@miguelcrf450 That looks like it's your problem: the detector relies heavily on a correctly calibrated ground-plane vs. camera angle, because it projects all points "on the ground". You'll have to fix that in order for anything to work.

Any update about using the upper-body detector with the Kinect v2 stream?

I believe @sbreuers got it running, but I'm not sure, haven't worked on this myself anymore in a long time.

Hi:
@miguelcrf450 You said that you are using the image_pipeline package for the conversion from 16UC1 to 32FC1; can you tell me in detail how to do this? I tried to use OpenCV for the conversion and a nodelet to pack them together, but I failed; it doesn't work. So how do I do the conversion with the package directly?

@CXYch
If I remember correctly, you don't need any special package for the conversion. As far as I remember, 16UC1 means that the distance value of one pixel is stored in a 16-bit unsigned integer (distance in millimeters), whereas 32FC1 means that one pixel is a 32-bit float (distance in meters).

Just do something like this for the pixels:

    // convert each pixel from 16UC1 (millimeters) to 32FC1 (meters);
    // note the float cast/division so the integer values are not truncated
    for (int i = 0; i < amount_of_pixels; i++) {
        data_32fc1[i] = static_cast<float>(data_16uc1[i]) / 1000.0f;
    }

Edit:
Somehow GitHub was messing with the comment.

@miguelcrf450 could you share your solution?

I'm trying to get the detector to work using an Intel RealSense D435. I'm converting the depth image to 32FC1 format using OpenCV (cv_bridge::toCvCopy(depth_msg, sensor_msgs::image_encodings::TYPE_32FC1)), but it's not detecting anything.

I have been trying for a really long time now. Has anyone found a solution?

@dimaw24
Did you make sure that the 3D "picture" has the right resolution?
I had to crop the image to a lower resolution because this software could only handle 640x480.
But that was a long time ago, so I don't know if this still applies.

@dimaw24, I've just managed to get detection running with a realsense D435i.

For me it was a matter of converting the depth image encoding, and ensuring all of the necessary topics were mapped appropriately (basically what people have been saying in this thread).

It took me a fair amount of time to get it running, because when I first tried converting the depth image encoding I forgot to add a timestamp to the new image's header, so the upper body detection callback was never triggering (it uses a message time synchronizer applied to the depth, camera info and ground plane topics). Once I corrected that I started seeing detections in RViz.
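The relevant part of the conversion is roughly this (a simplified sketch, with `depth_msg` being the incoming sensor_msgs/Image):

    import numpy as np
    from cv_bridge import CvBridge
    from sensor_msgs.msg import Image

    bridge = CvBridge()

    def convert_depth(depth_msg):
        """Re-encode a 16UC1 depth image (mm) as 32FC1 (m) without losing the header."""
        depth_m = bridge.imgmsg_to_cv2(depth_msg).astype(np.float32) / 1000.0
        out = bridge.cv2_to_imgmsg(depth_m, encoding='32FC1')
        # the line I originally forgot: without the original stamp the detector's
        # depth / camera_info / ground plane time synchronizer never fires
        out.header = depth_msg.header
        return out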

For reference, here's the node I am using to do the encoding conversion:
https://gist.github.com/tim-fan/a2abe1fe15cce8111d06c5c2187c7e97

and here's the launch file I use to run it:
https://gist.github.com/tim-fan/cbddef64281a5ca5bc318e1653cbf767

@tim-fan
Hi,

Thank you so much for sharing the files. I am facing exactly the same problem, but I still cannot get things working with modifications to the above two files. Did you combine the SPENCER project and the RealSense package, and put both the .py and .launch files into the /spencer_people_tracking/track folder?

Thanks

Hi @ShirleyZYJ

I've since put those scripts into a ROS package, available here:
https://github.com/tim-fan/realsense_spencer_adaptor

So you should be able to clone that into your workspace, run catkin_make, source devel/setup.bash, then run
roslaunch realsense_spencer_adaptor realsense_people_tracking.launch

If that doesn't work, let me know what error you see when trying to launch.

Hi @tim-fan

It works!

Thank you so much


I'm using your launch files and it works.
Thanks a lot.
I also use the RealSense D435, and I found that the D435 doesn't have an rgb/image_rect_color topic; can you please tell me which topic you remapped to rgb/image_rect_color? I'm using color/image_raw now, but it doesn't seem to work really well.

@fly-duck Hello, may I know which topic you remapped to rgb/image_rect_color?

I found the issue that you posted in (url), and it seems rgb/image_rect_color is published by the spencer_people_tracking package.

May I know how you solve it?

@Alex-Beh Hi, I'm using the topic aligned_depth_to_color/image_raw and it works, but the performance is not so good (many false positives), and I'm not sure whether it's caused by the person detection algorithm itself or by the topic. You could try other depth image topics, by the way.

@Alex-Beh @fly-duck
You should use the depth image as input for SPENCER.

Also make sure it has the correct size. I don't know the correct values from memory and cannot look them up right now, unfortunately.

Edit:
I had some middleware ready to plug between iai_kinect2 and spencer_people_tracking. I need approval from my employer to release it directly unfortunately. If I can't do that, I should at least be able to get you on track as soon as I have time to stick my head inside this (surely not so good) code I wrote years ago ;D

@fly-duck May I know where you specify the input topic to SPENCER? Do you have a recorded rosbag file using the D435 that I could use for debugging and testing?

@ph-hoffmann Yes, I am looking forward to your help. Please let me know if you have any updates.

@fly-duck @Alex-Beh
There ya go: https://github.com/emrolab-htwsaar/iai_kinect2spencer
Please note that this repo is not under active maintenance. However, if you have any questions, feel free to ask.

@Alex-Beh
You could roslaunch the package from https://github.com/tim-fan/realsense_spencer_adaptor and modify the subscribed topic to aligned_depth_to_color/image_raw in the realsense_spencer_adaptor.py file. Sorry, I have not recorded any rosbag files.

tlind commented

The latest version of the upper-body detector (contributed by PR #59) now supports several depth formats, depth and image scaling, and includes a dedicated config file for Kinect v2-like devices with 512x424 depth resolution. Therefore, I'll consider this issue resolved.