r9y9/pylibfreenect2

Apply undistortDepth on numpy array instead of frame object

maderafunk opened this issue · 16 comments

Hi,

is it possible to create a function similar to undistortDepth that accepts a NumPy array instead of a frame object? My problem is that I have already saved many depth images as NumPy arrays, and now I need the undistorted depth arrays.

I solved it by converting the NumPy array back into a libfreenect2 Frame object. This was possible because most of the frame metadata (exposure, gain, etc.) is not needed by the registration functions.

Would there be any interest in integrating that code?

r9y9 commented

Cool. I'd be happy if you can create a PR.

My problem is that, so far, it only works when a Kinect device is connected. It's better than nothing, but still not perfect.

To do the same without a connected device, I tried to pass the camera parameters to the Registration class offline.
I saved the color and IR camera parameters in a dictionary and added setters to the ColorCameraParams and IrCameraParams classes. Then I pass those parameters to the Registration class. However, that doesn't seem to be enough. Do you have any idea what else is needed?

cdef class ColorCameraParams:
    """Python interface for ``libfreenect2::Freenect2Device::ColorCameraParams``.

    Attributes
    ----------
    params : ``libfreenect2::Freenect2Device::ColorCameraParams``

    See also
    --------
    pylibfreenect2.libfreenect2.Freenect2Device.getColorCameraParams
    """
    cdef _Freenect2Device.ColorCameraParams params

    # TODO: wrap all instance variables
    @property
    def fx(self):
        """Same as ``libfreenect2::Freenect2Device::ColorCameraParams::fx``"""
        return self.params.fx

    @fx.setter
    def fx(self, value):
        """Sets fx parameter"""
        self.params.fx = value

    @property
    def fy(self):
        """Same as ``libfreenect2::Freenect2Device::ColorCameraParams::fy``"""
        return self.params.fy

    @fy.setter
    def fy(self, value):
        """Sets fy parameter"""
        self.params.fy = value

    @property
    def cx(self):
        """Same as ``libfreenect2::Freenect2Device::ColorCameraParams::cx``"""
        return self.params.cx

    @cx.setter
    def cx(self, value):
        """Sets cx parameter"""
        self.params.cx = value

    @property
    def cy(self):
        """Same as ``libfreenect2::Freenect2Device::ColorCameraParams::cy``"""
        return self.params.cy

    @cy.setter
    def cy(self, value):
        """Sets cy parameter"""
        self.params.cy = value

Implementation:

# get Camera Parameter
irCameraParams_df = pandas.read_csv("irCameraParams.csv")
colorCameraParams_df = pandas.read_csv("colorCameraParams.csv")

# initialize CameraParameters
irCameraParams = IrCameraParams()
colorCameraParams = ColorCameraParams()

# set CameraParameters
irCameraParams.fx = irCameraParams_df.fx[0]
irCameraParams.fy = irCameraParams_df.fy[0]
irCameraParams.cx = irCameraParams_df.cx[0]
irCameraParams.cy = irCameraParams_df.cy[0]

colorCameraParams.fx = colorCameraParams_df.fx[0]
colorCameraParams.fy = colorCameraParams_df.fy[0]
colorCameraParams.cx = colorCameraParams_df.cx[0]
colorCameraParams.cy = colorCameraParams_df.cy[0]


# NOTE: must be called after device.start() <----- What does this mean?

registration = Registration(irCameraParams,
                            colorCameraParams)

r9y9 commented

I think you forgot to set some of the camera parameters (float fx, fy, cx, cy, k1, k2, k3, p1, p2), namely k1, k2, k3, p1, p2. Those are used in https://github.com/OpenKinect/libfreenect2/blob/83f88b4c09f0b00724ae65785abcd4f3eeb79f93/src/registration.cpp#L73-L85.
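For context, k1, k2, k3 and p1, p2 are the usual names for radial and tangential lens-distortion coefficients. Below is a minimal sketch of a Brown-Conrady style model of that kind, assuming this is roughly what the linked lines compute; the function name `distort` and the dict layout are illustrative and not part of pylibfreenect2, and the exact terms should be checked against registration.cpp.

```python
def distort(mx, my, p):
    """Map an undistorted pixel (mx, my) to distorted image coordinates.

    `p` is a dict with keys fx, fy, cx, cy, k1, k2, k3, p1, p2
    (a hypothetical layout chosen for this sketch).
    """
    # Normalize pixel coordinates with the pinhole intrinsics.
    dx = (mx - p["cx"]) / p["fx"]
    dy = (my - p["cy"]) / p["fy"]
    r2 = dx * dx + dy * dy
    # Radial factor: 1 + k1*r^2 + k2*r^4 + k3*r^6, written in Horner form.
    kr = 1.0 + ((p["k3"] * r2 + p["k2"]) * r2 + p["k1"]) * r2
    # Tangential terms driven by p1 and p2.
    x = dx * kr + p["p2"] * (r2 + 2 * dx * dx) + p["p1"] * 2 * dx * dy
    y = dy * kr + p["p1"] * (r2 + 2 * dy * dy) + p["p2"] * 2 * dx * dy
    # Back to pixel coordinates.
    return p["fx"] * x + p["cx"], p["fy"] * y + p["cy"]
```

With all five coefficients at zero the function returns the input pixel unchanged, so setting only fx, fy, cx, cy amounts to assuming a distortion-free lens, which would explain why the four intrinsics alone were not enough.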

Regarding the question about # NOTE: must be called after device.start():

# NOTE: must be called after device.start()
registration = Registration(device.getIrCameraParams(),
device.getColorCameraParams())

This is because the camera parameters are retrieved after the device has started.

Thanks. I was wondering about those parameters; however, device.getColorCameraParams() only returns fx, fy, cx, cy.

I will try to add the other parameters to the ColorCameraParams and IrCameraParams cdef classes.
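A minimal sketch of how the full IR parameter set could be stored once and restored offline, assuming the wrapper classes expose a setter for every field. `IrParamsRecord`, `save_params`, `load_params`, and `apply_params` are hypothetical helpers rather than pylibfreenect2 functions, and the numbers are made up, not a real calibration. The color parameters may need additional fields beyond the four intrinsics; check the libfreenect2 header for the complete list.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, fields


@dataclass
class IrParamsRecord:  # hypothetical container, not a pylibfreenect2 class
    fx: float
    fy: float
    cx: float
    cy: float
    k1: float
    k2: float
    k3: float
    p1: float
    p2: float


def save_params(record, path):
    """Write all fields of `record` to a JSON file."""
    with open(path, "w") as fh:
        json.dump(asdict(record), fh)


def load_params(path):
    """Read a record back, failing loudly if any field is missing."""
    with open(path) as fh:
        data = json.load(fh)
    missing = [f.name for f in fields(IrParamsRecord) if f.name not in data]
    if missing:
        raise ValueError(f"missing camera parameters: {missing}")
    return IrParamsRecord(**data)


def apply_params(record, target):
    """Copy every field onto `target`, e.g. an IrCameraParams with setters."""
    for f in fields(record):
        setattr(target, f.name, getattr(record, f.name))
    return target


# Made-up values for illustration only.
record = IrParamsRecord(365.0, 365.0, 256.0, 212.0, 0.09, -0.27, 0.1, 0.0, 0.0)
path = os.path.join(tempfile.mkdtemp(), "ir_params.json")
save_params(record, path)
restored = load_params(path)
```

Checking for missing fields at load time is a deliberate choice here: it surfaces a forgotten coefficient as an error instead of as a silently wrong registration.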

Bam, it works!

r9y9 commented

Nice!

There is only one thing left that's not so nice. When I convert my NumPy array back to bytes so that it can be passed to the Frame object, it works well if I do that in my Python environment with:
byte_data = numpy_array.tobytes('C')

However, I would rather handle that internally within pylibfreenect2. But when I use the same function inside pylibfreenect2, the conversion is not correct: I can see artifacts in the image. I guess this has to do with Cython? Do you know how I could handle that?
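When such artifacts appear, one thing worth ruling out is a mismatch between the array's dtype or memory layout and what the frame expects. This is not necessarily what went wrong in this thread. The sketch below assumes a 512x424 depth frame with 4 bytes per pixel (float32), the size used later in the discussion; `to_frame_bytes` is a hypothetical helper.

```python
import numpy as np

WIDTH, HEIGHT, BYTES_PER_PIXEL = 512, 424, 4  # depth frame size assumed from the thread


def to_frame_bytes(depth):
    """Return the raw bytes of `depth` as C-ordered float32 (hypothetical helper)."""
    # Converts dtype and layout only if needed; otherwise no copy is made here.
    arr = np.ascontiguousarray(depth, dtype=np.float32)
    if arr.shape != (HEIGHT, WIDTH):
        raise ValueError(f"expected shape {(HEIGHT, WIDTH)}, got {arr.shape}")
    return arr.tobytes(order="C")


# A float64 array holds twice as many bytes per pixel as the frame expects.
depth64 = np.arange(HEIGHT * WIDTH, dtype=np.float64).reshape(HEIGHT, WIDTH)
raw = to_frame_bytes(depth64)

# Reading the bytes back as float32 recovers the original image.
restored = np.frombuffer(raw, dtype=np.float32).reshape(HEIGHT, WIDTH)
```

Calling `tobytes` directly on the float64 array would produce twice the expected byte count, and reinterpreting those bytes as float32 would scramble the image.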

r9y9 commented

I think Cython can handle NumPy array input directly. For example, this code from one of my projects that uses Cython works: https://github.com/r9y9/pysptk/blob/abdad498a549c746012418be3a8f264bbaeb9a0e/pysptk/_sptk.pyx#L34-L40

def acep(x, np.ndarray[np.float64_t, ndim=1, mode="c"] c not None,
         lambda_coef=0.98, step=0.1, tau=0.9, pd=4, eps=1.0e-6):
    assert_pade(pd)
    cdef int order = len(c) - 1
    cdef double prederr
    prederr = _acep(x, &c[0], order, lambda_coef, step, tau, pd, eps)
    return prederr

C function signature:

double _acep "acep"(double x, double *c, const int m, const double lambda_coef,
                    const double step, const double tau, const int pd,
                    const double eps);

So maybe worth trying something like below?

np.ndarray[float maybe?, ndim=2, mode="c"] byte_data

and passing the address of it:

self.ptr = new libfreenect2.Frame(
    width, height, bytes_per_pixel, &byte_data[0])  # we may need reinterpret_cast here

    def __cinit__(self, width=None, height=None, bytes_per_pixel=None,
            int frame_type=-1, np.ndarray[float, ndim=2, mode="c"] byte_data=None):
        self.ptr = new libfreenect2.Frame(
            width, height, bytes_per_pixel,
            reinterpret_cast[int32_t](&byte_data[0]))

You mean like that? It won't compile:

Cannot take address of Python object

Same without reinterpret_cast. I am not at all familiar with cython.
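A plausible cause, offered as an assumption and not confirmed in the thread, is that a buffer declared with ndim=2 is indexed with a single index. Cython only generates direct element access when the number of indices matches ndim; otherwise it falls back to ordinary Python indexing, which yields an object whose address cannot be taken. The same distinction is visible in plain NumPy:

```python
import numpy as np

frame2d = np.zeros((424, 512), dtype=np.float32)

# One index on a 2-D array selects a whole row, which is itself an array object.
row = frame2d[0]

# Two indices select a single float32 element.
first = frame2d[0, 0]

# Flattening first makes a single index address one element.
flat = frame2d.reshape(-1)
first_flat = flat[0]
```

Under that assumption, either indexing with two indices or declaring a 1-D buffer would give Cython an actual element to take the address of.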

r9y9 commented

I will look into it.

r9y9 commented
diff --git a/pylibfreenect2/libfreenect2.pyx b/pylibfreenect2/libfreenect2.pyx
index 5734958..93dfdd0 100644
--- a/pylibfreenect2/libfreenect2.pyx
+++ b/pylibfreenect2/libfreenect2.pyx
@@ -228,7 +228,7 @@ cdef class Frame:
     cdef int frame_type
 
     def __cinit__(self, width=None, height=None, bytes_per_pixel=None,
-            int frame_type=-1):
+            int frame_type=-1, np.ndarray[np.float32_t, ndim=1, mode="c"] byte_data=None):
         w,h,b = width, height, bytes_per_pixel
         all_none = (w is None) and (h is None) and (b is None)
         all_not_none = (w is not None) and (h is not None) and (b is not None)
@@ -238,11 +238,22 @@ cdef class Frame:
 
         if all_not_none:
             self.take_ownership = True
-            self.ptr = new libfreenect2.Frame(
-                width, height, bytes_per_pixel, NULL)
+            # we may accept ndim=2 and convert it to ndim=1 here for convenience
+            if byte_data is None:
+                self.ptr = new libfreenect2.Frame(
+                    width, height, bytes_per_pixel, NULL)
+            else:
+                self.__instantiate_frame_with_bytes(
+                    width, height, bytes_per_pixel, byte_data)
         else:
             self.take_ownership = False
 
+    cdef __instantiate_frame_with_bytes(self, int width, int height,
+        int bytes_per_pixel, np.ndarray[np.float32_t, ndim=1, mode="c"] byte_data):
+        cdef uint8_t* bytes_ptr = reinterpret_cast[uint8_pt](&byte_data[0])
+        self.ptr = new libfreenect2.Frame(
+            width, height, bytes_per_pixel, bytes_ptr)
+
     def __dealloc__(self):
         if self.take_ownership and self.ptr is not NULL:
             del self.ptr

This compiles OK.

Thanks a lot, I will test it tomorrow.

This works well when I input a NumPy array with dimension 1, e.g.
np.reshape(numpy_array,(512*424))
That's fine for me; however, it would be convenient to be able to pass a 2-D array as well.

r9y9 commented

That's what I meant by the comment # we may accept ndim=2 and convert it to ndim=1 here for convenience.

I'm guessing that relaxing the input type from np.ndarray[np.float32_t, ndim=1, mode="c"] to np.ndarray[np.float32_t, mode="c"] and then calling byte_data.ravel() or byte_data.reshape(-1) would just work.
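The flattening step can be tried out in plain Python before touching the Cython signature. A minimal sketch, assuming a 512x424 float32 depth frame; `as_flat_float32` is a hypothetical helper, not part of pylibfreenect2.

```python
import numpy as np


def as_flat_float32(data, width=512, height=424):
    """Accept a 1-D or 2-D array and return a flat C-contiguous float32 array."""
    arr = np.asarray(data)
    if arr.ndim not in (1, 2):
        raise ValueError(f"expected 1-D or 2-D input, got {arr.ndim}-D")
    if arr.size != width * height:
        raise ValueError(f"expected {width * height} elements, got {arr.size}")
    # ascontiguousarray copies only when the dtype or layout requires it;
    # reshape(-1) on a C-contiguous array returns a view, not a copy.
    return np.ascontiguousarray(arr, dtype=np.float32).reshape(-1)


depth2d = np.ones((424, 512), dtype=np.float32)
flat = as_flat_float32(depth2d)
```

Note that the check is on the element count only, so a transposed (512, 424) array would pass and then be flattened in the wrong pixel order; a stricter shape check may be preferable.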

That's it; I didn't understand the comment properly. I added it in pull request #51.