Pose Estimation For Planar Target
by Nghia Ho
http://nghiaho.com/?page_id=576

This is an OpenCV port of Robust Pose Estimation from a Planar Target (2006) by Gerald Schweighofer and Axel Pinz, based on their Matlab code from the link in the paper. It determines the pose of a planar target and can be used for Augmented Reality to track a planar object, such as a business card. The algorithm is very robust and does not require accurate calibration of the camera for Augmented Reality applications. I found that a rough guess for the FOV of my webcam worked.

Changes from last version

- Added a check to see if the root finding code returned any valid solutions (thanks to csantos)

Requirements

- OpenCV 2.x or later

Usage

On Linux, type 'make' to compile the demo code and './demo' to see how it works. On Windows you have to make your own project.

Note: Visual Studio users might get a compiler error about M_PI_2 being undefined; see http://msdn.microsoft.com/en-us/library/4hwaceh6.aspx for a fix.

To use this code in your own program you only need to call the function

    RPP::Rpp(const cv::Mat &model_3D, const cv::Mat &iprts,
             cv::Mat &Rlu, cv::Mat &tlu,
             int &it1, double &obj_err1, double &img_err1);

A description of the parameters is as follows (a complete end-to-end sketch is given in the Example section at the end of this README):

model_3D are the fixed 3D points of the object you want to track; they should never change throughout the program. For example, say we have a square card we want to track. We can arbitrarily choose (-1,-1,0), (-1,1,0), (1,1,0), (1,-1,0) as the 4 corner points in 3D space. The values you choose determine the reference frame that the estimated pose is relative to. Using the same example, (0,0,0) will be at the centre of the square card.

iprts are the normalised image points. You will need to form a typical 3×3 camera matrix, which we shall call K, to normalise the points. K typically looks like

        [ fx  0 cx ]
    K = [  0 fy cy ]
        [  0  0  1 ]

where fx/fy are the focal lengths (in pixels) and cx/cy are the optical centres. For Augmented Reality applications, I find you can get away with rough approximations. For example, say the webcam has a horizontal FOV of about 60 degrees and the image size is 640×480. Then the 4 parameters are:

    cx = 640/2 = 320
    cy = 480/2 = 240
    fx = fy = cx/tan(60/2 * pi/180) = 554.26

Now that you have K, to normalise the points simply do

    [ x' ]          [ x ]
    [ y' ] = K^(-1) [ y ]
    [ 1  ]          [ 1 ]

where x and y are the un-normalised image points.

Rlu is the returned 3×3 rotation matrix. If you pass in a valid 3×3 matrix it will be used as the initial guess of the rotation. In a tracking scenario you would continuously feed the returned Rlu back in as input for the next frame.

tlu is the returned 3×1 translation vector.

it1 is the number of iterations performed.

obj_err1 is the object projection error in 3D.

img_err1 is the image projection error, in normalised image point space.

License

Simplified BSD License, unless superseded by a license from the original Matlab code or the copy-and-pasted code in Rpoly.cpp.
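Example

A minimal end-to-end sketch. The header name RPP.h, the 3×N CV_64F point layout (one point per column, matching the original Matlab code) and the pixel coordinates of the detected corners are all assumptions, not taken from this package; check demo.cpp for the exact layout your copy expects.

    #include <opencv2/core/core.hpp>
    #include <cmath>
    #include <cstdio>
    #include "RPP.h" // assumed header name, adjust to match this package

    int main()
    {
        // Rough camera matrix for a 640x480 webcam with a ~60 degree
        // horizontal FOV, as described above.
        double cx = 320.0, cy = 240.0;
        double f  = cx / tan(60.0/2.0 * CV_PI/180.0); // ~554.26 pixels

        cv::Mat K = (cv::Mat_<double>(3,3) << f, 0, cx,
                                              0, f, cy,
                                              0, 0, 1);
        cv::Mat Kinv = K.inv();

        // Fixed 3D model points: a square card, one point per column,
        // with (0,0,0) at the centre of the card.
        cv::Mat model_3D = (cv::Mat_<double>(3,4) << -1, -1, 1,  1,
                                                     -1,  1, 1, -1,
                                                      0,  0, 0,  0);

        // Made-up pixel coordinates of the 4 detected corners, in the
        // same order as the model points. In a real program these come
        // from your own corner/marker detector.
        double px[4] = {200, 210, 430, 420};
        double py[4] = {150, 320, 310, 160};

        // Normalise the image points: [x'; y'; 1] = K^-1 [x; y; 1]
        cv::Mat iprts(3, 4, CV_64F);
        for(int i = 0; i < 4; i++) {
            cv::Mat p = (cv::Mat_<double>(3,1) << px[i], py[i], 1);
            cv::Mat n = Kinv * p;
            n.copyTo(iprts.col(i));
        }

        cv::Mat Rlu, tlu; // Rlu left empty: no initial rotation guess
        int it1 = 0;
        double obj_err1 = 0, img_err1 = 0;

        RPP::Rpp(model_3D, iprts, Rlu, tlu, it1, obj_err1, img_err1);

        printf("iterations: %d, obj err: %f, img err: %f\n",
               it1, obj_err1, img_err1);
        return 0;
    }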
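For the tracking feedback, a sketch of the per-frame loop is below, re-using the model_3D and Kinv from the previous example. findCardCorners() is a hypothetical stand-in (stubbed out here) for your own detector, and the same assumptions about RPP.h and the point layout apply.

    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include "RPP.h" // assumed header name, adjust to match this package

    // Hypothetical stand-in: replace with code that finds the 4 card
    // corners in pixel coordinates and normalises them by K^-1 into a
    // 3x4 CV_64F matrix, as in the previous example.
    static bool findCardCorners(const cv::Mat &frame, const cv::Mat &Kinv,
                                cv::Mat &iprts)
    {
        (void)frame; (void)Kinv; (void)iprts;
        return false; // stub only, no detector implemented here
    }

    static void trackLoop(const cv::Mat &model_3D, const cv::Mat &Kinv)
    {
        cv::VideoCapture cap(0);
        cv::Mat frame, Rlu, tlu; // Rlu starts empty
        int it1;
        double obj_err1, img_err1;

        while(cap.read(frame)) {
            cv::Mat iprts;
            if(!findCardCorners(frame, Kinv, iprts))
                continue;
            // A valid Rlu from the previous frame is re-used as the
            // initial rotation guess for the current frame.
            RPP::Rpp(model_3D, iprts, Rlu, tlu, it1, obj_err1, img_err1);
            // ... draw your augmentation using Rlu and tlu ...
        }
    }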