VisionLabs/torch-opencv

missing perspectiveTransform

joshuahhh opened this issue · 3 comments

Hi! This library seems to be missing perspectiveTransform. Is there a particular reason this might be? (Or maybe there is just more work needed to wrap a lot more functions?)

Thanks a bunch for your work! 😃

Hi! Well, yes, there's a reason: we decided not to wrap the core module of OpenCV because it mainly contains non-CV functionality that is already available in Torch, or can at least be easily built from existing functionality: optimization algorithms, linear system solvers, SVD, common math operations, algorithms on vectors and matrices, etc.

In particular, perspectiveTransform can be implemented as follows:

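function perspectiveTransform(src, M)
    -- src: N x C tensor of points; M: (C+1) x (C+1) transformation matrix
    -- number of channels
    local C = M:size(1) - 1

    -- linear part: multiply by the first C columns of M (result is N x (C+1))
    local vecPrime = src * M:t()[{{1,C}, {}}]
    -- translation part: add the last column of M
    for i = 1,C+1 do
        vecPrime[{{}, i}]:add(M[{i, C+1}])
    end

    -- homogeneous coordinate w; zeros become inf so the division below yields 0
    local wPrime = vecPrime[{{}, C+1}]
    wPrime:apply(function(x) return x == 0 and math.huge or x end)
    for i = 1,C do
        vecPrime[{{}, i}]:cdiv(wPrime)
    end

    -- keep only the C transformed coordinates (drop the w column)
    return vecPrime[{{}, {1,C}}]
end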
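
For reference, here is the same computation written as a formula for the 2-channel case (C = 2, so M is 3 x 3). The symbols x', y', and w are notation introduced here for illustration; the w = 0 convention is the one the code above implements with math.huge.

```latex
\begin{pmatrix} x' \\ y' \\ w \end{pmatrix}
  = M \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad
(x, y) \;\mapsto\; \left( \frac{x'}{w},\; \frac{y'}{w} \right),
\qquad
\text{with } w \text{ replaced by } \infty \text{ when } w = 0 .
```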

This code did not work for me in transforming a contour (an Nx1x2 IntTensor), which is what I usually use perspectiveTransform for.

I'm using the code below. It works for transforming contours, but I'm sure it does not cover everything that OpenCV's perspectiveTransform handles. Also, I'm a Torch novice, so this is probably not the cleanest or most efficient implementation.

local function perspectiveTransform(src, M)
  -- src: N x 1 x 2 contour; converted to double so integer contours work
  -- (assumes M is a DoubleTensor)
  src = src:double()
  local result = src:clone()

  for i = 1, src:size(1) do
    -- homogeneous row vector (x, y, 1)
    local v = torch.Tensor(1, 3)
    v[1][1] = src[i][1][1]
    v[1][2] = src[i][1][2]
    v[1][3] = 1
    local vprime = v * M:t()
    local z = vprime[1][3]
    -- guard against division by zero: dividing by inf maps the point to (0, 0)
    if z == 0 then z = math.huge end
    result[i][1][1] = vprime[1][1] / z
    result[i][1][2] = vprime[1][2] / z
  end

  return result
end

Yes, my code is for N x 3 or N x 2 Tensors, so one should squeeze it first.
It can be very simply changed to support both N x 1 x [3|2] and N x [3|2]:

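function perspectiveTransform(src, M)
    -- accept both N x 1 x C and N x C input
    local is3D = src:nDimension() == 3
    if is3D then
        -- drop only the singleton middle dimension, so N == 1 also works
        src = src:squeeze(2)
    end
    -- match M's type so that e.g. IntTensor contours can be multiplied
    src = src:typeAs(M)

    -- number of channels
    local C = M:size(1) - 1

    local vecPrime = src * M:t()[{{1,C}, {}}]
    for i = 1,C+1 do
        vecPrime[{{}, i}]:add(M[{i, C+1}])
    end

    local wPrime = vecPrime[{{}, C+1}]
    wPrime:apply(function(x) return x == 0 and math.huge or x end)
    for i = 1,C do
        vecPrime[{{}, i}]:cdiv(wPrime)
    end

    -- keep only the C transformed coordinates (drop the w column)
    local retval = vecPrime[{{}, {1,C}}]
    if is3D then
        return retval:reshape(retval:size(1), 1, retval:size(2))
    end
    return retval
end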
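
When comparing the Torch versions above, it can help to have a dependency-free reference to check a few points against. The following is only a sketch in plain Lua (no Torch), limited to 2D points and a 3 x 3 matrix given as nested tables. The name perspectiveTransformRef is a hypothetical helper introduced here, not part of torch-opencv or OpenCV.

```lua
-- Plain-Lua reference (no Torch): applies a 3x3 matrix M to a list of {x, y} points.
-- perspectiveTransformRef is a hypothetical name used only for this sketch.
local function perspectiveTransformRef(points, M)
    local out = {}
    for i, p in ipairs(points) do
        local x, y = p[1], p[2]
        -- homogeneous coordinates: (x', y', w) = M * (x, y, 1)
        local xp = M[1][1] * x + M[1][2] * y + M[1][3]
        local yp = M[2][1] * x + M[2][2] * y + M[2][3]
        local w  = M[3][1] * x + M[3][2] * y + M[3][3]
        if w == 0 then
            -- same effect as dividing by math.huge: the point maps to (0, 0)
            out[i] = {0, 0}
        else
            out[i] = {xp / w, yp / w}
        end
    end
    return out
end

-- Example: scale by 2, then translate by (10, 20)
local M = {
    {2, 0, 10},
    {0, 2, 20},
    {0, 0, 1},
}
local contour = {{0, 0}, {1, 0}, {1, 1}}
local result = perspectiveTransformRef(contour, M)
for i, p in ipairs(result) do
    print(i, p[1], p[2])
end
```

With this matrix the three contour points map to (10, 20), (12, 20), and (12, 22). Feeding the same points and matrix to a Torch implementation as tensors should give the same coordinates.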