NVlabs/Deep_Object_Pose

interchanged x and y coords in train2's createBeliefMap()

wetoo-cando opened this issue · 3 comments

@mintar Why are the x and y coordinates of the key points interchanged in this line?

        p = [point[numb_point][1],point[numb_point][0]]

Something to do with col vs. row major in OpenCV.

In OpenCV, and more generally in many image processing and computer vision libraries, images are treated as matrices or 2D arrays. The convention here is that the first index refers to the row (vertical position) and the second index refers to the column (horizontal position). This is why you often see im[y, x] (row, column) instead of im[x, y].

Let's break it down:

Matrix Representation: In the context of matrices, it's standard to refer to positions with row and column indices, where the row index comes first. This is the convention used in mathematics for matrices and also adopted in many programming languages for 2D arrays. In the case of images, each pixel's location is thus identified first by its row (which corresponds to the y coordinate in a Cartesian system) and then by its column (x coordinate).

Rows and Columns vs. X and Y: In a Cartesian coordinate system, we're used to x (horizontal axis) and then y (vertical axis). However, in matrix notation, which follows the row and column approach, it flips to row, column (or y, x in Cartesian terms). This is because when dealing with matrices, the emphasis is on moving down through rows first, then across columns, which aligns with how images are processed and stored in memory (row by row).

Practical Example: If you want to access the pixel at the Cartesian coordinate (x=10, y=20) in an image using OpenCV, you would access it using image[20, 10], since the 20 (y value) corresponds to the 21st row (with 0-based indexing), and the 10 (x value) corresponds to the 11th column.
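
To make the example above concrete, here is a minimal sketch using a plain NumPy array as a stand-in for an image loaded with OpenCV. The image size, the keypoint values, and the variable names are assumptions chosen for illustration; this is not the code from train2, only the same (x, y) to (row, column) swap as in the quoted line.

```python
import numpy as np

# A blank grayscale image. The shape is (rows, columns), i.e. (height, width).
height, width = 40, 30
image = np.zeros((height, width), dtype=np.uint8)

# Hypothetical keypoint given in Cartesian order: x is horizontal, y is vertical.
x, y = 10, 20

# Array indexing takes the row (y) first, then the column (x).
image[y, x] = 255

# The same swap as in the line quoted in the issue: (x, y) -> (row, column).
point = [x, y]
p = [point[1], point[0]]

# Locate the pixel that was set, to confirm where it ended up.
rows, cols = np.nonzero(image)
print(image.shape)       # (40, 30)
print(p)                 # [20, 10]
print(rows[0], cols[0])  # 20 10
```

Indexing with image[x, y] here would write to row 10, column 20 instead, which is a different pixel, and it would raise an IndexError whenever x is at least the image height.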

That is ChatGPT 4's answer. I think it is better than I could have produced.

OK, thanks @TontonTremblay. I'll close the issue for now.