thohemp/6DRepNet

RGB inputs or BGR inputs for model.predict(img)?

BryanGuillaume opened this issue · 4 comments

First, thanks for this great work and for sharing your code.

In the running example,

img = cv2.imread('/path/to/image.jpg')
pitch, yaw, roll = model.predict(img)

the image is loaded as a BGR numpy array, since that is OpenCV's default channel order. However, I think the model was trained on RGB numpy arrays, as the images were opened using PIL.Image.open. Thus, I am wondering whether we should convert the BGR arrays to RGB before passing them to the model.

Would you mind clarifying this?

Hello,

The predict method converts the image to RGB, so it's okay to pass a BGR array.
https://github.com/thohemp/6DRepNet/blob/master/sixdrepnet/regressor.py#L67

Hello,

Thanks for your reply. I have already seen that in the code. The problem is the line above (https://github.com/thohemp/6DRepNet/blob/master/sixdrepnet/regressor.py#L66):

img = Image.fromarray(img)

that, I think, by default turns an array with three channels into an RGB PIL image, as it does not know which channel holds which color. Thus img is already an RGB PIL image, but with its R and B channels swapped. Converting it to an RGB PIL image should not swap its channels back. Though I might be wrong. If I have some time today, I will add some print calls around the code to check whether I am correct.
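
The point can be checked without the model. Below is a minimal sketch, assuming only NumPy and Pillow are installed; the one-pixel array is a made-up stand-in for what cv2.imread would return, and none of the variable names come from the 6DRepNet code.

```python
import numpy as np
from PIL import Image

# A 1x1 "BGR" image as OpenCV would return it: pure blue is (B=255, G=0, R=0).
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Image.fromarray cannot know the channel order; it labels the three channels as RGB as-is.
pil_img = Image.fromarray(bgr)
mode = pil_img.mode

# Converting an image that is already in RGB mode leaves the pixel values unchanged.
pixel = pil_img.convert("RGB").getpixel((0, 0))

print(mode, pixel)  # RGB (255, 0, 0): the blue pixel is now read as red
```

If this behaves as expected, the convert call alone cannot undo the BGR ordering.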

You are correct, I just tested it. Swapping channels before using PIL should fix this:

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
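
For a three-channel uint8 image, this conversion amounts to reversing the last axis of the array. The sketch below shows the same swap with NumPy only, which may be handy for checking the fix where OpenCV is not available; bgr_array_to_rgb_pil is a hypothetical helper name, not part of 6DRepNet.

```python
import numpy as np
from PIL import Image


def bgr_array_to_rgb_pil(bgr):
    """Hypothetical helper: reverse the channel axis, then wrap the array as a PIL image."""
    rgb = np.ascontiguousarray(bgr[..., ::-1])  # B, G, R -> R, G, B
    return Image.fromarray(rgb)


# Pure blue in BGR order, as in an array loaded by OpenCV.
blue_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
fixed_pixel = bgr_array_to_rgb_pil(blue_bgr).getpixel((0, 0))

print(fixed_pixel)  # (0, 0, 255): blue is read as blue again
```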

Thanks!

Thanks for checking this. Happy that it helped fix a bug in the end!