Missing columns in RGB565 but present in RGB8888
toneck opened this issue · 18 comments
When trying to convert an image using the user provided buffer it works ok for the RGB8888, but has missing columns for RGB565. After some testing it happens when using the draw callback as well.
To Reproduce
- clone repo
- modify linux/main.c: ucPixelType = RGB565_BIG_ENDIAN //or little endian
- convert a file
- image with column mising
also some columns are switched. Notably columns that start at multiple of 128 on the x. Column 640->512 -> 384 -> 256 -> 128 ->0
Expected behavior
A clear image as it appears with RGB8888
Additional context
I was using your library to convert some images on an ESP32-cam since last year. (it was significantly faster than the provided library from Espressif) I needed RGB888 so I modified the code to produce RGB888 instead of RGB565.
I noticed in January that you added RGB888 and tried the linux/main example. I saw you added the option of supplying a buffer from the user. Hoping to skip the drawcallback I tried your example on pc first and it worked. I wanted to test my code with rgb565 to see if I get a higher fps so I tried to change from RGB8888 to RGB565 and it kept giving me empty columns. I tried using the callback function to save the data to a buffer and save that to a file. The resulted file had no missing columns.
I tried looking over the jpeg.inl to locate the problem, but I didn't notice anything wrong. I presumed I was doing something wrong so I stayed with the callback and RGB8888.
I noticed the issue #69 and thought it may be similar, but the latest fix didn't work for me.
Today I decided to try using a callback function again, but now I see the columns when using the callback and when using the buffer. I'm not sure if before I did something different or if this is a new bug.
To test that I didn't do anything wrong, I tried a few previous commits, using the same callback function that just saves everything into a buffer. All tests are done with the tulips header image.
Commit e6293f0 works as intended.
Commit 6dd0375 segfaults after a few rows of decoding the image in RGB565.
Commit f6a46aa outputs columns.
Also, there is no difference between RGB 565 little or big endian for the bugged commits, they both output little endian. This problem doesn't appear in e6293f0.
Are you running the code on an ESP32 or ESP32-S3? If the S3, then it looks like a memory alignment issue with the output buffer.
I also think it's an alignment issue.
I didn't have my esp on hand, so I ran it on my phone in termux (aarch64)
That being said, I am quite sure I had this problem on my laptop (64bit intel) about a month ago.
I'll be able to test it on the esp in about 2 hours
Ok, so, I was able to test it both on my laptop (amd64) and on my esp32.
It appears to work on both of them. The problem seems to be only on aarch64.
I am curious if it would appear on a raspberry pi.
Thanks for narrowing it down; I'll investigate.
Fixed the NEON code. Please confirm.
I didn't have time yesterday, but I managed to test it today.
The columns are not missing anymore.
Thanks for the help
On the other hand, the bug with the RGB565 not outputing big endian is still present.
I did some tests and this bug affects just decoding at full scale.
Decoding at half, quarter or eighth will output little endian for RGB565_LITTLE_ENDIAN and big endian for RGB565_BIG_ENDIAN
Decoding at full size will always output little endian, even for RGB565_BIG_ENDIAN.
Should I close this bug and open a different bug for that one?
I can fix that issue here too. I'll get to it today.
Please give it a try now
Thank you for your help, unfortunately, it gives a compilation error
In file included from main.c:11:
../src/jpeg.inl:3565:26: error: incompatible type for argument 1 of ‘vrev16q_u8’
3565 | vrev16q_u8(u168Temp);
| ^~~~~~~~
| |
| uint16x8_t
In file included from ../src/jpeg.inl:56,
in jpeg.inl:
u88B = vqrshrun_n_s16(i168B, 4); // shift right, narrow and saturate >
u168Temp = vshll_n_u8(u88R, 8); // place red in upper part of 16-bit >
u168Temp2 = vshll_n_u8(u88G, 8); // shift green elements to top of 16>
u168Temp = vsriq_n_u16(u168Temp, u168Temp2, 5); // shift green elemen>
u168Temp2 = vshll_n_u8(u88B, 8); // shift blue elements to top of 16->
u168Temp = vsriq_n_u16(u168Temp, u168Temp2, 11); // shift blue elemen>
if (ucPixelType == RGB565_BIG_ENDIAN) { // reverse the bytes
vrev16q_u8(u168Temp);
}
vst1q_u16((uint16_t *)pOutput, u168Temp); // top left block
// top right block
i168Temp = vqdmulhq_lane_s16(i168Crx2.val[1], i164Constants, 0); // C>
i168R = vaddq_s16(i168Temp, i168Y); // now we have 8 R values
i168Temp = vqdmulhq_lane_s16(i168Crx2.val[1], i164Constants, 1); // C>
u88R = vqshrun_n_s16(i168R, 4); // narrow and saturate to 8-bit unsig>
i168G = vaddq_s16(i168Y, i168Temp);
interesting; on MacOS that doesn't give a warning or error. I'll fix it - thanks.
I think the problem is that the variable is declared as uint16x8_t
uint16x8_t u168Temp, u168Temp2;
it appears that vrev16q_u8 expects a uint8x16_t : link
yes, of course I see the issue, but some compilers allow you to pass 128-bit registers without making a fuss about the implied type.
ok, I added the requisite vreinterpretq_xx_xx wrappers. Please let me know if your compiler accepts it now.
Is the fix working for you?
Hi again,
Sorry for the very late reply. I got busy and forgot.
I just checked all three formats RGB565 little and BIG endian and RGB888 for all scalings and they work as they should.
Thank you for your help