scijs/get-pixels

Incorrect colors when loading JPEG images.

Closed this issue · 7 comments

When loading a JPG with get-pixels, the colors are, more often than not, a little bit off. Take this simple image and code, for example:

jpg_test

var getPixels = require('get-pixels');

getPixels("png_test.png", function(err, pixels) {
    if (err) {
        console.error("Failed to load image: " + err);
        return;
    }
    // Assumes 4 channels (RGBA) per pixel in the flat data array.
    for (var pixel = 0; pixel < 3; pixel++) {
        var i = pixel * 4;
        console.log("R: " + pixels.data[i] + ", G: " + pixels.data[i + 1] + ", B: " + pixels.data[i + 2]);
    }
});

All it does is print the RGB values of the first three pixels, according to get-pixels. The result is:

R: 231, G: 183, B: 105
R: 212, G: 166, B: 103
R: 207, G: 162, B: 117

These colors are, in fact, all wrong. The actual colors of the first three pixels are (230, 183, 103), (211, 165, 103), (206, 162, 117). Most of the values are off by one or two! This is a bit of a problem, especially when doing color analysis. I assume it's due to a rounding error or a mistake in the JPEG interpretation algorithm somewhere.

I also tested with a PNG - the same small errors did not appear there, so I assume the problem is JPEG-specific.
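To put a number on "off by one or two", here is a minimal sketch that compares the two sets of values quoted above, channel by channel. The helper name `maxChannelDiff` is made up for illustration and is not part of get-pixels.

```javascript
// Values copied from the report above.
var decoded = [[231, 183, 105], [212, 166, 103], [207, 162, 117]];   // get-pixels output
var reference = [[230, 183, 103], [211, 165, 103], [206, 162, 117]]; // expected colors

// Hypothetical helper: largest absolute per-channel difference
// between two equally long lists of [R, G, B] triples.
function maxChannelDiff(a, b) {
    var max = 0;
    for (var p = 0; p < a.length; p++) {
        for (var c = 0; c < 3; c++) {
            max = Math.max(max, Math.abs(a[p][c] - b[p][c]));
        }
    }
    return max;
}

console.log("Max channel difference: " + maxChannelDiff(decoded, reference));
```

For these three pixels, the largest difference in any channel is 2.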

It could be a bug in the JPEG decoder. Also, JPEG is a lossy format, so you shouldn't expect to get the same colors out as you put in.

Right, but if so, the problem is in the decoder get-pixels uses, not in the encoder used by whatever saved the image. The "actual colors" I presented are the actual colors of the JPEG, when a 100% working decoder is used.

The issue is in jpeg-js then. If you have a better decoder let me know and I'll update the dependency.

If you need exact pixel values every time, where errors of 1-2 matter, then JPEG is the wrong format. By definition it is a lossy format, and the expectation from different codecs is a perceptually similar result (to the point the naked eye can't tell the difference), not the exact same output every time. If a certain codec vendor decided that they could get what they consider better results by adding a random number to each pixel for some reason, that would be acceptable.

You should be using a lossless format where the exact result, pixel-to-pixel, is encoded into the data. PNG is the most likely candidate format here.

@mikolalysenko I recommend closing this issue.

@gunderson Could you source that? I asked on StackOverflow a while back if JPEGs have canonical representation, or if the spec allows for some leeway here and there, but I never got an answer.

It's not about whether compression is lossy or not - if the same file differs between jpeg-js and more mainstream decoders, that might be a problem. Of course the JPEG spec has some limit as to how inexact representations can be - otherwise just displaying a single color and calling it "just a bit inexact" would be up to spec.

If JPEGs do have an exact canonical representation, this should be fixed. If not, well, sure. Small discrepancies between decoders are fine for most cases.

Here is the actual spec: https://www.w3.org/Graphics/JPEG/itu-t81.pdf

The relevant sections are 6 & 7 (p. 23). The only requirement for any codec is to encode/decode to "appropriate accuracy". There is no canonical representation of the compressed data or decompression from that data.

You may expect that the same codec will produce the same output for the same input, but different codecs likely will not, nor are they even suggested to do so in the spec. As JPEG is a lossy format, the nature of it means that data is discarded (effectively at random as far as humans are concerned), so it is understood that the input and output won't match up exactly, but instead perceptually.
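As a rough illustration of how two reasonable decoders can disagree by one level, consider the final YCbCr-to-RGB step. The sketch below converts the same sample with floating-point arithmetic (JFIF coefficients, rounded to nearest) and with an 8-bit fixed-point approximation (coefficients scaled by 256, result floored by the shift). Both function names and the sample values are invented for this example; this is not a claim about what jpeg-js or any other decoder actually does, and the inverse DCT is another place where rounding can differ.

```javascript
// Clamp a value to the 8-bit range.
function clamp(v) {
    return Math.max(0, Math.min(255, v));
}

// Floating-point conversion with the JFIF coefficients, rounded to nearest.
function ycbcrToRgbFloat(y, cb, cr) {
    return [
        clamp(Math.round(y + 1.402 * (cr - 128))),
        clamp(Math.round(y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128))),
        clamp(Math.round(y + 1.772 * (cb - 128)))
    ];
}

// Fixed-point approximation: coefficients scaled by 256 and rounded
// (359, 88, 183, 454); the >> 8 shift floors the product.
function ycbcrToRgbFixed(y, cb, cr) {
    return [
        clamp(y + ((359 * (cr - 128)) >> 8)),
        clamp(y - ((88 * (cb - 128) + 183 * (cr - 128)) >> 8)),
        clamp(y + ((454 * (cb - 128)) >> 8))
    ];
}

// Made-up sample: Y = 200, Cb = 100, Cr = 150.
console.log(ycbcrToRgbFloat(200, 100, 150)); // [ 231, 194, 150 ]
console.log(ycbcrToRgbFixed(200, 100, 150)); // [ 230, 194, 150 ]
```

The two results differ by one in the red channel, which is the same size of discrepancy as reported at the top of this thread.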

@gunderson Thank you very much for the reference! Wonderful. "Appropriate accuracy", huh. I guess this is fine, then.

Oh, and it may seem like I'm being nit-picky, but even lossy formats may have rigid requirements for decoding. Even if it doesn't match the raw image, for some lossy formats there may still be an objectively correct representation.