BIDS/colormap

Question: going from a viridis color back to its scalar value

Closed this issue · 10 comments

Apologies if this is the wrong place to ask, but since this seems to be the origin of viridis, it seemed a good place.

If I already have viridis-encoded data, what's an easy way to convert a given color back to the scalar value that produced it? In other words, if viridis(x) gives (r, g, b), how would you write an inverse_viridis(r, g, b) that gives x (assuming the range of x is known to be [0, 1])?

A brute-force method would be to find the closest RGB value in the _viridis_data list to your input RGB, and then linearly interpolate between the two adjacent entries in the list to get the exact value of x. However, this seems quite slow, and given the structure of viridis it looks like there might be assumptions one could exploit for a simpler algorithm that avoids a brute-force search?
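To make the question concrete, here is a minimal sketch of the brute-force idea (nearest lookup-table entry only; the interpolation refinement is left out). This is my own illustration, not code from the thread; it assumes matplotlib's 256-entry viridis table stands in for _viridis_data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Build the 256-entry viridis lookup table (RGB only, alpha dropped).
cmap = plt.get_cmap('viridis')
lut = cmap(np.linspace(0, 1, cmap.N))[:, :3]

def inverse_viridis(r, g, b):
    """Return the x in [0, 1] whose viridis color is nearest to (r, g, b)."""
    dist = np.sum((lut - [r, g, b]) ** 2, axis=1)
    return np.argmin(dist) / (cmap.N - 1)

print(inverse_viridis(*cmap(0.3)[:3]))  # ~0.298 (0.3 quantized to 256 levels)
```

Each call scans all 256 entries, which is exactly the linear search the question is hoping to avoid.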

Given that this is the new default for matplotlib, it's quite possible a lot of viridis-encoded data will end up out there in the wild with raw data being lost or inaccessible without reconstructing it, so this kind of function might be a good thing to have.

Thanks @endolith, that works perfectly -- though it does seem to lose information, since the output is only 8-bit? If I've understood your script correctly, this change should restore that lost information: https://gist.github.com/mkhorton/c5070520fef317ba7a3d8075d0c499ea/revisions

An efficient and general approach is to use a KDTree, as is done, for example, in this blog post: https://carreau.github.io/posts/24-Viridisify.html#Viridisify

(That post is extracting values from jet so they can convert to viridis, but the principle is the same no matter what color map you start with.)

There isn't really any special structure to viridis that I can see how to exploit -- it doesn't have a small closed-form representation in RGB space or anything like that. But using KDTree is pretty fast.
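A sketch of the KDTree approach (my own illustration, assuming scipy is available; the linked post uses the same idea to un-map jet):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import cKDTree  # assumption: scipy is installed

# Index the viridis lookup table once; queries are then O(log N) per pixel.
cmap = plt.get_cmap('viridis')
lut = cmap(np.linspace(0, 1, cmap.N))[:, :3]
tree = cKDTree(lut)

def invert(rgb):
    """Map an (..., 3) array of RGB values back to scalars in [0, 1]."""
    _, idx = tree.query(rgb)
    return idx / (cmap.N - 1)
```

Because cKDTree.query accepts arrays of points, this inverts a whole (H, W, 3) image in one call, e.g. invert(img[:, :, :3]).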

@mkhorton: images are only ~8 bit anyway :-). Maybe you can squeeze an extra bit out if you're lucky enough that the original plotting tool preserved that information (matplotlib doesn't). But it's also quite common to see colormaps discretized to like, 3 bits.

Thanks @njsmith, this is very useful -- I was under the mistaken impression that, since images are 8-bit/channel, an 'ideal' colormap could in principle encode 24 bits. But thinking about it, though the colorspace volume is big, the colormap is only a 1-D path through that volume, so I can see why it'd only be 8-bits practically.

(I'm looking into this to try to find a file format to lossily compress some scientific voxel data for storage/visualization, while still preserving as much information as possible. Encoding slices of this data as a viridis-image and compressing the slices together as a movie seemed a nice way to do it, provided there was a good way of decoding the data back! 8 bits should hopefully be plenty.)
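For one slice, the encode/decode round trip described here could look something like the following (an illustrative sketch, not anything from this thread; it assumes the data is pre-scaled to [0, 1], and uses a dict lookup for decoding since the encoder only ever emits exact lookup-table colors):

```python
import numpy as np
import matplotlib.pyplot as plt

cmap = plt.get_cmap('viridis')
data = np.random.rand(64, 64)  # stand-in for one pre-scaled voxel slice

# Encode: map scalars through viridis and store as an 8-bit RGB image.
rgb = (cmap(data)[:, :, :3] * 255).astype(np.uint8)

# Decode: look each pixel up in the identically-quantized 8-bit table.
lut = (cmap(np.linspace(0, 1, cmap.N))[:, :3] * 255).astype(np.uint8)
index = {tuple(c): i for i, c in enumerate(lut)}
decoded = np.array([[index[tuple(px)] for px in row]
                    for row in rgb]) / (cmap.N - 1)

print(np.abs(decoded - data).max())  # quantization error, roughly 1/256
```

Note this only survives lossless storage of the frames; once a lossy movie codec perturbs the pixel values, exact-match decoding fails and you are back to nearest-neighbor search.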

Encoding slices of this data as a viridis-image and compressing the slices together as a movie seemed a nice way to do it

Just be aware that fancy lossy compression algorithms can introduce structured artifacts. Probably the most extreme and famous case is the Xerox copiers that rearrange numbers, but in general movie compression algorithms will do all sorts of stuff that your eye doesn't notice, but that later data analysis code might. Or might not!

only a 1-D path through that volume, so I can see why it'd only be 8-bits practically.

Do you mean that there are only 256 colors in the lookup table? That's the default in MPL (cm.get_cmap('viridis').N), but it can be increased when you create a colormap, which does reduce visible striations on slow gradients.
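For reference, one way to ask matplotlib for a larger lookup table (this resamples the 256-entry source data, it doesn't add new measured colors):

```python
import matplotlib.pyplot as plt

cmap = plt.get_cmap('viridis')           # default lookup table size
print(cmap.N)                            # 256

hi_res = plt.get_cmap('viridis', 2048)   # resampled to 2048 entries
print(hi_res.N)                          # 2048
```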

Would probably be better if MPL used a higher N for its colormaps.

most extreme and famous case is the Xerox copiers that rearrange numbers

Off-topic, but I like this neural-network based compression that "compresses" images by changing pictures of shoes into pictures of different shoes.

The version of viridis that's in matplotlib does use 256 levels. That wasn't based on any real theory beyond "eh, that's probably big enough" -- if you have an example of where that causes visible striations I'd be very interested to see it!

Intuitively, though, I'd think that an image with 8 bits per channel is really going to struggle to represent more than roughly 8 bits of information in a colormap. The obvious baseline is that a "greys" colormap can only have 256 distinct levels, because that's how many distinct greys there are in 8-bit RGB. By drawing a curve you can potentially cover a bit more volume, and you can pick up a bit more resolution by sharing the low-order bits across channels (like a "greys" colormap that instead of going #000000 → #010101 goes #000000 → #000001 → #000101 → #010101), but the format is very limited.
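The bit-sharing trick above can be made concrete (illustrative only, not a real colormap): bumping blue, then green, then red by one count per step yields a grey-like ramp with 3 × 255 + 1 = 766 distinct 8-bit colors instead of 256.

```python
import numpy as np

levels = 3 * 255 + 1
ramp = np.zeros((levels, 3), dtype=np.uint8)
for i in range(levels):
    base, extra = divmod(i, 3)
    # extra cycles 0, 1, 2: bump B first, then G, then R catches up,
    # i.e. #000000 -> #000001 -> #000101 -> #010101 -> ...
    ramp[i] = [base, base + (extra >= 2), base + (extra >= 1)]
```

That's only ~1.5 extra bits of resolution, which is the "very limited" point being made.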

I guess you could use something like OpenEXR; it has image compression algorithms designed for high bit-depth images and minimal artifacts.

I can't find what I was doing when I saw it years ago, but it's visible if you do things like this:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 1, 0.001)
y = np.arange(0, 1, 0.002)
xx, yy = np.meshgrid(x, y, sparse=True)
with np.errstate(invalid='ignore'):  # the (0, 0) corner is 0/0
    z = -np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
print(z.shape)

plt.imshow(z, cmap='viridis')
plt.tight_layout()
plt.show()
```

[image: small viridis]

Ah, here's an interesting thread on the challenges of lossy compression: https://twitter.com/astrotweeps/status/883475694633705472