r-spatial/discuss

Harmonise standard coloring in various plot methods

Opened this issue · 13 comments

At the moment we get very different outputs when plotting spatial data (vector, raster, categorical or continuous) depending on which package/method is used. I wonder if we should put some effort into harmonising the standard colors used for the different methods? In my opinion, the different ways to visualise spatial data should produce rather similar results, so that users can focus on interpreting their data rather than having to make sense of the differences in the color mapping.

Given that there are some more questions to address beyond plain coloring, e.g. how the mapping is done (see e.g. here) it may make sense to think about a dedicated package for this.

Here is a small script comparing some of the major methods.

library(sf)
library(sp)
library(tmap)
library(mapview)
library(raster)

#' ###########################################################################
#' ## 1. vector data
#' ###########################################################################
#'
#' ### 1.1. categorical
#'
feat_cat = franconia[, "district"]
feat_cat$district = as.factor(feat_cat$district)
#'
#' #### 1.1.1. sf - plot
plot(feat_cat)

#'
#' #### 1.1.2. sp - spplot
spplot(as(feat_cat, "Spatial"), zcol = "district")

#'
#' #### 1.1.3. tmap - qtm
qtm(feat_cat, fill = "district")

#'
#' #### 1.1.4. mapview - mapview
mapview(feat_cat, zcol = "district", legend = TRUE)


#'
#' ### 1.2. continuous
#'
feat_con = franconia[, "SHAPE_AREA"]

#' #### 1.2.1. sf - plot
plot(feat_con)

#'
#' #### 1.2.2. sp - spplot
spplot(as(feat_con, "Spatial"), zcol = "SHAPE_AREA")

#'
#' #### 1.2.3. tmap - qtm
qtm(feat_con, fill = "SHAPE_AREA")

#'
#' #### 1.2.4. mapview - mapview
mapview(feat_con, zcol = "SHAPE_AREA")



#' ###########################################################################
#' ## 2. raster data
#' ###########################################################################
#'
rst = poppendorf[[4]]

#'
#' ### 2.1. raster - plot/spplot
plot(rst)
spplot(rst)

#'
#' ### 2.2. tmap - qtm
qtm(rst)

#'
#' ### 2.3. mapview - mapview
mapview(rst)

#'
#' ### 2.4. stars - ???

Pinging @edzer @rsbivand @mtennekes @mdsumner but feedback from everyone interested is highly encouraged.

I'd still love to get input here, as I think this can be beneficial for a lot of users. Having the same sort of feel/look for map creation independent of the rendering mechanism would ensure a certain level of consistency and could be a first step towards #13

A natural candidate package for streamlining colors would IMHO be colorspace. It has a solid foundation, is actively maintained and provides various choices for all relevant color scales needed for spatial data vis. It would just be a matter of agreeing upon the several palettes for different data types.

As I said, input is very welcome...

I like the idea. If you as the r-spatial developers agree on a unifying interface for your packages, we can also adapt colorspace to provide a suitable interface. Just like we do now for ggplot2 scales.

Maybe thinking about palettes as a function(n) is already sufficient - as discussed with @tim-salabim on Twitter: https://twitter.com/TimSalabim3/status/1104075964651315200
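To make the function(n) idea concrete, here is a minimal sketch in base R. The helper name colors_for_classes() is hypothetical and only illustrates that a plotting method would need nothing more than a callable that returns n colors.

```r
# Any function that takes n and returns n colors can act as a palette.
pal_ramp <- grDevices::colorRampPalette(c("navy", "white", "firebrick"))
pal_heat <- grDevices::heat.colors

# Hypothetical consumer (not part of any package): a plot method would
# only need to call pal(n) and check that it got n colors back.
colors_for_classes <- function(pal, n) {
  stopifnot(is.function(pal))
  cols <- pal(n)
  stopifnot(length(cols) == n)
  cols
}

ramp_cols <- colors_for_classes(pal_ramp, 5)
heat_cols <- colors_for_classes(pal_heat, 3)
```

Under this convention, switching palettes is a matter of passing a different function, regardless of which package provides it.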

But possibly some packages/functions require more granular information or customization? For example meta-information about the kind of palette (qualitative, sequential, diverging, ...) would be relevant or whether it is intended for white vs. dark background etc.
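One possible way to carry such meta-information, sketched under the assumption that attributes on the palette function are enough. The constructor new_palette() and the attribute names are hypothetical and do not correspond to an existing colorspace interface.

```r
# Hypothetical constructor: wrap a function(n) and record metadata
# about the palette as attributes on the function itself.
new_palette <- function(fun,
                        type = c("qualitative", "sequential", "diverging"),
                        background = c("light", "dark")) {
  stopifnot(is.function(fun))
  structure(fun,
            type = match.arg(type),
            background = match.arg(background))
}

pal <- new_palette(grDevices::heat.colors, type = "sequential")

# The wrapped object is still callable as function(n) ...
pal_cols <- pal(4)
# ... and a plotting package could inspect the metadata before using it.
pal_type <- attr(pal, "type")
pal_bg   <- attr(pal, "background")
```

Packages that only need colors could ignore the attributes, while packages that want to pick a sensible default per data type could read them.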

A recent fruitful twitter conversation with the colorspace maintainer Achim Zeileis outlines their willingness to help us with enabling usage of colorspace for streamlining palette use for spatial data plotting independent of the rendering engine.

I'd really like to see some sort of streamlining on this topic, as I think it is really helpful for users to expect similar outputs for the same data being plotted on different plotting engines.

Nice discussion.

I think our aim should not be to harmonize the defaults of color palettes, because this is difficult since the choice of color palette is influenced by many factors, including the type of plot/map and the taste of the package authors. For instance, I like the qualitative HCL palettes for ggplot plots, but somehow not for maps. For tmap, I used colorbrewer, which is, I think, a robust choice, but others may disagree. Actually, I don't mind when packages use different default palettes. I mean, I immediately recognize a ggplot plot by the used HCL colors.

What we definitely could, or maybe even should, do is to standardize the way to obtain palettes, and to make all palettes easily available for all viz packages (even though they have different defaults).

Some specific things that could be worthwhile to share:

  • The palettes that are currently used should be available in this color package. For instance, tmap and many other packages use palettes from colorbrewer and viridis.
  • I agree with @statibk that meta-info should be included. Also whether it is color-blind friendly.
  • For tmap I have created the tool tmaptools::palette_explorer(). Feel free to take ideas from it when we create a new color-picking tool. I like the tool colorspace::choose_palette() (especially the shiny mode), although it may be a bit too technical for plain users. So, in other words, whether we use colorspace or not, we should improve the user interaction such that both beginners and experts can use it.
  • Request: a diverging color palette in which the middle color is not (too) bright. For choropleths, missing values are usually plotted light grey, which is difficult to distinguish from the grey middle color that diverging palettes often have. Some colorbrewer palettes have yellow, but it would be nice to have other options as well.
  • For treemap, I used a special tree-structure palette based on the HCL colorspace (see https://mtennekes.github.io/downloads/publications/TreeColors_manuscript.pdf), which I still need to extract from treemap. Maybe we could include this palette as well?

Thanks @mtennekes for your thoughts!

If I understand you correctly you are saying that we should

  1. not "force" default palettes for mapping packages/functions
  2. find/agree upon a color package that provides the complete set of different defaults chosen

I think this is a fair approach. What I don't understand completely is whether you suggest that we should aim to create such a color package? If so, I'd have my reservations as there are (too) many color(palette) packages already.

Request: a diverging color palette in which the middle color is not (too) bright. For choropleths, missing values are usually plotted light grey, which is difficult to distinguish from the grey middle color that diverging palettes often have. Some colorbrewer palettes have yellow, but it would be nice to have other options as well.

Have you seen palettes Berlin, Lisbon & Tofino in colorspace?
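A quick way to check the dark middle color, as a sketch. It assumes R >= 3.6.0, where the same palette names are also available through grDevices::hcl.colors(), so no extra package is needed here. The brightness measure (mean of the RGB channels) is a rough proxy chosen for illustration, not a perceptual luminance.

```r
# Seven colors from the diverging "Berlin" palette (needs R >= 3.6.0).
div_cols <- grDevices::hcl.colors(7, palette = "Berlin")

# Rough brightness proxy: mean of the R, G, B channels (0 = black, 255 = white).
brightness <- colMeans(grDevices::col2rgb(div_cols))

# Same proxy for a typical light grey used for missing values.
na_brightness <- mean(grDevices::col2rgb("grey80"))

# Index 4 is the middle color; compare it with both ends and with the NA grey.
middle_is_darkest_vs_ends <- brightness[4] < min(brightness[c(1, 7)])
middle_differs_from_na    <- brightness[4] < na_brightness
```

If both comparisons hold, a light grey for missing values stays clearly separated from the palette's middle class.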

I agree with @statibk that meta-info should be included. Also whether it is color-blind friendly.

I agree completely!

Considering your thoughts, I am more than ever convinced that colorspace is a very good foundation to build upon. It even has a similar scheme to the default sp/sf colors colorspace:::bpy(10).

edzer commented

Agreeing on color would be one ambition; an easier to reach ambition might be to harmonize how we can manipulate color choices, e.g. how to specify

  • a set of colors for a set of features (e.g. as with the base plot col argument)
  • a color palette (sf has a pal argument which accepts a palette function, or a set of colors)
  • the number of color breaks
  • the strategy to define color breaks (regular, quantile etc.)
  • the parameters for a color legend (position, sizes etc)

Or is this really a new issue?
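As an illustration of what such a harmonised set of arguments could look like, here is a base R sketch. The function map_colors() and its argument names (pal, nbreaks, style) are hypothetical. They loosely follow the items listed above and are not the signature of any existing plot method. Note that the quantile strategy would fail in cut() if the data produce duplicated break values.

```r
# Hypothetical helper, not part of sf, sp, tmap or mapview.
map_colors <- function(x, pal = grDevices::heat.colors, nbreaks = 5,
                       style = c("regular", "quantile")) {
  style <- match.arg(style)
  # Strategy for the class breaks: equal-width or equal-count intervals.
  breaks <- switch(style,
    regular  = seq(min(x), max(x), length.out = nbreaks + 1),
    quantile = stats::quantile(x, probs = seq(0, 1, length.out = nbreaks + 1),
                               names = FALSE)
  )
  # Accept either a palette function(n) or a ready-made vector of colors.
  cols <- if (is.function(pal)) pal(nbreaks) else pal
  stopifnot(length(cols) == nbreaks)
  # Assign each value to its class and look up the color of that class.
  cls <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  list(breaks = breaks, col = cols[cls])
}

x <- c(1, 2, 3, 4, 100)
res_regular  <- map_colors(x, nbreaks = 4, style = "regular")
res_quantile <- map_colors(x,
                           pal = c("#FEE5D9", "#FCAE91", "#FB6A4A", "#CB181D"),
                           nbreaks = 4, style = "quantile")
```

With the skewed example vector, the regular strategy puts the first four values into one class, while the quantile strategy spreads them across classes, which is exactly the kind of difference users currently see between packages.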

  • a set of colors for a set of features (e.g. as with the base plot col argument)
  • a color palette (sf has a pal argument which accepts a palette function, or a set of colors)

Are you suggesting we should have both arguments col & pal? I think this could be nice from a user's point of view.

  • the number of color breaks
  • the strategy to define color breaks (regular, quantile etc.)

I assume most packages are using classInt to define breaks? mapview currently uses lattice::do.breaks but I have been wanting to have a look at classInt for a while now.

  • the parameters for a color legend (position, sizes etc)

There may be issues with what is possible in the upstream packages. leaflet only allows one of the four corners, whereas ggplot usually has the legend to the right of the plot.

Or is this really a new issue?

I don't think it's new, but I think we can do a bit better to help lower user confusion...

Just my quick thought - I agree with @edzer - there are two issues: one with a standard color palette and one with function arguments.
Regarding the first one, I do not know what the best approach is, but I am impressed with the colorspace package, and I think it could be useful as a color foundation in mapping packages.
Regarding the second one, we could start with creating a table (?) with mapping functions (columns) and their arguments related to colors (rows). This could be useful to see how they differ and how to minimize their differences...

Heads-up: https://developer.r-project.org/Blog/public/2019/04/01/hcl-based-color-palettes-in-grdevices/index.html may very well help, as grDevices::hcl.colors() from R 3.6 will provide most of what is needed without using extra packages.
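For reference, a minimal sketch of what this looks like in practice, assuming R >= 3.6.0. The palette names used below are ones I believe ship with grDevices; hcl.pals() lists what is actually available in a given installation.

```r
# Requires R >= 3.6.0, where grDevices gained hcl.colors() and hcl.pals().
seq_cols  <- grDevices::hcl.colors(5, palette = "viridis")  # sequential
qual_cols <- grDevices::hcl.colors(4, palette = "Dark 3")   # qualitative

# Palette names can be listed, optionally filtered by type.
all_pals <- grDevices::hcl.pals()
seq_pals <- grDevices::hcl.pals(type = "sequential")

# Wrapping a named palette gives a plain function(n) palette again.
pal_viridis <- function(n) grDevices::hcl.colors(n, palette = "viridis")
```

The last line ties this back to the function(n) idea: a named HCL palette becomes something any plotting function that accepts a palette function could consume.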

Yes, that is great news indeed!

FWIW, I now use palr::image_raster() and palr::image_stars() to "bake-in" a colour mapping as an RGB expanded object, so I can use plotRGB() or plot.stars() in the normal ways and it's the same for export to file, it allows the standard image(cols, breaks) thing which covers most image cases afaics. It's also directly useable in ggplot2::annotation_raster by conversion to array in [0,1].

Nice!