DeepSpace2/StyleFrame

API to style categorical variables

Closed this issue · 10 comments

It is often useful to style columns that have a pd.CategoricalDtype in one of the following ways:

  • supplying a mapping from category name/code to background colour.
  • supplying a mapping from category name/code to Styler.
  • selecting len(dtype.categories) colours from a matplotlib colormap (qualitative or otherwise).

So for example we might allow for:

StyleFrame(df).add_categorical_conditional_formatting(style_map={"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")})

or

StyleFrame(df).add_categorical_conditional_formatting(style_bg_cmap="Pastel1")

which would perform the following for each column

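cmap = matplotlib.colormaps[style_bg_cmap]  # matplotlib.cm.get_cmap in older matplotlib releases
# Index the colormap with one integer per category; cmap(n) alone would return a single colour.
col_style_map = dict(zip(col.dtype.categories, cmap(range(len(col.dtype.categories)))))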

Conditional formatting would be applied using a CellIsRule("=", category) for each category value in each column in range.
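To make the proposal concrete, here is a rough sketch of the per-category rules written directly against openpyxl, bypassing StyleFrame entirely. The example data, the `font_colors` mapping and the cell-range arithmetic are all assumptions made for illustration; this is not how StyleFrame implements anything.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import Font
from openpyxl.utils import get_column_letter

# Hypothetical example data: a single categorical column.
df = pd.DataFrame({"grade": pd.Categorical(["a", "b", "c", "a"])})

# Hypothetical mapping from category to an aRGB font colour.
font_colors = {"a": "FFFF0000", "b": "FF008000", "c": "FF0000FF"}

wb = Workbook()
ws = wb.active
ws.append(list(df.columns))            # header row
for row in df.itertuples(index=False):
    ws.append(list(row))               # data rows start at row 2

for col_idx, name in enumerate(df.columns, start=1):
    col = df[name]
    if not isinstance(col.dtype, pd.CategoricalDtype):
        continue                       # only categorical columns get rules
    letter = get_column_letter(col_idx)
    cell_range = f"{letter}2:{letter}{len(df) + 1}"
    for category in col.dtype.categories:
        # One "cell equals this category" rule per category; the style is
        # stored once on the rule rather than on every matching cell.
        rule = CellIsRule(
            operator="equal",
            formula=[f'"{category}"'],
            font=Font(color=font_colors[category]),
        )
        ws.conditional_formatting.add(cell_range, rule)

# wb.save("categorical.xlsx")  # uncomment to write the workbook to disk
```

The string category is wrapped in double quotes inside the formula because Excel compares the cell against a formula expression, not a raw Python value.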

Can this not be achieved with the already available methods (albeit in a more explicit way)?

How would you recommend styling categorical variables with the current API, @DeepSpace2?

Are you able to provide an example pd.DataFrame and the output sheet that you envision

StyleFrame(df).add_categorical_conditional_formatting(style_map={"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")})

will output?

Let's say that

style_map = {"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")}
sf.add_categorical_conditional_formatting(style_map=style_map)

translates roughly to:

sf = StyleFrame(df)
for value, styler in style_map.items():
    sf = sf.apply_style_by_indexes(sf == value, styler)

Ideally, however, it would be implemented with conditional formatting so that the styles do not need to be stored on each cell.

And ideally, provisions would be made to use the categorical dtype's cardinality to select from a colour palette...
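As a sketch of that palette selection, the helper below maps each category to a hex colour taken from a named matplotlib colormap. `category_colors` is a hypothetical name, and sampling the colormap at evenly spaced points (rather than taking its first n entries) is my own choice so that the same code works for both qualitative and continuous colormaps. It assumes matplotlib 3.5 or later for the `colormaps` registry.

```python
import pandas as pd
from matplotlib import colormaps
from matplotlib.colors import to_hex


def category_colors(dtype, cmap_name="Pastel1"):
    """Map each category of a CategoricalDtype to a '#rrggbb' colour.

    Hypothetical helper, not part of StyleFrame.
    """
    cmap = colormaps[cmap_name]
    n = len(dtype.categories)
    # Sample n evenly spaced positions in [0, 1]; max(..., 1) avoids a
    # division by zero when there is only one category.
    positions = [i / max(n - 1, 1) for i in range(n)]
    return {
        category: to_hex(cmap(pos))
        for category, pos in zip(dtype.categories, positions)
    }


dtype = pd.CategoricalDtype(["a", "b", "c"])
colors = category_colors(dtype)
```

The resulting dict could then feed a loop like the one above, for example by building one Styler per category with the colour as its background.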

Ah, I see.

The current API does support conditional formatting, but only using ColorScaleRule (wrapped by StyleFrame.add_color_scale_conditional_formatting).

However, the above example can be implemented without it, using somewhat more explicit code:

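from styleframe import StyleFrame, Styler  # the package was named StyleFrame in older releases

style_map = {"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")}
sf = StyleFrame({'1': ['a', 'b'],
                 '2': ['c', 'd']})
# For every category, style the matching cells one column at a time.
for value, styler in style_map.items():
    for col in sf.columns:
        sf.apply_style_by_indexes(sf[sf[col] == value], styler, cols_to_style=col)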

generating

[image: screenshot of the resulting sheet]

I agree it's not hard to do; the point is to support:

  • increased visibility of this styling approach for discrete-valued columns.
  • using a matplotlib colormap or other arbitrary palette where the user is indifferent to the specific colors used.