API to style categorical variables
Closed this issue · 10 comments
It is often useful to style columns having a pd.CategoricalDtype, by one of:
- being supplied a mapping from category name/code to background colour.
- being supplied a mapping from category name/code to Styler.
- selecting len(dtype.categories) colours from a matplotlib colormap (qualitative or otherwise).
So for example we might allow for:
StyleFrame(df).add_categorical_conditional_formatting(style_map={"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")})
or
StyleFrame(df).add_categorical_conditional_formatting(style_bg_cmap="Pastel1")
which would perform the following for each column
cmap = matplotlib.colormaps[style_bg_cmap]
# index the colormap once per category; cmap(n) alone would return a single colour
col_style_map = dict(zip(col.dtype.categories, cmap(range(len(col.dtype.categories)))))
Conditional formatting would be applied using a CellIsRule("=", category) for each category value in each column in range.
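A minimal sketch of how the colormap variant could look when written directly against openpyxl. The helper name add_categorical_rules is hypothetical and is not part of StyleFrame or openpyxl. It assumes the frame was written with its header in row 1 and no index column, that the categories are strings, and that the colormap has at least as many colours as there are categories (matplotlib clamps out-of-range indices to the last colour).

```python
import matplotlib
import pandas as pd
from matplotlib.colors import to_hex
from openpyxl import Workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import PatternFill
from openpyxl.utils import get_column_letter


def add_categorical_rules(ws, df, cmap_name="Pastel1"):
    """Attach one 'equal to' rule per category to each categorical column.

    Hypothetical helper: assumes df occupies ws from A1, header in row 1.
    """
    cmap = matplotlib.colormaps[cmap_name]
    last_row = len(df) + 1
    for col_idx, name in enumerate(df.columns, start=1):
        col = df[name]
        if not isinstance(col.dtype, pd.CategoricalDtype):
            continue  # leave non-categorical columns untouched
        letter = get_column_letter(col_idx)
        cell_range = f"{letter}2:{letter}{last_row}"
        for i, category in enumerate(col.dtype.categories):
            # Integer input indexes the colormap's lookup table directly.
            colour = to_hex(cmap(i)).lstrip("#").upper()
            fill = PatternFill(start_color=colour, end_color=colour, fill_type="solid")
            # String categories must be quoted inside the Excel formula.
            rule = CellIsRule(operator="equal", formula=[f'"{category}"'], fill=fill)
            ws.conditional_formatting.add(cell_range, rule)


df = pd.DataFrame({"grade": pd.Categorical(["a", "b", "a", "c"]),
                   "score": [1, 2, 3, 4]})
wb = Workbook()
ws = wb.active
ws.append(list(df.columns))
for row in df.itertuples(index=False):
    ws.append(list(row))
add_categorical_rules(ws, df)
```

Because the rules are attached to the column range, the fill is stored once per category rather than once per cell.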
Can this not be achieved with the already available methods (albeit in a more explicit way)?
How would you recommend styling categorical variables with the current API, @DeepSpace2?
Are you able to provide an example pd.DataFrame and the output sheet that you envision StyleFrame(df).add_categorical_conditional_formatting(style_map={"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")}) will output?
Let's say that
style_map = {"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")}
sf.add_categorical_conditional_formatting(style_map=style_map)
translates roughly to:
sf = StyleFrame(df)
for value, styler in style_map.items():
    sf = sf.apply_style_by_indexes(sf == value, styler)
Ideally, however, it would be implemented with conditional formatting so that the styles do not need to be stored on each cell.
And ideally, provisions would be made to use the categorical dtype's cardinality to select from a colour palette...
Ah, I see.
The current API does support conditional formatting, but only using ColorScaleRule (wrapped by StyleFrame.add_color_scale_conditional_formatting).
However, the above example can be implemented without it, using fairly explicit code:
from styleframe import StyleFrame, Styler

style_map = {"a": Styler(font_color="red"), "b": Styler(font_color="green"), "c": Styler(font_color="blue")}
sf = StyleFrame({'1': ['a', 'b'],
                 '2': ['c', 'd']})
# style each cell whose value matches a key of style_map, one column at a time
for value, styler in style_map.items():
    for col in sf.columns:
        sf.apply_style_by_indexes(sf[sf[col] == value], styler, cols_to_style=col)
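generating a sheet in which "a", "b" and "c" are rendered in red, green and blue font respectively, while "d" keeps the default style.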
I agree it's not hard to do; the point is to support:
- increased visibility of this styling approach for discrete-valued columns.
- using a matplotlib colormap or other arbitrary palette where the user is indifferent to the specific colors used.
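As a rough illustration of the second point, the sketch below maps categories to colours from either a colormap name or an explicit palette. The function name category_colours is hypothetical, and the threshold used to tell qualitative colormaps from continuous ones is only a heuristic assumed for this example. The resulting hex strings could then be passed to something like Styler(bg_color=...).

```python
from itertools import cycle

import matplotlib
import numpy as np
import pandas as pd
from matplotlib.colors import to_hex


def category_colours(dtype, palette="Pastel1"):
    """Map each category of a CategoricalDtype to a hex colour string.

    palette is a matplotlib colormap name or a sequence of colours.
    Palettes shorter than the number of categories are cycled.
    """
    n = len(dtype.categories)
    if isinstance(palette, str):
        cmap = matplotlib.colormaps[palette]
        if cmap.N <= 20:
            # Heuristic: a small lookup table suggests a qualitative map,
            # so take its colours in order.
            colours = [to_hex(cmap(i)) for i in range(cmap.N)]
        else:
            # Continuous map: sample n evenly spaced points in [0, 1].
            colours = [to_hex(cmap(x)) for x in np.linspace(0.0, 1.0, n)]
    else:
        colours = [to_hex(c) for c in palette]
    return dict(zip(dtype.categories, cycle(colours)))


dtype = pd.CategoricalDtype(["a", "b", "c"])
from_cmap = category_colours(dtype, "Pastel1")
from_list = category_colours(dtype, ["red", "#00ff00"])
```

With the two-colour list, the third category wraps around to the first colour, which is the price of accepting a palette smaller than the cardinality.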