bedapub/besca

bc.pl.update_qualitative_palette is not working properly

ajulienla opened this issue · 2 comments

The category-to-color dict is not applied correctly. Colors and labels do not match as they should.

Minimal example:

import besca as bc
import scanpy as sc
import seaborn as sns

test = bc.datasets.Kotliarov2020_processed()
sc.pl.umap(test, color=['celltype2'])
uniqueValues = test.obs.get("celltype2").unique().tolist()
yy = sns.color_palette("tab10", len(uniqueValues), as_cmap=False).as_hex()

## EXPECTED: ONE COLOR SCALE, THEN THE REVERSED SCALE

color_palette = {i:j for i, j in zip(uniqueValues, yy)}
color_palette2 = {i:j for i, j in zip(uniqueValues[::-1], yy)}

bc.pl.update_qualitative_palette(test, color_palette, "celltype2" )
sc.pl.umap( test, color = ['celltype2'] )
bc.pl.update_qualitative_palette(test, color_palette2, "celltype2" )
sc.pl.umap( test, color = ['celltype2'] )

## CHECKING THE COLOR VALUES SHOWS THAT THE KEY-VALUE RELATIONSHIP IS NOT PRESERVED
print(color_palette)
print(color_palette2)

Waiting for the scanpy update, as it seems that there have been some changes there too.

The root cause is that scanpy does not read the mapping between categories and colors from a single dictionary. The categories (e.g. 'B cell') are read from the AnnData object (pd.Categorical(adata.obs["celltype2"]).categories), and the colors are defined in a separate list (adata.uns[group + "_colors"]). Although you can change the colors in the AnnData object, the order of the categories stays the same. For details have a look here:
scanpy/plotting/_tools/scatterplots.py: _get_palette (currently line 1175): this is where the mapping between category and color is done: dict(zip(values.categories, adata.uns[color_key])) with values = pd.Categorical(adata.obs[values_key]). I will rewrite the besca/pl/_update_palette.py function to be able to accept an input. For testing this I replaced the following line:

uniqueValues = test.obs.get("celltype2").unique().tolist()

with

uniqueValues = pd.Categorical(test.obs["celltype2"]).categories.tolist()

This ensures that we use the same mechanism scanpy uses to determine the categories, and that their order is preserved.
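
To illustrate the ordering problem outside of besca and scanpy, here is a minimal sketch using only pandas. The labels, the hex colors and the helper colors_in_category_order are made up for illustration and are not part of besca or scanpy. It assumes, as described above, that colors are stored as a positional list and paired with the categories via zip.

```python
import pandas as pd

# Toy stand-in for adata.obs["celltype2"]; the labels are invented for this example.
labels = pd.Series(
    ["T cell", "B cell", "NK cell", "B cell", "T cell"], dtype="category"
)

# unique() follows the order of first appearance in the data ...
appearance_order = labels.unique().tolist()
# ... while the categories have their own order (sorted here, since pandas inferred them).
category_order = pd.Categorical(labels).categories.tolist()

# A user-supplied mapping from category to color (arbitrary hex values).
palette = {"T cell": "#ff0000", "B cell": "#00ff00", "NK cell": "#0000ff"}


def colors_in_category_order(values, palette):
    """Hypothetical helper: list the palette colors in the order of the categories."""
    categories = pd.Categorical(values).categories
    missing = [c for c in categories if c not in palette]
    if missing:
        raise KeyError(f"no color given for categories: {missing}")
    return [palette[c] for c in categories]


# Storing the dict values positionally pairs them with the wrong categories.
naive_mapping = dict(zip(category_order, palette.values()))

# Reordering the colors by category first keeps every key with its own color.
fixed_mapping = dict(zip(category_order, colors_in_category_order(labels, palette)))

print(appearance_order)
print(category_order)
print(naive_mapping == palette, fixed_mapping == palette)
```

With a list ordered like this, a zip against the categories reproduces the original dict, regardless of the order in which the dict was built.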