rdms.rank_transform treats nans as data
caiw opened this issue · 3 comments
When `RDMs.dissimilarities` contains `nan` values, calling `rdms.rank_transform` on it treats the `nan`s as data and (by default, when using `rank_transform(method='average')`) assigns them sequential integer ranks.
A better option for me (and perhaps the least surprising) would be if `nan` values were ignored (i.e. kept as `nan`) when running `rank_transform`.
I often pass an RDM through `rank_transform` when visualising it, like `show_rdms(rank_transform(rdms))`, and this causes weird colours when an RDM contains a lot of `nan`s (especially when entire rows/columns are missing).
The ability to change how `nan`s are handled by `scipy.stats.rankdata` was added in scipy v1.10.0 with the new `nan_policy` argument, which should be set to `'omit'` to achieve the behaviour described above.
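To illustrate the behaviour I have in mind, here is a rough sketch that ranks only the non-`nan` entries of each row and leaves the `nan`s in place. The helper name `rank_ignoring_nans` is made up for this example and is not part of the toolbox. It masks the `nan`s by hand so it does not depend on the scipy version; on a scipy that has `nan_policy`, passing `nan_policy='omit'` to `rankdata` should give the same result for a single vector.

```python
import numpy as np
from scipy.stats import rankdata


def rank_ignoring_nans(dissimilarities, method="average"):
    """Rank the non-nan entries of each row; nan entries stay nan.

    Hypothetical helper for illustration only, not the toolbox implementation.
    """
    dissimilarities = np.atleast_2d(np.asarray(dissimilarities, dtype=float))
    ranked = np.full(dissimilarities.shape, np.nan)
    for i, row in enumerate(dissimilarities):
        valid = ~np.isnan(row)
        if valid.any():
            # Only the valid entries compete for ranks.
            ranked[i, valid] = rankdata(row[valid], method=method)
    return ranked


example = np.array([[0.3, np.nan, 0.1, 0.3, np.nan]])
print(rank_ignoring_nans(example))
```

With this, the two tied `0.3` values share the average rank 2.5, `0.1` gets rank 1, and both `nan` positions remain `nan`, so they would still show up as missing when plotted.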
Agreed, let's fix this