hsplit, vsplit inconsistent with numpy?
Closed this issue · 3 comments
Description
Hey! While going over scverse/scanpy-tutorials#97 I noticed a couple things and thought I would follow them up here.
Marsilea's definitions of hsplit and vsplit seem inconsistent with what I'd expect coming from numpy. They seem to actually act along opposite axes.
Example
import numpy as np
import marsilea as mars
X = np.arange(12).reshape(4, 3)
display(X)
mars.Heatmap(X).render()
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
display(np.hsplit(X, [2]))
m = mars.Heatmap(X)
m.hsplit([2], spacing=0.1)
m.render()
[array([[ 0,  1],
        [ 3,  4],
        [ 6,  7],
        [ 9, 10]]),
 array([[ 2],
        [ 5],
        [ 8],
        [11]])]
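For reference, the NumPy side of the comparison can be checked without any plotting. This is a minimal sketch using only NumPy: np.hsplit cuts along axis 1 (producing column blocks that sit side by side), np.vsplit cuts along axis 0 (producing row blocks stacked vertically), and both are shorthand for np.split with an explicit axis.

```python
import numpy as np

X = np.arange(12).reshape(4, 3)

# np.hsplit cuts along axis 1: the pieces are column blocks, side by side.
left, right = np.hsplit(X, [2])
print(left.shape, right.shape)

# np.vsplit cuts along axis 0: the pieces are row blocks, stacked vertically.
top, bottom = np.vsplit(X, [2])
print(top.shape, bottom.shape)

# Both are equivalent to np.split with the axis spelled out.
h_matches = all(
    np.array_equal(a, b)
    for a, b in zip(np.hsplit(X, [2]), np.split(X, [2], axis=1))
)
v_matches = all(
    np.array_equal(a, b)
    for a, b in zip(np.vsplit(X, [2]), np.split(X, [2], axis=0))
)
print(h_matches, v_matches)
```

In NumPy the "h" and "v" describe how the resulting pieces are arranged, not the orientation of the cut line, which is likely where the two naming conventions diverge.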
Suggestion
Personally, I think the numpy versions make more sense. However, I also mess this up frequently. Since Marsilea only needs to deal with two-dimensional grids, I would suggest moving to:
- group_rows/group_columns, since this is quite like a group-by operation without the aggregate
- split_rows and split_columns (a bit like ComplexHeatmap)
Yes, it's different. At first I wanted the API names to be visually intuitive, but it looks like hsplit/vsplit may confuse others.
Thanks for the suggestions. I think it's a good idea to divide the current split into group_* and split_*.
But Marsilea can split non-matrix plots like bar plots or violin plots, so the rows and columns endings may be confusing in these cases.
What would the difference between group_* and split_* be to you? My preference at the moment would be if there was only a group, especially since you can pass essentially the same argument here as you would pass to DataFrame.groupby.
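To illustrate the DataFrame.groupby comparison, here is a small sketch using pandas only (no Marsilea calls). The column names and the row_labels values are made up for illustration; the point is that a one-label-per-row array is the same kind of argument in both settings.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=["a", "b", "c"])

# One label per row (illustrative values), aligned with the rows of df.
row_labels = np.array(["x", "y", "x", "y"])

# groupby accepts an array of keys aligned with the index and yields
# one sub-frame per distinct label, without aggregating anything.
groups = {key: sub.to_numpy() for key, sub in df.groupby(row_labels)}

for key, block in groups.items():
    print(key, block.tolist())
```

Each group here is just the subset of rows sharing a label, which matches the "group by without the aggregate" description above.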
To clarify, previously I was suggesting either split or group. I'm not sure I like "both" since they basically do the same thing, and it's nicer if there's only one way to do it.
> But Marsilea can split non-matrix plots like barplot or violin plot, so the endings in rows and columns may be confusing in these cases.
I think rows and columns still make sense for those plots. I believe you still end up with rows and columns, it's just that each entry can be a violin. Admittedly, I don't think grouping by columns makes sense in a plot like this bar chart: https://marsilea.readthedocs.io/en/stable/auto_examples/plot_oil_well.html#sphx-glr-auto-examples-plot-oil-well-py
The signature for hsplit/vsplit is hsplit(cut=None, labels=None, order=None, spacing=0.01). Users can either specify cut to cut the plot using the index of the data or specify labels to group the plot. split_* can be used to handle the cut parameter and group_* can be used for the labels parameter.
rows and columns are indeed clearer than horizontal or vertical.
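The proposed division of the cut and labels parameters could look roughly like the sketch below. It operates on plain NumPy arrays rather than on a Marsilea plot, and split_rows and group_rows here are hypothetical stand-ins written for illustration, not Marsilea's actual implementation. The first-appearance default ordering is also an assumption.

```python
import numpy as np


def split_rows(data, cut):
    """Hypothetical helper: cut rows at positional indices (the `cut` case)."""
    return np.split(data, cut, axis=0)


def group_rows(data, labels, order=None):
    """Hypothetical helper: gather rows by label (the `labels` case)."""
    labels = np.asarray(labels)
    if labels.shape[0] != data.shape[0]:
        raise ValueError("need exactly one label per row")
    if order is None:
        # Assumption: default to the order in which labels first appear.
        order = list(dict.fromkeys(labels.tolist()))
    return {name: data[labels == name] for name in order}


X = np.arange(12).reshape(4, 3)

# Index-based: rows 0-1 in the first chunk, rows 2-3 in the second.
by_index = split_rows(X, [2])

# Label-based: rows need not be contiguous, and `order` sets chunk order.
by_label = group_rows(X, ["a", "b", "a", "b"], order=["b", "a"])

print([chunk.tolist() for chunk in by_index])
print({name: chunk.tolist() for name, chunk in by_label.items()})
```

The practical difference is that a cut preserves the existing row order and only inserts boundaries, while labels may reorder rows to bring members of the same group together.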