scikit-hep/root_pandas

Provide an option to flatten arrays into columns

maxnoe opened this issue · 5 comments

In our use of ROOT files, we more often need an array to be flattened into columns rather than rows.

E.g. we often save coordinates as arrays in the ROOT files. For these it makes more sense to flatten into columns:
cog[3] → cog_0, cog_1, cog_2
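
To make the requested layout concrete, here is a minimal pure-Python sketch of the column-wise flattening described above. The helper name `flatten_into_columns` is hypothetical and not part of root_pandas; it assumes every row holds an array of the same fixed length.

```python
def flatten_into_columns(name, rows):
    """Turn fixed-length array rows into one column per element.

    Hypothetical helper for illustration only; not a root_pandas function.
    `rows` is a list of equal-length sequences, one per event.
    """
    if not rows:
        return {}
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all rows must have the same length")
    # Column i collects element i of every row: cog[3] -> cog_0, cog_1, cog_2
    return {f"{name}_{i}": [row[i] for row in rows] for i in range(length)}


# Two events, each with a 3-element coordinate array
cog_columns = flatten_into_columns("cog", [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

The resulting dict maps each new column name to a list with one value per event, so it could be passed directly to `pandas.DataFrame` to get one row per event.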

ibab commented

Can't this already be done by using the noexpand: prefix and accessing the elements separately, like this:

from root_pandas import read_root

# Request each array element as its own expression
columns = [
    'noexpand:cog[0]',
    'noexpand:cog[1]',
    'noexpand:cog[2]',
]
data = read_root('data.root', columns=columns)

This also seems more flexible, because you can choose which elements you'd like to keep.
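
If the list of elements gets long, the column expressions can be generated instead of typed out. This is only a sketch: `element_columns` is a hypothetical helper, not part of root_pandas, and it simply builds the same `noexpand:` strings as in the list above.

```python
def element_columns(branch, indices):
    """Build 'noexpand:' expressions for the chosen elements of an array branch.

    Hypothetical helper for illustration; it only formats strings.
    """
    return [f"noexpand:{branch}[{i}]" for i in indices]


# All three coordinates, equivalent to the hand-written list
columns = element_columns("cog", range(3))

# Only the first and last element
subset = element_columns("cog", [0, 2])
```

Either list can then be passed as the `columns` argument of `read_root`.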

For me this solution only works with flatten=True:
columns = [
    'noexpand:cog[0]',
    'noexpand:cog[1]',
    'noexpand:cog[2]',
]
data = read_root('data.root', columns=columns, flatten=True)

and I end up with a useless __array_index column that is always 0.

If I don't use flatten, I get the warning:
UserWarning: Ignored the following non-scalar branches: cog[0], cog[1]

@maxnoe, OK if we close this task at this point? It's been almost 2 years … maybe you are aware that this package has effectively been superseded by https://github.com/scikit-hep/uproot?

Yes I am ;)

Thanks a lot for the speedy reply, @maxnoe.