spark-root/laurelin

TTreeColumnVector should hold one ArrayBuilder

jpivarski opened this issue · 3 comments

and just call .build(...) in the getFloats(...) etc. methods.
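To make the proposal concrete, here is a minimal sketch of a column vector that owns one builder and delegates to it. `SketchArrayBuilder`, `SketchColumnVector`, `InMemoryBuilder`, and `buildFloats` are hypothetical stand-ins, not laurelin's actual classes or signatures. Passing the builder in through the constructor is also an assumption. Only the `getFloats(int rowId, int count)` shape is borrowed from Spark's `ColumnVector`.

```java
import java.util.Arrays;

// Hypothetical stand-in for laurelin's ArrayBuilder; the real class has a different API.
interface SketchArrayBuilder {
    // Materialize entries [start, stop) as floats.
    float[] buildFloats(int start, int stop);
}

// Hypothetical stand-in for TTreeColumnVector: holds exactly one builder and delegates to it.
final class SketchColumnVector {
    private final SketchArrayBuilder builder;

    SketchColumnVector(SketchArrayBuilder builder) {
        this.builder = builder;
    }

    // Same shape as Spark's ColumnVector.getFloats(int rowId, int count).
    float[] getFloats(int rowId, int count) {
        return builder.buildFloats(rowId, rowId + count);
    }
}

// Toy builder backed by an in-memory array rather than ROOT baskets (assumption for the demo).
final class InMemoryBuilder implements SketchArrayBuilder {
    private final float[] data;

    InMemoryBuilder(float[] data) {
        this.data = data;
    }

    @Override
    public float[] buildFloats(int start, int stop) {
        return Arrays.copyOfRange(data, start, stop);
    }
}
```

With this shape, `new SketchColumnVector(new InMemoryBuilder(values)).getFloats(1, 2)` returns a copy of the two entries starting at index 1, and the vector itself carries no decoding logic.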

Hmm -- what about multiple leaves in a branch? Each leaf will be exposed as a separate TTreeColumnVector, so do we perhaps need some way to pass the builder in through the constructor so that it's shared between them?

A simple way to do it would be to say that a branch with multiple leaves has Spark's Struct type. One branch == one Spark column, and if the branch has multiple leaves, then the column is a struct.
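As a rough illustration of that rule, the sketch below maps one branch to one column type: a single leaf keeps its primitive type, and several leaves become one struct. `BranchSchemaSketch` and `columnType` are hypothetical names, not part of laurelin, and the `struct<name:type,...>` string is only modeled on the way Spark prints struct types. A real implementation would build a `StructType` instead of a string.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of laurelin: one branch maps to exactly one column type.
final class BranchSchemaSketch {
    // leaves maps leaf name -> primitive type name (e.g. "float"), in declaration order.
    static String columnType(Map<String, String> leaves) {
        if (leaves.isEmpty()) {
            throw new IllegalArgumentException("branch has no leaves");
        }
        if (leaves.size() == 1) {
            // Single-leaf branch: the column is just the leaf's primitive type.
            return leaves.values().iterator().next();
        }
        // Multi-leaf branch: the column is a struct with one field per leaf.
        StringJoiner fields = new StringJoiner(",", "struct<", ">");
        for (Map.Entry<String, String> leaf : leaves.entrySet()) {
            fields.add(leaf.getKey() + ":" + leaf.getValue());
        }
        return fields.toString();
    }
}
```

Passing an ordered map such as a `LinkedHashMap` keeps the struct fields in the same order as the leaves in the branch.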

When the data source supports Prunable, you're promising that one column can be read independently of the others. Leaves of a multi-leaf branch cannot be read independently of each other, so saying one branch == one Spark column is truth in advertising.

I did this some time ago.