apache/arrow-julia

Support tensors/sparse tensor messages

quinnj opened this issue · 10 comments

Are there any plans to support N-dimensional (ND) formats here, or some way of formally writing custom metadata that supports reading a vector of data into an array structure?

> writing custom metadata that supports reading a vector of data into an array structure?

LegolasFlux does this: https://github.com/beacon-biosignals/LegolasFlux.jl/blob/aaa94fdb050e0b5333725d7d968b58b080b67df2/src/LegolasFlux.jl#L16

I’m not really sure what the practical difference is between something like that and the tensor bit of the Arrow spec (https://github.com/apache/arrow/blob/master/format/Tensor.fbs).

Thanks for the example. Perhaps it's a more effective use of developer time to heavily document how to write custom metadata? I'm trying to find ways to sell some other communities on this format instead of making more new file formats (an abominable practice that plagues neuroscience).

I guess strides and better interop would be the main differences? I imagine native tensor support would be nice for RPC as well.
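To make the strides point concrete, here is a minimal, language-agnostic sketch in plain Python (not Arrow.jl code). It shows how the same flat buffer yields different elements depending on whether column-major (Julia-style) or row-major (C-style) strides are attached to it, which is why a formal strides field matters for interop. The helper names `offset`, `column_major_strides`, and `row_major_strides` are hypothetical, and strides here are counted in elements for simplicity; as far as I can tell, `Tensor.fbs` expresses strides in bytes.

```python
def offset(index, strides):
    """Position in the flat buffer of a multi-index, with strides in elements."""
    return sum(i * s for i, s in zip(index, strides))


def column_major_strides(shape):
    """Strides for column-major layout: the first dimension varies fastest."""
    strides, step = [], 1
    for dim in shape:
        strides.append(step)
        step *= dim
    return tuple(strides)


def row_major_strides(shape):
    """Strides for row-major layout: the last dimension varies fastest."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


shape = (2, 3)
flat = [10, 20, 30, 40, 50, 60]

col = column_major_strides(shape)
row = row_major_strides(shape)

# The same logical index (0, 1) lands on different buffer positions.
element_col = flat[offset((0, 1), col)]
element_row = flat[offset((0, 1), row)]
```

Without the strides, a reader cannot tell which of the two interpretations the writer intended, so a shape alone is not enough for a round trip between column-major and row-major ecosystems.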

I think formalizing size and strides would be a great thing to have in place.
I imagine it would be best to write most other things as small dictionaries so that they can be read universally, and then provide specific methods in a package for type-stable reading into domain-specific data structures.
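A rough sketch of what that could look like, in plain Python rather than Arrow.jl: size and strides are stored as JSON strings in a small string-to-string dictionary (Arrow custom metadata is string key/value pairs), next to the flat vector of values. The key names `tensor.shape` and `tensor.strides`, the record layout, and the functions `to_record` and `from_record` are all assumptions for illustration, not an existing convention or API.

```python
import json


def to_record(flat, shape, strides):
    """Bundle a flat vector with shape/strides metadata (hypothetical key names)."""
    metadata = {
        "tensor.shape": json.dumps(list(shape)),
        "tensor.strides": json.dumps(list(strides)),  # in elements, not bytes
    }
    return {"values": list(flat), "metadata": metadata}


def from_record(record):
    """Rebuild a nested list from the flat values using the stored metadata."""
    shape = json.loads(record["metadata"]["tensor.shape"])
    strides = json.loads(record["metadata"]["tensor.strides"])
    flat = record["values"]

    def build(dim, base):
        # Past the last dimension, `base` is the position of a single element.
        if dim == len(shape):
            return flat[base]
        return [build(dim + 1, base + i * strides[dim]) for i in range(shape[dim])]

    return build(0, 0)


# A 2x3 array stored column-major, as Julia would lay it out in memory.
record = to_record([1, 2, 3, 4, 5, 6], shape=(2, 3), strides=(1, 2))
array = from_record(record)
```

Because the metadata values are plain JSON strings, any Arrow implementation can read them without knowing about tensors; a domain-specific package would only need to supply its own equivalent of `from_record`.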

Has there been any update on this?

I would like to have a go at supporting tensors in Arrow.jl.

Any status on this?

I don't think anyone is working on it.

I volunteered to work on it. Thanks for the reminder, I will start working once I get some uni problems sorted (1-2 weeks).