fferflo/einx

[Feature Request] dask.array and xarray as backend.

gRox167 opened this issue · 3 comments

Required prerequisites

I have searched the Issue Tracker and confirmed that this hasn't already been reported. (Comment there if it has.)

Motivation

Thanks to all the contributors of this repo for the brilliant work!
dask.array is a chunked, parallel counterpart to NumPy arrays that lets researchers easily run parallel (and distributed) computations on CPUs.
xarray is a named-array package that is widely used in many scientific fields, and it can also use dask as a backend to support parallel computing.

If einx supported dask.array or xarray, it would be very convenient for large-scale distributed data processing, especially for datasets that do not fit into memory. This could be useful in areas such as physics, astronomy, geoscience, microscopy, and medical imaging.

Solution

For dask.array, this should be easy: we can use its lazy API, which is largely the same as the NumPy API.
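As a rough illustration of what "lazy, NumPy-like API" means here, the sketch below builds a chunked array, applies ordinary NumPy-style operations, and only evaluates when compute() is called. The array shape and chunk sizes are arbitrary choices for the example, not anything einx requires.

```python
import dask.array as da
import numpy as np

# Build a chunked array; no data is materialized yet.
x = da.ones((1000, 1000), chunks=(250, 250))

# Same call style as NumPy, but each operation only extends the task graph.
y = (x + 1).sum(axis=0)

# compute() executes the graph chunk by chunk and returns a NumPy array.
result = y.compute()
print(type(y).__name__, type(result).__name__, result.shape)
```

Because the calls mirror NumPy, a backend wrapper can in principle forward most operations unchanged and leave the decision of when to compute to the user.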
For xarray, however, it is a bit trickier, because xarray binds a name to each axis. We would need to be careful about how named dimensions relate to the axis names in an Einstein expression.

Alternatives

We can refer to xarray-einstats.

Hey, thanks for the suggestion!

It looks like most backend functions that einx requires are implemented for dask arrays, although as far as I can tell there is no vmap or other option to vectorize functions over arbitrary axes (which is required by einx.vmap and einx functions that rely on it such as einx.get_at). I'll look into adding support for it.

I'm unsure how einx would meaningfully interface with tensor backends that have named axes though (torchdim would be another example) since they follow different philosophies for how axes should be handled. For example:

# With named axes:
x.sum("time") # other axes are implicit

# With einx
einx.sum("b [time] c", x) # other axes are explicit and ordered

I added support for dask.array in 5fc0c09. Let me know if anything isn't working as expected :)

Thanks for all the work you have done! I will update and check it out!