xarray-contrib/xarray-simlab

DataArray is converted to ndarray when passed as input variable

ethho opened this issue · 2 comments

ethho commented

Given the process definitions below:

import xarray as xr
import xsimlab as xs

@xs.process
class ConsumeDA:
    da = xs.variable(dims='x', intent='inout')

    def initialize(self):
        print(f"Is a DataArray: {isinstance(self.da, xr.DataArray)}")
        print(type(self.da))
        
@xs.process
class InitDA:
    da = xs.foreign(ConsumeDA, 'da', intent='out')
    
    def initialize(self):
        self.da = xr.DataArray(data=range(5), dims='x')

We see the expected behavior when an xr.DataArray is instantiated within an upstream process:

model0 = xs.Model(dict(
    init=InitDA,
    use=ConsumeDA
))

xs.create_setup(
    model=model0,
    clocks=dict(step=range(2)),
    input_vars=dict(),
    output_vars=dict()
).xsimlab.run(model=model0)
# Is a DataArray: True
# <class 'xarray.core.dataarray.DataArray'>

However, when the same variable is passed as an input, it is converted to a numpy.ndarray:

model1 = xs.Model(dict(
    use=ConsumeDA
))

xs.create_setup(
    model=model1,
    clocks=dict(step=range(2)),
    input_vars=dict(
        use__da=xr.DataArray(data=range(5), dims='x')
    ),
    output_vars=dict()
).xsimlab.run(model=model1)
# Is a DataArray: False
# <class 'numpy.ndarray'>

Is this expected behavior? If so, what is best practice for passing DataArrays as model inputs?

Environment

  • Python 3.8.3
  • numpy==1.19.2
  • scipy==1.5.3
  • xarray==0.16.1
  • xarray-simlab==0.4.1
  • zarr==2.5.0
  • attrs==19.3.0

benbovy commented

Yes, it's expected behavior. Currently, xarray objects are used only for the model I/O interface.

I agree that using xarray DataArray or Variable objects inside process classes would be useful, though. I suggested that in #141.

ethho commented

Thanks for confirming, @benbovy.

For posterity: I worked around this by writing upstream "converter" processes that ingest np.ndarray or numbers.Number and output DataArrays.