intake/intake-xarray

intake_xarray and xarray.open_mfdataset have different default arguments

ghislainp opened this issue · 7 comments

I spent a lot of time figuring out why my code with open_mfdataset was working while the intake version was not.
The problem is that intake_xarray (netcdf driver) sets some default arguments that differ from the defaults of open_mfdataset:

  • concat_dim='concat_dim'
  • combine='nested' in the master branch (not in the latest release)

Since in my case the defaults of open_mfdataset work perfectly (I need no option), I had to 'revert' to these defaults in the catalog:

args:
  concat_dim: _not_supplied
  xarray_kwargs:
    combine: by_coords

The problem is that intake_xarray does not document this difference in behavior.

  • the expected behavior would be that intake_xarray works like xarray.open_mfdataset

Another problem is that "_not_supplied" is internal to xarray, and I'm not sure it is a value we are supposed to rely on. It is not documented in the xarray.open_mfdataset documentation as a valid value for concat_dim; it only appears as the default in the function signature.

My suggestion is to remove any defaults in intake_xarray... but I don't know the history and the side effects. I'm new to intake.
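To make the suggestion concrete, here is a minimal sketch of what "removing the defaults" could look like: the wrapper forwards only the arguments the caller actually passed, so the underlying opener's own defaults apply otherwise. Everything here is hypothetical: `fake_open_mfdataset` is a stand-in that just records its arguments (its default values are placeholders, not xarray's real defaults), and `open_source` and `_UNSET` are illustrative names, not intake_xarray code.

```python
# Hypothetical stand-in for xarray.open_mfdataset so the sketch runs without
# xarray or data files. The default values are placeholders, NOT xarray's.
def fake_open_mfdataset(paths, combine="<opener default>",
                        concat_dim="<opener default>", **kwargs):
    return {"paths": paths, "combine": combine, "concat_dim": concat_dim, **kwargs}


# Private sentinel meaning "the caller did not pass this argument".
_UNSET = object()


def open_source(urlpath, concat_dim=_UNSET, combine=_UNSET,
                xarray_kwargs=None, opener=fake_open_mfdataset):
    """Forward only the arguments the caller explicitly supplied."""
    kwargs = dict(xarray_kwargs or {})
    if concat_dim is not _UNSET:
        kwargs["concat_dim"] = concat_dim
    if combine is not _UNSET:
        kwargs["combine"] = combine
    return opener(urlpath, **kwargs)


# No options given: the opener's own defaults are used untouched.
default_call = open_source("data/*.nc")

# Options given: they are passed through as-is.
explicit_call = open_source("data/*.nc", combine="nested", concat_dim="time")
```

With this shape, the catalog would not need to mention concat_dim or combine at all to get the opener's default behavior, and nobody would have to rely on an internal value such as "_not_supplied".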

I believe I agree with you; it would be good to use the same defaults. Would you like to implement this?

I could do it; it's mostly about removing code.
But I'm new to intake, and I don't know why these defaults were set. It's also going to break compatibility for many people(?)

I don't know or can't remember. @jsignell ?

I don't think there is any reason why we diverge from the defaults in open_mfdataset. If you are open to writing the PR @ghislainp then I think we can just make the change and see if anyone has issues.

@ghislainp , are you working on this?

closed by #69