PyPSA/linopy

Multiple coordinates for one dimension


Hi people, I am currently struggling with xarray and linopy.

I have a dimension A whose coordinates can be grouped. It seemed natural to keep A as the dimension, add a second coordinate B defined along A, and then swap dimensions when needed, but I could not get this to work.

Here is an example.
Dimension A has coordinates [1, 2, 3, 4, 5, 6]. Coordinates [1, 2, 3] belong to group 311 and the rest belong to group 322. I would like to create constraints based on those two groups.

I have a variable with this situation:

var.coords
Coordinates:
   * A (A) int64 48B 1 2 3 4 5 6
     B (A) int64 48B 311 311 311 322 322 322

According to the xarray docs, var.swap_dims({'A': 'B'}) looks like the way to go: after swapping I could sum over the other dimensions and set a constraint per group. But linopy's Variable doesn't have that method.
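For reference, this is what swap_dims does on a plain xarray DataArray. It is only a minimal sketch: the array `da` and its values are made up for illustration, and it says nothing about how a linopy Variable behaves.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in data: dimension A with an auxiliary coordinate B along A.
da = xr.DataArray(
    np.arange(1, 7),
    dims="A",
    coords={
        "A": [1, 2, 3, 4, 5, 6],
        "B": ("A", [311, 311, 311, 322, 322, 322]),
    },
)

# After the swap, B is the dimension and A becomes a coordinate along B.
swapped = da.swap_dims({"A": "B"})

# B is not unique, so selecting one label returns every member of that group.
group_311 = swapped.sel(B=311)
```

Here `swapped.dims` is `("B",)` and `group_311` holds the three entries that belong to group 311.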

Next I tried groupby on B. I got the correct groups, but calling sum on the groupby expression throws an error saying that B is already present...
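The behaviour I was hoping for is what plain xarray gives when grouping a DataArray by an auxiliary coordinate. A minimal sketch with made-up data (the array `da` is hypothetical, and this is xarray only, not linopy):

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in data: dimension A with an auxiliary coordinate B along A.
da = xr.DataArray(
    np.arange(1, 7),
    dims="A",
    coords={
        "A": [1, 2, 3, 4, 5, 6],
        "B": ("A", [311, 311, 311, 322, 322, 322]),
    },
)

# Group by the auxiliary coordinate and sum within each group.
# The result has a single dimension B with one entry per group.
group_totals = da.groupby("B").sum()
```

The result has one value per group along a new dimension B, which is the shape I would want for a per-group constraint.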

In the end I made B a full dimension (breaking normalization from database theory :D), but that looks like a suboptimal solution. I have this situation for several coordinates, so instead of a small case with 7 dimensions I will end up with at least 9.

Could I do something better?

Hey @aurelije, thanks for raising the issue. How about we add a swap_dims method to the Variable class? Would that help?

Hi @FabianHofmann, I have tried version 0.3.10, and the change works for creating expressions with swapped dimensions. But it fails in expressions that do not use those "auxiliary coordinates": in comparison expressions with other DataArrays, linopy tries to align on them too and then fails because they are missing on the left side.

Example:

model.add_constraints(var.sum(['A']) == params_da) 

Here I use dimension A, which has its own coordinates, but there is also an auxiliary coordinate B set on the same dimension. The DataArray params_da has dimension A but doesn't have coordinate B. In this example I do not care about coordinate B; I do not use it in this constraint. But I get the exception ValueError: coordinate 'B' not present in all datasets. Yes, I could probably solve this by adding the B coordinate to params_da, but that defeats the purpose of having elegant code.
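To make the two possible workarounds concrete, here is a rough sketch on plain xarray DataArrays. The names `lhs`, `params_with_b` and `lhs_without_b` are hypothetical, `lhs` is only a stand-in for the data behind the variable, and I have not checked that either option is enough to satisfy linopy.

```python
import numpy as np
import xarray as xr

a_values = [1, 2, 3, 4, 5, 6]
b_values = [311, 311, 311, 322, 322, 322]

# Stand-in for the left-hand side: carries the auxiliary coordinate B along A.
lhs = xr.DataArray(
    np.ones(6),
    dims="A",
    coords={"A": a_values, "B": ("A", b_values)},
)

# Parameter array: same dimension A, but no coordinate B.
params_da = xr.DataArray(np.arange(6.0), dims="A", coords={"A": a_values})

# Option 1: copy the auxiliary coordinate onto the parameter array.
params_with_b = params_da.assign_coords(B=("A", lhs["B"].values))

# Option 2: drop the auxiliary coordinate where it is not needed.
lhs_without_b = lhs.drop_vars("B")
```

Either way both sides end up with the same set of coordinates, but both options add a line of bookkeeping for every constraint that ignores B.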

The error originates in expressions.py, in the line: ds = xr.concat([d[["coeffs", "vars"]] for d in data], dim, **kwargs)

It looks to me that the biggest obstacle to using linopy, which has such a good idea behind it, is actually xarray being too complex and clumsy. Pandas is clumsy too, but it has a nice tutorial and good documentation with numerous examples of how to use its methods and functions... OK, maybe building linopy on top of xarray is a necessity, but I think the less xarray is exposed to the linopy user, the better. xarray should be hidden deep behind the API; from the outside we should need xarray only for supplying parameters with named dimensions...