Finding the optimal radius of influence makes assumptions about the ordering of the dimensions in the longitude array
adybbroe opened this issue · 4 comments
Code Sample, a minimal, complete, and verifiable piece of code
The problem is well illustrated by reading and trying to remap the AWS test data:
>>> from satpy import Scene
>>> FILENAMES = ["/home/a000680/data/aws_testdata_from_nigel/W_XX-OHB-Stockholm,SAT,AWS1-MWR-1B-RAD_C_OHB_20230816120142_G_D_20240115111111_20240115125434_T_B____radsim.nc"]
>>> AREAID = 'eurol'
>>> scn = Scene(filenames=FILENAMES, reader='aws_l1b_nc')
>>> scn.load(['1'])
>>> local = scn.resample(AREAID)
>>> local.show('1')
Could not calculate source definition resolution
See the comments in the Satpy PR here: pytroll/satpy#2565
In the method that determines the optimal radius of influence to use when remapping data (pyresample/pyresample/geometry.py, line 664 in 6a8afc0), I believe this is the part that should be adapted for the case where self.lons is an xarray.DataArray:
rows = self.shape[0]
start_row = rows // 2  # middle row
src = CRS('+proj=latlong +datum=WGS84')
if radius:
    dst = CRS("+proj=cart +a={} +b={}".format(radius, radius))
else:
    dst = CRS("+proj=cart +ellps={}".format(ellps))
# simply take the first two columns of the middle of the swath
lons = self.lons[start_row: start_row + 1, :2]
lats = self.lats[start_row: start_row + 1, :2]
I would propose something like this instead:
rows = self.lons['y'].shape[0]
start_row = rows // 2  # middle row
src = CRS('+proj=latlong +datum=WGS84')
if radius:
    dst = CRS("+proj=cart +a={} +b={}".format(radius, radius))
else:
    dst = CRS("+proj=cart +ellps={}".format(ellps))
# simply take the first two columns of the middle of the swath
lons = self.lons.sel(y=start_row)[:2]
lats = self.lats.sel(y=start_row)[:2]
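For illustration only, here is a minimal standalone sketch of the same idea: selecting by dimension name so that the result does not depend on whether the array is stored as ("x", "y") or ("y", "x"). The helper name `first_two_in_middle_row` and the sample array are hypothetical and not part of Pyresample. The sketch uses `isel`, which indexes by position, whereas `sel` looks up coordinate labels when a coordinate is defined.

```python
import numpy as np
import xarray as xr

# Hypothetical swath longitudes stored with dimensions ("x", "y"), as in this issue.
lons = xr.DataArray(np.linspace(0.0, 1.0, 12).reshape(3, 4), dims=("x", "y"))


def first_two_in_middle_row(arr):
    """Return the first two 'x' pixels of the middle 'y' row, whatever the dim order."""
    start_row = arr.sizes["y"] // 2  # middle row, looked up by dimension name
    # isel selects by integer position along the named dimensions
    return arr.isel(y=start_row, x=slice(0, 2))


picked = first_two_in_middle_row(lons)
# The same pixels come back if the array is transposed to ("y", "x") first.
same = first_two_in_middle_row(lons.transpose("y", "x"))
```

This only works when the geometry holds xarray objects with named dimensions, which is the limitation raised in the replies below.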
As mentioned in your pull request, we can't assume xarray is available in Pyresample. It is a standard dependency in Satpy, but not in Pyresample. If I'm understanding your PR correctly, your swath DataArrays have xarray dimensions ("x", "y")? Is that correct? In that case, I see two possible solutions:
- Update the geocentric resolution calculation to take the first two columns of the middle row, as it does now, and also the first two rows of the middle column, then use the largest resolution found.
- Update your reader in Satpy to be ("y", "x"). While technically it shouldn't matter, I'm a little scared that this is only one piece of the Satpy/Pyresample puzzle that you're finding an issue with, and that we make the assumption of ("y", "x") in many places.
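As a rough sketch of the first option, the following NumPy-only example measures the pixel spacing along both array axes near the middle of the swath and keeps the larger value. It is not the Pyresample implementation: the function names are hypothetical, and it assumes a spherical Earth of radius 6371 km instead of the pyproj transformation to geocentric coordinates used in the real method.

```python
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # assumption: spherical Earth, for illustration only


def lonlat_to_geocentric(lons, lats, radius=EARTH_RADIUS_M):
    """Convert lon/lat in degrees to geocentric Cartesian coordinates in metres."""
    lon_r = np.deg2rad(np.asarray(lons, dtype=float))
    lat_r = np.deg2rad(np.asarray(lats, dtype=float))
    x = radius * np.cos(lat_r) * np.cos(lon_r)
    y = radius * np.cos(lat_r) * np.sin(lon_r)
    z = radius * np.sin(lat_r)
    return np.stack([x, y, z], axis=-1)


def estimate_resolution(lons, lats, radius=EARTH_RADIUS_M):
    """Largest spacing between neighbouring pixels near the middle of a 2D swath."""
    lons = np.asarray(lons, dtype=float)
    lats = np.asarray(lats, dtype=float)
    if lons.ndim != 2 or min(lons.shape) < 2:
        raise ValueError("need a 2D swath with at least two pixels along each axis")
    mid0, mid1 = lons.shape[0] // 2, lons.shape[1] // 2
    # first two pixels along axis 1, taken from the middle of axis 0
    along_axis1 = lonlat_to_geocentric(lons[mid0, :2], lats[mid0, :2], radius)
    # first two pixels along axis 0, taken from the middle of axis 1
    along_axis0 = lonlat_to_geocentric(lons[:2, mid1], lats[:2, mid1], radius)
    d1 = np.linalg.norm(along_axis1[1] - along_axis1[0])
    d0 = np.linalg.norm(along_axis0[1] - along_axis0[0])
    return float(np.nanmax([d0, d1]))


# Synthetic swath: 0.1 degree steps in longitude along axis 0,
# 0.01 degree steps in latitude along axis 1.
lons = np.repeat((np.arange(5) * 0.1)[:, None], 4, axis=1)
lats = np.repeat((np.arange(4) * 0.01)[None, :], 5, axis=0)
res = estimate_resolution(lons, lats)
res_transposed = estimate_resolution(lons.T, lats.T)
```

Because both axes are sampled, transposing the inputs gives the same estimate, so the result no longer depends on which axis is treated as rows.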
Ok, thanks for the comments and suggestions. And, yes, the dimensions are ("x", "y"), which is non-standard, I know (see below).
Concerning option 2: Yes, but actually in this case it is the data format which is "wrong", or non-standard. So, for now I'd rather keep the reader as it is, and wait for the agency (ESA) to fix the format. They have acknowledged the issue at least.
But since the reader returns an xarray DataArray, I thought it would be appropriate to actually try to use that information, rather than make assumptions about the data layout.
Having said that, I also like your option 1. So, would you prefer:
- Keep the suggestion above when the data is an xarray DataArray, and then improve the code for NumPy cases as you propose in your option 1? Or
- Skip using the xarray capability and improve as in your option 1, which will work in all cases?
> Concerning option 2: Yes, but actually in this case it is the data format which is "wrong", or non-standard. So, for now I'd rather keep the reader as it is, and wait for the agency (ESA) to fix the format. They have acknowledged the issue at least.
I don't agree with this, but I'm not doing the coding so I can accept it. Everyone is admitting it is "wrong" (or at least unexpected) and the solution should be a simple .T on the array. Anyway...
> Having said that, I also like your option 1. So, would you prefer:
> - Keep the suggestion above when the data is an xarray DataArray, and then improve the code for NumPy cases as you propose in your option 1? Or
> - Skip using the xarray capability and improve as in your option 1, which will work in all cases?
I prefer 2. It (at least theoretically) should only improve the accuracy of this method which is at best a guess. I'd prefer to avoid Xarray workarounds as much as possible, especially when the alternative is "make this functionality work better", but I could also be missing some complexity in this task so I could be convinced otherwise.
I agree that option 2 is preferable, and that the reader should be adapted such that the order of dimensions is (y, x). This is the simplest solution that allows other parts of pytroll to stick to the assumption that dimensions have this order, without the need to complicate the code everywhere that assumption is currently made.
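For reference, reordering the dimensions in a reader could look roughly like the sketch below. It assumes the reader produces a 2D xarray DataArray with dimensions named "x" and "y"; the helper name `ensure_yx` and the sample data are hypothetical and are not taken from the aws_l1b_nc reader.

```python
import numpy as np
import xarray as xr


def ensure_yx(data_arr):
    """Return a 2D DataArray with dimensions ordered ("y", "x")."""
    if data_arr.dims == ("y", "x"):
        return data_arr  # already in the expected order
    # transpose reorders the axes by name; the values themselves are unchanged
    return data_arr.transpose("y", "x")


# Hypothetical data as delivered by the file format: 3 "x" positions by 4 "y" rows.
raw = xr.DataArray(np.arange(12).reshape(3, 4), dims=("x", "y"))
fixed = ensure_yx(raw)
```

Selecting by name gives the same values before and after the call; only the positional layout changes, which is what code assuming (y, x) depends on.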