Avoid loading files from URLs and s3/gcsfs
dcherian opened this issue · 2 comments
dcherian commented
Instead, add these datasets to the xarray-data repository (https://github.com/pydata/xarray-data/) and load them with xr.tutorial.open_dataset. That way we don't have to worry about broken links. A sketch of what the swap could look like for the ERSST case follows the list below.
- NOAA ERSST in computation notebooks: http://psl.noaa.gov/thredds/dodsC/Datasets/noaa.ersst.v5/sst.mnmean.nc
- Mask datasets in Computation: http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NODC/.WOA09/.Masks/.basin/dods
- temperature gradient dataset in the plotting notebooks:
import numpy as np
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature.nc").rename({"air": "Tair"})

# add gradient fields with appropriate attributes;
# 110e3 and 105e3 are rough metres-per-degree conversions, and cos(lat)
# accounts for meridians converging toward the poles
ds["dTdx"] = ds.Tair.differentiate("lon") / 110e3 / np.cos(ds.lat * np.pi / 180)
ds["dTdy"] = ds.Tair.differentiate("lat") / 105e3
ds.dTdx.attrs = {"long_name": "$∂T/∂x$", "units": "°C/m"}
ds.dTdy.attrs = {"long_name": "$∂T/∂y$", "units": "°C/m"}
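For the ERSST case, a minimal sketch of the swap, assuming the file gets added to xarray-data under a name like "ersstv5" (that name is just a placeholder):

import xarray as xr

# before: OPeNDAP load straight from the PSL THREDDS server
# ds = xr.open_dataset("http://psl.noaa.gov/thredds/dodsC/Datasets/noaa.ersst.v5/sst.mnmean.nc")

# after: fetch the same file from the xarray-data repo via the tutorial machinery
# ("ersstv5" is a placeholder name; the file would need to be added to the repo first)
ds = xr.tutorial.open_dataset("ersstv5")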
scottyhq commented
@dcherian thoughts on possibly adding a zarr dataset to xarray-data? I played around with this a while back here: https://github.com/scottyhq/zarrdata. That could still illustrate using fsspec and should be pretty reliable...
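A rough sketch of what that could look like, assuming the store is published somewhere fsspec can reach (the bucket path below is a placeholder, not the actual layout of scottyhq/zarrdata):

import fsspec
import xarray as xr

# open a hypothetical zarr store over fsspec; "s3://some-bucket/air_temperature.zarr"
# is a placeholder path, and anon=True assumes public, unauthenticated access
store = fsspec.get_mapper("s3://some-bucket/air_temperature.zarr", anon=True)
ds = xr.open_zarr(store, consolidated=True)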
dcherian commented
Zarr + fsspec should be a great demo, but I think that would be a more intermediate-level tutorial.
For the "fundamentals" notebooks I think it'd be nice to just use tutorial.load_dataset to minimize confusion. It also uses pooch for caching, so that's nice.
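A minimal sketch of that pattern for the fundamentals notebooks:

import xarray as xr

# downloads once from the xarray-data repo and reuses the pooch cache afterwards;
# load_dataset reads the data into memory and closes the underlying file
ds = xr.tutorial.load_dataset("air_temperature")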