Avoid loading files from URLs and s3/gcsfs
dcherian opened this issue · 2 comments
dcherian commented
Instead, add these datasets to the xarray-data repository (https://github.com/pydata/xarray-data/) and load them with xr.tutorial.open_dataset. That way we don't have to worry about broken links. A sketch of what the swap could look like for the ERSST case follows the list below.
- NOAA ERSST in computation notebooks: http://psl.noaa.gov/thredds/dodsC/Datasets/noaa.ersst.v5/sst.mnmean.nc
- Mask datasets in Computation: http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NODC/.WOA09/.Masks/.basin/dods
- temperature gradient dataset in the plotting notebooks:
import numpy as np
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature.nc").rename({"air": "Tair"})

# add gradient fields with appropriate attributes;
# 110e3 and 105e3 are rough metres-per-degree conversions, and cos(lat)
# accounts for meridians converging toward the poles
ds["dTdx"] = ds.Tair.differentiate("lon") / 110e3 / np.cos(ds.lat * np.pi / 180)
ds["dTdy"] = ds.Tair.differentiate("lat") / 105e3
ds.dTdx.attrs = {"long_name": "$∂T/∂x$", "units": "°C/m"}
ds.dTdy.attrs = {"long_name": "$∂T/∂y$", "units": "°C/m"}
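For the ERSST case, a minimal sketch of the swap, assuming the file gets added to xarray-data under a name like "ersstv5" (that name is just a placeholder):

import xarray as xr

# before: OPeNDAP load straight from the PSL THREDDS server
# ds = xr.open_dataset("http://psl.noaa.gov/thredds/dodsC/Datasets/noaa.ersst.v5/sst.mnmean.nc")

# after: fetch the same file from the xarray-data repo via the tutorial machinery
# ("ersstv5" is a placeholder name; the file would need to be added to the repo first)
ds = xr.tutorial.open_dataset("ersstv5")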
scottyhq commented
@dcherian thoughts on possibly adding a zarr dataset to xarray-data? I played around with this a while back here: https://github.com/scottyhq/zarrdata. That could still illustrate using fsspec and should be pretty reliable...
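A rough sketch of what that could look like, assuming the store is published somewhere fsspec can reach (the bucket path below is a placeholder, not the actual layout of scottyhq/zarrdata):

import fsspec
import xarray as xr

# open a hypothetical zarr store over fsspec; "s3://some-bucket/air_temperature.zarr"
# is a placeholder path, and anon=True assumes public, unauthenticated access
store = fsspec.get_mapper("s3://some-bucket/air_temperature.zarr", anon=True)
ds = xr.open_zarr(store, consolidated=True)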
dcherian commented
Zarr + fsspec should be a great demo, but I think that would be a more intermediate-level tutorial.
For the "fundamentals" notebooks I think it'd be nice to just use tutorial.load_dataset to minimize confusion. It also uses pooch for caching, so that's nice.
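A minimal sketch of that pattern for the fundamentals notebooks:

import xarray as xr

# downloads once from the xarray-data repo and reuses the pooch cache afterwards;
# load_dataset reads the data into memory and closes the underlying file
ds = xr.tutorial.load_dataset("air_temperature")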