Notes - how to change to class format
sfoks opened this issue · 0 comments
Concepts to introduce before the notebooks:
- rechunking, the compute-near-data concept, pangeo concepts, holoviews/holoviz?, other viz tools (hydroinspector); check with Rich on slides and content.
01_data_prep.ipynb (de-emphasize this one to not be confusing).
- @gzt5142: requester-pays issues, maybe when reading from the nhgf bucket? @sfoks was having problems with this.
- @sfoks : not sure why there are roughly 350 gage IDs in the modeled dataset with letters embedded in the string of digits after the 'USGS-'... (These will be rejected by the API when we try to call NWIS data service to obtain streamflow history for that location).
- @sfoks to determine whether students can write to pangeo.chs.usgs.gov, or whether they should have access to the Open Storage Network to write to. @sfoks to ask @amsnyder @rsignell-usgs
- assign HPC inserts, explain briefly "how things would change if on HPC" (can be in comments).
- @sfoks adding nhm-prms streamflow to intake and access on cloud, double check with @amsnyder @rsignell-usgs
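On the gage IDs with embedded letters noted above, a minimal sketch of how the bad IDs could be screened out before any NWIS call. The pattern (the 'USGS-' prefix followed by 8 to 15 digits) is an assumption about what the service accepts, and the function name and example IDs are hypothetical, not taken from the notebook.

```python
import re

# Assumption: a usable ID is 'USGS-' followed only by digits (8 to 15 of them).
GAGE_ID_PATTERN = re.compile(r"USGS-\d{8,15}")


def split_gage_ids(gage_ids):
    """Return (valid, rejected) lists, preserving the input order."""
    valid, rejected = [], []
    for gage_id in gage_ids:
        if GAGE_ID_PATTERN.fullmatch(gage_id):
            valid.append(gage_id)
        else:
            rejected.append(gage_id)
    return valid, rejected


# Hypothetical IDs for illustration only.
example_ids = ["USGS-01011000", "USGS-0101A000", "USGS-402114105350101"]
valid_ids, rejected_ids = split_gage_ids(example_ids)
print(f"{len(valid_ids)} usable, {len(rejected_ids)} rejected: {rejected_ids}")
```

Keeping the rejected list around (rather than silently dropping it) would make it easier to report the affected gages back to whoever maintains the modeled dataset.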
Lab options:
- Change first 100 gages to 200 gages
- Only look at gages west of 100th meridian
- Subset the gages for their state or favorite huc -- or deeper homework (?)
- Make sure to change number of gages back to 100 for fast export of file.
- In step 6, examine obs data, plot over CONUS, what are the number of gages, xarray dataset show-off.
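For the lab options above, a rough sketch of the "west of the 100th meridian" subset and the cap on the number of gages. It uses plain dictionaries so it runs anywhere; the field names `site_no` and `lon`, the helper names, and the example records are all hypothetical, and the notebook itself would more likely apply the same condition as a pandas or xarray mask.

```python
def west_of(gages, meridian_deg=-100.0):
    """Keep gages whose longitude is strictly west of the given meridian.

    Longitudes are assumed to be in decimal degrees, negative in the
    western hemisphere, so 'west of 100W' means lon < -100.
    """
    return [g for g in gages if g["lon"] < meridian_deg]


def first_n(gages, n=100):
    """Cap the list at n gages, e.g. 100 to keep the file export fast."""
    return gages[:n]


# Hypothetical records for illustration only.
gages = [
    {"site_no": "USGS-00000001", "lon": -105.3},
    {"site_no": "USGS-00000002", "lon": -98.7},
    {"site_no": "USGS-00000003", "lon": -100.0},
    {"site_no": "USGS-00000004", "lon": -121.5},
]

western = west_of(gages)
export_set = first_n(western, n=100)
print([g["site_no"] for g in export_set])
```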
Add exploratory notebook
- option: use the same dataset that you pull in data_prep 1
- option: use the pre-organized data from intake, talk about intake, how to change intake datasets.
- data structure, chunk size, time series data vs spatial analysis (Rich has a good slide on this; take some language from https://github.com/hytest-org/hytest/blob/main/dataset_preprocessing/ReChunkingData.ipynb , Rich's slide from the latest rechunk overview, and that 2013 presentation).
- map; point & explore.
- explore hydrographs, compare obs vs mod raw data 1:1 plots, stations, plot station locations
- xarray show-off
- baby overview of dask (just how to read data, point to dask tutorial if people want more)
- resample by time show-off?
Lab options:
- create map with hvplot of gage locations, hover to a gage, then plot hydrograph for that gage?
- change gage location, change time period of analysis in hydrograph?
- change modeling application?
analysis_stdsuite.ipynb
- note during lecture: reacquaint everyone to what the data are and where they live.
- can we use the %%R magic in Jupyter?
# let's activate R magic
%load_ext rpy2.ipython
- note during lecture: make a point in teaching of closing down clusters, only starting when needed.
Lab options:
- add two functions: AvgKGE, MedKGE functions - could be a canned example/homework assignment at end of notebook; hidden cells.
- add a statistic to the standard suite, like some GOF function, rerun on all sites.
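For the AvgKGE / MedKGE lab idea above, one possible canned solution. The reading of AvgKGE and MedKGE as the mean and median of per-site KGE values is an assumption, as are the function names; `kge` follows the standard Kling-Gupta efficiency (correlation, variability ratio, bias ratio) and is not copied from the standard-suite notebook.

```python
import math
import statistics


def kge(obs, sim):
    """Kling-Gupta efficiency for paired observed and simulated values."""
    mean_obs = statistics.fmean(obs)
    mean_sim = statistics.fmean(sim)
    std_obs = statistics.pstdev(obs)
    std_sim = statistics.pstdev(sim)
    # Population covariance, consistent with pstdev above.
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def avg_kge(kge_values):
    """Assumed meaning of AvgKGE: mean KGE across sites."""
    return statistics.fmean(kge_values)


def med_kge(kge_values):
    """Assumed meaning of MedKGE: median KGE across sites."""
    return statistics.median(kge_values)


# Hypothetical flows for two sites.
obs = [1.0, 2.0, 3.0, 4.0]
site_scores = [kge(obs, obs), kge(obs, [2 * o for o in obs])]
print(avg_kge(site_scores), med_kge(site_scores))
```

Constant series (zero standard deviation) or a zero observed mean would divide by zero here; a version run on all sites would need to decide how to handle those gages.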
viz notebook
- note during lecture: emphasize the connection between the csv we just exported from the analysis notebooks and the input of the viz notebook.
- notebook refilters to cobalt gages (keep in case folks need this again, also possible option to further filter gages).
- potentially add scorecard from usage example to viz notebook?
Canned example/demo:
- (optional) select all gages with percent bias greater than 50% or less than -50% (visualize on map)
- (optional) select west of 100th meridian and visualize performance?
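The percent-bias selection in the canned example could look roughly like this. It assumes the exported csv has already been read into rows carrying a precomputed percent-bias value in percent; the `site_no` and `pbias` field names, the function name, and the example values are hypothetical.

```python
def extreme_bias_sites(rows, threshold=50.0):
    """Keep rows whose percent bias is above +threshold or below -threshold."""
    return [row for row in rows if abs(row["pbias"]) > threshold]


# Hypothetical rows standing in for the analysis csv.
rows = [
    {"site_no": "USGS-00000001", "pbias": 62.0},
    {"site_no": "USGS-00000002", "pbias": -12.5},
    {"site_no": "USGS-00000003", "pbias": -75.1},
    {"site_no": "USGS-00000004", "pbias": 50.0},
]

flagged = extreme_bias_sites(rows)
print([row["site_no"] for row in flagged])
```

The comparison is strict, so a gage sitting exactly at 50% is not flagged; the resulting subset is what would be passed on to the map.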
Lab options:
- change the colorbar in the map? (low priority); maybe just add a link to holoviz or panel customizations?
- [potential] deeper homework/lab we could add: attach gagesII attributes to sites from sciencebase, group gages by a new grouping (join gage to new dataset, do boxplots or something easy).
analysis_dscore.ipynb
- potentially add dscore scorecard (from usage notebook Gene put together).