NREL/sup3r

Initial sup3r data pipeline

grantbuster opened this issue · 8 comments

Rough steps:

  1. Check out the WTK data on Eagle:
    a. /datasets/WIND/conus/v1.0.0/wtk_conus_2007.h5 (hourly data)
    b. /datasets/WIND/conus/v1.0.0/2007/wtk_conus_2007_2m.h5 (5min data)
  2. Start with rex to get the indices associated with a spatial raster:
    a. https://github.com/NREL/rex/blob/4edb1cd42ef13f3d93029e8a6c280bd50ae801e0/rex/resource_extraction/resource_extraction.py#L1254
  3. Check out the super training API:
    a. def train(self, x, y, n_batch=None, batch_size=128, n_epoch=100,
  4. Make a data pipeline/handler that delivers the high res data along with coarsened low res data to super (average spatially, subsample temporally)
  5. Start with ~40k hourly timesteps with a 1000km x 1000km raster. WTK resolution is about 2km, so the fine res dataset would be a 500x500 raster over 40k hours (multiple years or multiple spatial locations). With ~50x spatial enhancement, the coarse data will be averaged down to a 10x10 raster.
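
A rough sketch of the coarsening in step 4, assuming the high res data is held as a NumPy array shaped (time, rows, cols, channels). The function names `spatial_coarsen` and `temporal_subsample` and the toy array sizes are made up for illustration and are not part of sup3r or rex:

```python
import numpy as np


def spatial_coarsen(high_res, s_enhance):
    """Average non-overlapping s_enhance x s_enhance blocks.

    high_res is assumed to be (time, rows, cols, channels), with rows and
    cols divisible by s_enhance.
    """
    n_t, n_r, n_c, n_f = high_res.shape
    if n_r % s_enhance or n_c % s_enhance:
        raise ValueError("raster shape must be divisible by s_enhance")
    # Split each spatial axis into (coarse index, index within block).
    blocks = high_res.reshape(
        n_t, n_r // s_enhance, s_enhance, n_c // s_enhance, s_enhance, n_f)
    # Average over the two within-block axes, keeping float32.
    return blocks.mean(axis=(2, 4), dtype=np.float32)


def temporal_subsample(data, t_enhance):
    """Keep every t_enhance-th timestep (sampling, not averaging)."""
    return data[::t_enhance]


# Toy stand-in for the real raster: 24 timesteps, 100x150 pixels, 2 channels.
rng = np.random.default_rng(0)
high_res = rng.random((24, 100, 150, 2), dtype=np.float32)

low_res = temporal_subsample(spatial_coarsen(high_res, s_enhance=50),
                             t_enhance=6)
print(low_res.shape)  # (4, 2, 3, 2)
```

With the numbers in step 5, the same call with `s_enhance=50` would take a 500x500 raster down to 10x10.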
    

Considerations:

  1. Windspeed and direction must be converted to u and v
  2. Direction is cardinal (north) but WRF u and v are grid-orthogonal
    a. https://github.com/NREL/wtk/blob/13991a4dc57a2d06eac932be2615627893ad5769/wtk/hrrr.py#L246
  3. We should start with just u and v as “channels” but definitely code with the anticipation of new channels (more hub heights, topography, temp, pressure)
  4. Big memory constraints: consider generating training batches with a Python generator rather than duplicating all of the data
    a. https://github.com/NREL/phygnn/blob/2028e6cae5c5bf1610e858cf656d90d68496cba1/phygnn/base.py#L367
  5. Consider that we’re going to make “base” models using the WTK h5 source datasets, but will be transfer learning on native 2D WRF NetCDF files
  6. Use float32
  7. Do spatial only for now, but keep the temporal dimension in mind
  8. Multi year data handler:
    a. https://github.com/NREL/rex/blob/4edb1cd42ef13f3d93029e8a6c280bd50ae801e0/rex/multi_year_resource.py#L286
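
Considerations 1, 3, 4 and 6 could look roughly like the sketch below. It assumes the usual meteorological convention (direction is where the wind comes from, in degrees clockwise from north), which gives earth-relative u and v; rotating onto the WRF grid (consideration 2) needs the grid rotation angle and is not shown. `to_uv` and `batch_generator` are hypothetical names, not existing sup3r or phygnn functions:

```python
import numpy as np


def to_uv(speed, direction_deg):
    """Convert speed/direction to earth-relative u/v stacked as channels.

    Assumes direction is the meteorological "from" direction in degrees
    clockwise from north. The output has a trailing channel axis so more
    channels can be appended later.
    """
    theta = np.deg2rad(direction_deg)
    u = -speed * np.sin(theta)
    v = -speed * np.cos(theta)
    return np.stack([u, v], axis=-1).astype(np.float32)


def batch_generator(data, batch_size, rng):
    """Yield shuffled batches along the first axis, one batch at a time.

    Only the current batch is copied out of `data`, so the full dataset is
    never duplicated in memory.
    """
    n_obs = data.shape[0]
    order = rng.permutation(n_obs)
    for start in range(0, n_obs, batch_size):
        idx = np.sort(order[start:start + batch_size])
        yield data[idx]


# Toy data: 10 timesteps on a 4x4 raster, 5 m/s wind coming from the west.
speed = np.full((10, 4, 4), 5.0)
direction = np.full((10, 4, 4), 270.0)
features = to_uv(speed, direction)  # u is about +5, v is about 0

rng = np.random.default_rng(0)
batches = list(batch_generator(features, batch_size=4, rng=rng))
print(features.shape, [b.shape[0] for b in batches])
```

Coarsening would then be applied per batch inside the generator, so the low res copy only ever exists for the batch being trained on.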

Charge code:
WFED 11556 03.01.03

Timeline goal:
Have a GAN trained using the new sup3r infrastructure and data pipeline by Feb 7th

bnb32 commented

What units does the wind toolkit give the wind direction in? wtk seems to output it in degrees but experiments with data give me numbers above 20k.

scaled precision perhaps? Use the rex resource handlers.

bnb32 commented

Ah yeah that info is in ResourceX.attrs. Missed that.

If you use the rex resource handlers they should auto-scale/unscale the data.
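
For anyone reading the raw h5 directly, a minimal sketch of what the unscaling amounts to, assuming the stored integers are the physical values multiplied by the dataset's scale factor attribute. The stored values and the scale factor of 100 below are made-up illustration numbers, not read from the WTK files, and `unscale` is a hypothetical helper:

```python
import numpy as np


def unscale(stored, scale_factor):
    """Recover physical values from scaled-integer storage (assumed scheme)."""
    return stored.astype(np.float32) / np.float32(scale_factor)


# Hypothetical raw wind direction values as they might sit on disk.
stored = np.array([27000, 9000, 35999], dtype=np.uint16)
direction_deg = unscale(stored, scale_factor=100)
print(direction_deg)  # approximately [270.0, 90.0, 359.99]
```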

bnb32 commented

ah ok, instead of going through ResourceX use ResourceX._res and open_dataset. Lots to navigate here.

bnb32 commented

lol. And I'm over here trying to reinvent the wheel.

Yeah if you think we've done it before we probably have, just ask where to find the tools :)