The code is written in Python. The following packages must be installed before the scripts can be run:
- Package pandas is a fast, BSD-licensed library that provides high-performance data structures and data analysis tools.
- Package numpy is the fundamental package for scientific computing.
- Package pvlib provides functions for simulating the performance of PV systems.
- Package scipy provides algorithms for scientific computing.
- Package sklearn (installed under the name scikit-learn) is a general-purpose library for machine learning.
- Package hydrostats is a library of functions for time series analysis, in particular error metrics for comparing simulated and observed series.
Other Python packages used for plotting the results include seaborn, matplotlib, and plotnine.
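Before running the scripts, it can help to confirm that every package listed above is importable. The following is a minimal sketch, not part of the provided scripts; the helper name `missing_packages` is hypothetical, and the names checked are the import names given above.

```python
from importlib.util import find_spec

# Import names as listed above. Note that "sklearn" is the import name;
# the package itself is installed as "scikit-learn".
REQUIRED = ["pandas", "numpy", "pvlib", "scipy", "sklearn", "hydrostats"]
PLOTTING = ["seaborn", "matplotlib", "plotnine"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment.

    Hypothetical helper for illustration only.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED + PLOTTING)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

Any package reported as missing can then be installed with the package manager of choice before the scripts are executed.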
The following data files are provided:
- ENS_XXX.csv (seven files, where XXX denotes the three-letter station abbreviation): four years (2017--2020) of ECMWF ENS forecast data for the seven SURFRAD stations.
- Jacumba_ENS.csv: four years (2017--2020) of ECMWF ENS forecast data for the Jacumba solar plant.
- McClear_Jacumba.csv: clear-sky irradiance for the Jacumba solar plant.
- ECMWF_HRES.csv: ECMWF HRES forecast data.
- 60947.csv: power data for the Jacumba solar plant, whose Energy Information Administration (EIA) plant ID is 60947.
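A quick way to verify that all data files are in place is sketched below. It relies only on the file names given above; the helper name `check_data_files` is hypothetical, and the glob pattern assumes that the station files follow the `ENS_` prefix plus three-character abbreviation exactly as described.

```python
from pathlib import Path

# File names taken from the description above.
FIXED_FILES = ["Jacumba_ENS.csv", "McClear_Jacumba.csv", "ECMWF_HRES.csv", "60947.csv"]


def check_data_files(data_dir):
    """Return (station_files, missing_fixed_files) for a data directory.

    Hypothetical helper for illustration only.
    """
    data_dir = Path(data_dir)
    # "???" matches exactly three characters, i.e. the station abbreviation.
    station_files = sorted(p.name for p in data_dir.glob("ENS_???.csv"))
    # Named files that are not present in the directory.
    missing = [name for name in FIXED_FILES if not (data_dir / name).is_file()]
    return station_files, missing
```

With a complete data set, one would expect seven station files in the first return value and an empty list in the second.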
Three Python scripts, namely post-processing.py, GradientBoosting.py, and ModelChain.py, are provided for reproducibility. The file names are self-explanatory: post-processing.py performs the operational post-processing of NWP-based solar forecasts at the seven research-grade ground-based stations; GradientBoosting.py reproduces the irradiance-to-power conversion approach using gradient boosting; and ModelChain.py reproduces the irradiance-to-power conversion approach using a model chain. To use these scripts, the user only needs to change the working directory.
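One way to set the working directory without editing a script is sketched below. It assumes that the scripts locate the data files relative to the current working directory; if a script instead contains a hard-coded path, that path has to be edited directly. The helper name `run_script` and the paths in the usage comment are hypothetical.

```python
import os
import runpy


def run_script(script_path, data_dir):
    """Run a script with data_dir as the working directory, then restore it.

    Hypothetical helper for illustration only. Returns the globals of the
    executed script, as provided by runpy.run_path.
    """
    # Resolve the script path first, since a relative path would break after chdir.
    script_path = os.path.abspath(script_path)
    previous = os.getcwd()
    os.chdir(data_dir)
    try:
        return runpy.run_path(script_path, run_name="__main__")
    finally:
        # Always return to the original directory, even if the script fails.
        os.chdir(previous)


# Usage (placeholder paths, to be replaced by the user's own):
# run_script("post-processing.py", "/path/to/data")
```

Restoring the previous directory in the `finally` clause keeps repeated runs of different scripts from interfering with one another.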