Update OGE to work with the next PUDL release / 2021 data
grgmiller opened this issue · 2 comments
grgmiller commented
The latest release of PUDL includes many updates and potentially breaking changes that we will need to address. The changes I've flagged so far include:
- update the environment to be compatible with pudl
- Update code/warnings to allow 2021 data
- The EPA-EIA crosswalk is now integrated into pudl. We should look into whether we still need to separately download this
- `prime_mover_code` has moved from `generators_entity_eia` to `generators_eia860` since it is no longer considered a static attribute
- Check for updates to the `operational_status_code` encoding
- Several attributes were moved from `plants_entity_eia` to `plants_eia860`, including `balancing_authority_code_eia`, `balancing_authority_name_eia`, grid voltage columns, and ISO codes
- `grid_voltage_kv` was renamed to `grid_voltage_1_kv`
- Check `balancing_authorities_eia` for changes to BA encoding, including changes to PACW and PACE
- Several columns in CEMS were renamed: `unitid` -> `emissions_unit_id_epa`. `facility_id` was dropped.
- Missing values in CEMS are no longer replaced with zeros, so we no longer need the method that re-creates these missing values.
- May no longer need to convert `plant_id_eia` in CEMS to `plant_id_epa` before converting to `plant_id_eia`
- There were changes made to `allocate_net_gen` before merging, so we should double check that everything still works.
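The renamed and dropped columns listed above could be collected in one place so that downstream code only needs a single update. This is a minimal sketch, not existing OGE or PUDL code: the helper `update_column_names` and the constant names are hypothetical, and only the renames and the dropped column named in this issue are included.

```python
# Renames and drops taken from the list in this issue.
# The constant names and the helper below are hypothetical.
CEMS_RENAMES = {"unitid": "emissions_unit_id_epa"}
CEMS_DROPPED = frozenset({"facility_id"})
PLANTS_RENAMES = {"grid_voltage_kv": "grid_voltage_1_kv"}


def update_column_names(columns, renames, dropped=frozenset()):
    """Return column names as they would appear after the PUDL update.

    Columns in `dropped` are removed, columns in `renames` are renamed,
    and every other column is passed through unchanged.
    """
    return [renames.get(name, name) for name in columns if name not in dropped]


old_cems_columns = ["plant_id_eia", "unitid", "facility_id"]
new_cems_columns = update_column_names(old_cems_columns, CEMS_RENAMES, CEMS_DROPPED)
print(new_cems_columns)  # ['plant_id_eia', 'emissions_unit_id_epa']
```

If the data is held in pandas, the same dictionaries could be passed to `DataFrame.rename(columns=...)` instead of working on lists of names.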
grgmiller commented
Additional to-dos:
- Update download functions to grab 2021 data from zenodo
- Double check load data functions for environmental tables to make sure workbook formats are the same for 2021.
- Consider deleting `load_data.crosswalk_epa_eia_plant_ids()` if it is not used by any functions anymore. pudl is doing this crosswalk but not including all manual CW.
- Double check that we don't have any dependencies on eGRID since the 2021 version is not yet published
- Double check all of the manual cleaning of EIA-930 data is updated for year 2021 data
- Go through all files in `data/manual` to update for 2021 if necessary
- Test that co2-eq functions are using AR6 for 2021 data
- Update the environment name after testing is done, and change pudl dependency
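One way to approach the workbook-format check above is to compare the header a loader expects against the header actually found in the 2021 file. The sketch below is only an illustration: `diff_columns` is a hypothetical helper, and the column names in the example are placeholders rather than real workbook headers.

```python
def diff_columns(expected, actual):
    """Compare an expected header against the header read from a new file.

    Returns the expected columns that are missing and the columns that
    were not expected, each as a sorted list.
    """
    expected_set, actual_set = set(expected), set(actual)
    return {
        "missing": sorted(expected_set - actual_set),
        "unexpected": sorted(actual_set - expected_set),
    }


# Placeholder headers, for illustration only.
expected_header = ["plant_id", "year", "nox_rate"]
header_2021 = ["plant_id", "year", "nox_rate_lb"]

result = diff_columns(expected_header, header_2021)
print(result)  # {'missing': ['nox_rate'], 'unexpected': ['nox_rate_lb']}
```

An empty result for both keys would suggest the 2021 workbook layout matches what the load function expects, at least at the column-name level.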
Meta to-do:
- Create a checklist of everything we need to check / update each time a new year of data is released
- Consider whether it is worth running the PUDL pipeline locally in the future instead of waiting for PUDL release.
grgmiller commented
First, test the pipeline with 2020 data after the update to make sure everything works as expected and that there are no major changes to the 2020 data. Then run it for 2021.
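The 2020 comparison could be done by diffing summary values from a run before the update against a run after it. This is a rough sketch under stated assumptions: `find_changed_values` is a hypothetical helper, the 1% tolerance is an arbitrary choice, and the numbers in the example are made up for illustration, not real OGE output.

```python
import math


def find_changed_values(before, after, rel_tol=0.01):
    """Return entries that differ between two runs.

    `before` and `after` map a key (for example a BA code) to a number.
    A key is reported if it exists in only one run or if its values
    differ by more than `rel_tol` (relative tolerance).
    """
    changed = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old is None or new is None:
            changed[key] = (old, new)  # key present in only one run
        elif not math.isclose(old, new, rel_tol=rel_tol):
            changed[key] = (old, new)
    return changed


# Made-up totals keyed by BA code, for illustration only.
run_before_update = {"PACW": 100.0, "PACE": 200.0}
run_after_update = {"PACW": 100.5, "PACE": 230.0}

print(find_changed_values(run_before_update, run_after_update))
# {'PACE': (200.0, 230.0)}
```

Anything the comparison reports would then need a manual look to decide whether it reflects an intended upstream data change or a regression in the pipeline.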