singularity-energy/open-grid-emissions

Export hourly individual plant data for shaped plants

Closed this issue · 0 comments

Currently our hourly plant-level results are split across two different files: the hourly data directly from CEMS is reported at the plant level, but the shaped EIA data is aggregated to the fleet level. Part of the reasoning for this was that there are simply a large number of shaped EIA plants, so attempting to shape each individual plant was causing memory errors (see #86 (comment)).

However, in speaking with some data users, it turns out that hourly plant-level data for all individual plants would actually be useful, as the fleet-level aggregations are not useful for certain research. In addition, splitting the data into separate files based on the method used can cause issues if different methods are used for data from a single plant: the hourly data for that plant ends up in two different places, so the user would need to aggregate it to get a complete record (see #238).

One potential way around the memory issue would be to shape the individual plant data in chunks only during export, without ever storing a full version of the shaped plant-level data in memory. We'd want to export the data in chunks (e.g., for each BA or each state) so that no single file is huge. Once we export the hourly plant-level data, we could apply the existing aggregated shaping method, since that's all we need for the subsequent power sector and consumed outputs.
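
A rough sketch of what the chunked export could look like, using pandas. This is only an illustration of the idea, not existing code in the repo: the function name `shape_and_export_by_ba`, the column names (`plant_id`, `ba_code`, `month`, `net_generation_mwh`, `profile`), the output file naming, and the assumption that each hourly profile value is that hour's fraction of the monthly total are all hypothetical.

```python
import os
import tempfile

import pandas as pd


def shape_and_export_by_ba(monthly_plant_data, hourly_profiles, output_dir):
    """Shape monthly plant data to hourly one BA at a time, writing each chunk to CSV.

    Hypothetical inputs:
      monthly_plant_data: one row per plant-month (plant_id, ba_code, month, net_generation_mwh)
      hourly_profiles: one row per BA-hour (ba_code, month, datetime, profile), where
        `profile` is assumed to be that hour's fraction of the monthly total
    """
    os.makedirs(output_dir, exist_ok=True)
    written = []
    for ba_code, plants in monthly_plant_data.groupby("ba_code"):
        profile = hourly_profiles[hourly_profiles["ba_code"] == ba_code]
        # Each plant-month row fans out to every hour of that month.
        shaped = plants.merge(profile, on=["ba_code", "month"], how="inner")
        shaped["net_generation_mwh"] = shaped["net_generation_mwh"] * shaped["profile"]
        shaped = shaped[["plant_id", "ba_code", "datetime", "net_generation_mwh"]]
        path = os.path.join(output_dir, f"{ba_code}_hourly_plant_data.csv")
        shaped.to_csv(path, index=False)
        written.append(path)
        # `shaped` is rebound on the next iteration, so only one BA's
        # hourly plant-level data is held in memory at a time.
    return written


# Toy data: two "hours" per month to keep the example small.
monthly_plant_data = pd.DataFrame(
    {
        "plant_id": [1, 2, 3],
        "ba_code": ["AAA", "AAA", "BBB"],
        "month": [1, 1, 1],
        "net_generation_mwh": [100.0, 40.0, 10.0],
    }
)
hourly_profiles = pd.DataFrame(
    {
        "ba_code": ["AAA", "AAA", "BBB", "BBB"],
        "month": [1, 1, 1, 1],
        "datetime": pd.to_datetime(
            ["2020-01-01 00:00", "2020-01-01 01:00"] * 2
        ),
        "profile": [0.25, 0.75, 0.5, 0.5],
    }
)

paths = shape_and_export_by_ba(monthly_plant_data, hourly_profiles, tempfile.mkdtemp())
```

The same loop would work with state (or any other grouping column) as the chunk key. The fleet-level shaped data needed for the power sector and consumed outputs could still be computed separately with the existing aggregated method.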