Merging of the station records at each site including historical stations
The goal is to have, in a level_4 folder, one merged record for each site, combining historical, v2 and v3 stations as well as moved stations (e.g. THU_U replaced by THU_U2). Implementation is ongoing in https://github.com/GEUS-Glaciology-and-Climate/pypromice/blob/join_l4/src/pypromice/process/join_l4.py with some updates in other files (main...join_l4).
It uses a mapping with the latest stations as keys and, for each, the old stations listed in reverse chronological order:
pypromice/src/pypromice/process/join_l4.py
Lines 12 to 35 in 97eaedb
At the moment, join_l4 is called on the same list of stations as join_l3, that is, the sites for which new transmissions, new raw files or new flags have recently been added:
https://github.com/GEUS-Glaciology-and-Climate/aws-operational-processing/blob/b0d52ecf9427b204460f21f110ef0e049d0c49c4/l3_processor.sh#L173-L185
If a station is listed in old_name.values() (names in brackets in old_name), then it is not processed by join_l4 (because it is appended to another AWS's data). If a station is not in old_name.keys(), then there is no historical data that needs to be appended and it is copied as-is to the level_4 folder.
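The three cases above can be sketched as a small dispatch function. This is only an illustration: the `old_name` excerpt below contains just the two pairs mentioned in this issue, and `plan_join` and the station `XYZ` are hypothetical names, not part of join_l4.py.

```python
# Hypothetical excerpt; the full old_name mapping lives in join_l4.py.
old_name = {
    "THU_U2": ["THU_U"],  # latest station -> old stations, newest first
    "CEN2": ["CEN1"],
}


def plan_join(station, old_name):
    """Return (action, old_stations) for one station, following the rules above."""
    # Every station that appears as a value is appended to a newer station.
    historical = {name for names in old_name.values() for name in names}
    if station in historical:
        return ("skip", [])  # handled when its latest station is processed
    if station not in old_name:
        return ("copy", [])  # no historical data: copy as-is to level_4
    return ("merge", list(old_name[station]))


for stid in ("THU_U", "XYZ", "CEN2"):
    print(stid, plan_join(stid, old_name))
```

Computing the set of historical names once keeps the skip test independent of which latest station an old station belongs to.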
For the historical GC-Net stations, the aliases for variables are defined in an external file, src/pypromice/process/variable_aliases_GC-Net.csv, which is also defined as package data.
The merging is done by time slices:
pypromice/src/pypromice/process/join_l4.py
Lines 229 to 232 in 97eaedb
where ds1 is the current AWS data and ds2 is the historical AWS data being appended before the start of ds1. Gap-filling during the overlapping period is currently not implemented.
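To make the time-slice behaviour concrete, here is a minimal sketch in which plain Python lists of (timestamp, value) tuples stand in for the datasets. It is not the code at the lines referenced above; `prepend_history` is a hypothetical helper, and keeping only records strictly before the start of ds1 is an assumption about how the boundary is handled.

```python
import math
from datetime import datetime


def prepend_history(ds1, ds2):
    """Prepend the records of ds2 that are older than the first record of ds1.

    Both inputs are lists of (timestamp, value) tuples sorted by time.
    Assumption: records at or after the start of ds1 are dropped from ds2.
    """
    if not ds1:
        return list(ds2)
    start = ds1[0][0]
    older = [rec for rec in ds2 if rec[0] < start]
    return older + list(ds1)


# Historical station, overlapping the current one on 2 and 3 January.
ds2 = [
    (datetime(2000, 1, 1), -20.0),
    (datetime(2000, 1, 2), -21.0),
    (datetime(2000, 1, 3), -22.0),
]
# Current station, with a missing value inside the overlap.
ds1 = [
    (datetime(2000, 1, 2), -21.5),
    (datetime(2000, 1, 3), math.nan),
    (datetime(2000, 1, 4), -19.0),
]

merged = prepend_history(ds1, ds2)
for timestamp, value in merged:
    print(timestamp.date(), value)
```

The missing value on 3 January stays missing in the merged record even though ds2 has a valid value for that time, which is what the absence of gap-filling means in practice.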
The resulting files have the same format and the same variables as the level_3 files.
Instead of stid, there are now site_id and list_station_id attributes, defined as:
pypromice/src/pypromice/process/join_l4.py
Lines 271 to 278 in 97eaedb
meaning that we drop the v3 suffix and the 2 in CEN2 (and potentially in other stations).
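A rough sketch of that naming rule follows. It is not the code at the lines referenced above: `derive_site_id` and `site_attributes` are hypothetical names, only the v3 suffix and CEN2 are handled because those are the only cases stated here, and storing list_station_id as a Python list is an assumption.

```python
def derive_site_id(station_id):
    """Map a station id to a site id using only the two rules stated above."""
    site_id = station_id
    if site_id.endswith("v3"):
        site_id = site_id[:-2]  # e.g. QAS_Lv3 -> QAS_L
    if site_id == "CEN2":
        site_id = "CEN"  # other stations may need a similar rule
    return site_id


def site_attributes(latest, old_name):
    """Build the two attributes that replace stid in a merged file."""
    stations = [latest] + list(old_name.get(latest, []))
    return {"site_id": derive_site_id(latest), "list_station_id": stations}


print(site_attributes("CEN2", {"CEN2": ["CEN1"]}))
```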
Right now, because join_l4 is called in parallel, it cannot know that it needs to re-append a given site (e.g. CEN) if the older station data (e.g. CEN1) is updated but the latest station (e.g. CEN2) is not.
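One possible way around this, shown only as an illustration and not as a description of any actual fix, is to resolve every updated station to its latest station before dispatching the joins. `sites_to_rejoin` is a hypothetical helper, and the `old_name` excerpt contains only the CEN pair from this issue.

```python
# Hypothetical excerpt; the full old_name mapping lives in join_l4.py.
old_name = {"CEN2": ["CEN1"]}


def sites_to_rejoin(updated, old_name):
    """Return the latest stations whose merged record must be rebuilt."""
    # Reverse lookup: old station -> latest station it is appended to.
    latest_of = {old: latest for latest, olds in old_name.items() for old in olds}
    # Stations without an entry are already the latest station of their site.
    return sorted({latest_of.get(stid, stid) for stid in updated})


print(sites_to_rejoin(["CEN1"], old_name))  # ['CEN2']
```

Collapsing the result into a set also avoids joining the same site twice when both the old and the latest station were updated.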
fixed in #294