Empty UV data causes error during combine
lindsayplatt opened this issue · 6 comments
I'm not sure how this is happening, but one of the UV files doesn't have any data. This causes an error within the combine_nwis_data() function at the end of the UV task makefile in 10_nwis_uv_pull_tasks.yml, because it tries to pivot the data using convert_to_long() and fails because the columns are missing.
> dat <- qs::qread("10_nwis_pull/tmp/uv_230505_137.qs")
> dat
# A tibble: 0 x 0
> convert_to_long(dat)
Error: Can't subset columns that don't exist.
x Column `agency_cd` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.
I'm not sure why this single file is empty. The partition information stored as uv_partition_230505_137 before being passed to get_nwis_data() is:
# A tibble: 1 x 4
site_no count_nu PullTask PullDate
<chr> <dbl> <chr> <chr>
1 03501975 350496 230505_137 2023-05-05
So we would expect to see a lot of data. But if you manually pull the data and check the inventory with the appropriate params, both correctly show no data. So it appears that the inventory value in count_nu for this site was somehow incorrect. I'm not sure how to resolve that.
> readNWISuv("03501975", parameterCd = "00060", startDate = "", endDate = "")
data frame with 0 columns and 0 rows
> whatNWISdata(siteNumber = "03501975", service = "uv", parameterCd = "00060")
[1] agency_cd site_no station_nm site_tp_cd
[5] dec_lat_va dec_long_va coord_acy_cd dec_coord_datum_cd
[9] alt_va alt_acy_va alt_datum_cd huc_cd
[13] data_type_cd parm_cd stat_cd ts_id
[17] loc_web_ds medium_grp_cd parm_grp_cd srs_id
[21] access_cd begin_date end_date count_nu
<0 rows> (or 0-length row.names)
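A mismatch like this one (a large count_nu in the inventory but nothing returned by the pull) could be flagged before the combine step. A minimal sketch, assuming the number of rows actually pulled for each partition is available; flag_empty_pulls() and n_pulled are hypothetical names, not part of the pipeline:

```r
# Hypothetical helper: compare the inventory's expected record count
# against the number of rows that were actually pulled.
flag_empty_pulls <- function(partition, n_rows_pulled) {
  partition$n_pulled <- n_rows_pulled
  # Keep only partitions where the inventory promised data but none arrived
  partition[partition$count_nu > 0 & partition$n_pulled == 0, ]
}

# Partition row copied from the tibble shown above
partition <- data.frame(
  site_no = "03501975",
  count_nu = 350496,
  PullTask = "230505_137",
  PullDate = "2023-05-05",
  stringsAsFactors = FALSE
)

# The pulled file for this task had 0 rows
flagged <- flag_empty_pulls(partition, n_rows_pulled = 0)
print(flagged)
```

Any rows in flagged are sites worth re-checking against a fresh inventory before trusting count_nu.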
Definitely something wonky going on at that site. The data exist, but you can't get to the new gage pages for that site, and the discharge metadata look weird to me. I think it's something with the records, and I see the site has been discontinued, so that might have something to do with it. https://waterdata.usgs.gov/monitoring-location/03501975/#parameterCode=00065
But you can see the data at the old gage site (daily values) page: https://waterdata.usgs.gov/nwis/dv?referred_module=sw&site_no=03501975
Right, but I don't see any UV data on those pages, so it's weird that the UV inventory returned something for count_nu.
I need to keep moving forward with this for now, so I am going to add a single line that skips over the code in convert_to_long() if the data input has 0 rows. Here's the line I added in the function for now:
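(The added line itself is not reproduced in this thread. As a rough sketch only, a guard of this kind might look like the following; convert_if_nonempty() and failing_pivot() are hypothetical stand-ins, since the body of convert_to_long() is not shown here.)

```r
# Hypothetical wrapper: `pivot_fun` stands in for the real pivoting logic
# inside convert_to_long(), which is not shown in this thread.
convert_if_nonempty <- function(dat, pivot_fun) {
  # A 0 x 0 table has no `agency_cd` (or any other) column to select,
  # so return it untouched instead of attempting the pivot.
  if (nrow(dat) == 0) {
    return(dat)
  }
  pivot_fun(dat)
}

# Base-R stand-in for the 0 x 0 tibble read from the .qs file
empty_dat <- data.frame()

# Stand-in pivot step that errors when the expected columns are missing
failing_pivot <- function(dat) dat[, c("agency_cd", "site_no")]

result <- convert_if_nonempty(empty_dat, failing_pivot)
nrow(result)
```

Returning the empty input unchanged relies on the later combine step tolerating a column-less table, which is an assumption about code not shown here.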
Oh strange, I think we are skipping this entirely within the dv pull here then. Still so weird that my May 5 inventory claimed there were over 350k records and now there are 0. Maybe they were working on something with this site?
Yeah, maybe they're making these data available in the dv database, and the mean isn't an approved record yet. Not entirely sure, but the only time a site gets pulled by uv is when it doesn't exist in dv. That suggests to me that they are working on these records, as you say.
Got past this with some code in #30, but I'm not sure whether there is some underlying issue we should investigate later, so leaving this open for now :)