DOI-USGS/EGRET

Error in if (lastMonth == 2 & (lastYear%%4 == 0) & ((lastYear%%100 != : missing value where TRUE/FALSE needed

raruggie opened this issue · 9 comments

I am getting the error in the title when trying to use the setupYears function. I have tried this for multiple sites ("USGS-01358000", Hudson River at Green Island, and "USGS-04213500", CATTARAUGUS CREEK AT GOWANDA NY) and get the same error. Could it be related to the fact that I have not trimmed down the flow data period of record to match the water quality data period of record?

Here is my code:

## set USGS site

s1<-"USGS-01358000"
s1_NWIS<-substr(s1, 6, nchar(s1)) # set a NWIS format site by removing the USGS prefix

## set flow and Nitrogen USGS Pcode

flow<-'00060'
Nitro<-'00631'

## download Daily, Sample, and Info dataframes using EGRET functions

Daily <- readNWISDaily(s1_NWIS, flow, startDate = "", endDate = "")
Sample<-readWQPSample(s1,Nitro,startDate="",endDate="")
INFO <- readNWISInfo(s1_NWIS,Nitro,interactive = FALSE)

## merge the Daily, Sample, and Info dataframes into an eList object

eList <- mergeReport(INFO,Daily,Sample)

# eList <- as.egret(INFO,Daily, Sample) # what is the difference between mergeReport and as.egret functions in the user manual?

## Run WRTDS model. 

mE <- modelEstimation(eList, windowY = 7, windowQ = 2, windowS = 0.5, minNumObs = 60, minNumUncen =50, edgeAdjust = TRUE) # default values other than minNumObs (100->60)

## compute an annual results df

AnnualResults<-setupYears(mE$Daily) # which gives the error: Error in if (lastMonth == 2 & (lastYear%%4 == 0) & ((lastYear%%100 !=  : missing value where TRUE/FALSE needed

Also, is there a difference between the mergeReport and as.egret functions for initiating `eList`? I couldn't find an explanation in the EGRET user manual (https://pubs.usgs.gov/tm/04/a10/pdf/tm4A10.pdf).

Bob Hirsch,
Thank you for the response. I don't see any code attached to this GitHub posting, nor was there a file attached to the emailed version of your response in my inbox. However, when I adjusted the start and end dates I no longer got the error, thank you. Just to clarify, when you say an interactive session, do you mean using some data visualization functions (e.g. multiPlotDataOverview(eList)) to look at the data?

I don't understand from your response what the issue is with using readWQPSample(). As to why I am using readWQPSample(): I was told by Laura DeCicco that in the dataRetrieval package, downloading water quality data was no longer going to be possible using any of the NWIS-flavored functions (see DOI-USGS/dataRetrieval#643). I assumed the same discontinuation of NWIS functions for water quality data carried over into EGRET?

-Ryan

Bob,
Thank you for the response. I am still confused because I thought WQP included all the data sources for a site (e.g. EPA, state, NWIS), whereas water quality data from NWIS would just be samples taken by the USGS. How can I tell whether a site has better data (in quality and quantity) on WQP or through NWIS? Would I need to run both functions and look at the resulting data frames?

I just wanted to make sure: it is okay to use NWIS functions for water quality data if they are part of the EGRET package, but what Laura said about dataRetrieval is still accurate, i.e. don't use the NWIS functions and use the WQP functions instead? For example, would you trust this code for finding sites in New York that have nitrogen data?

library(dataRetrieval)
Nitro<-'00631'
NY_sites_N<- whatWQPdata(statecode = '36', 
                         parameterCd = Nitro,
                         siteType="Stream")

I would then use the list of these sites with EGRET functions to find trends for each site. Does this sound okay?

-Ryan

Bob,
Thank you, I'm sorry for the confusion in my last comment. This clarifies things.
-Ryan

Sorry I was on leave last week and couldn't contribute to the conversation.

To expand on Bob's points:

The "NWIS" functions are still working. At one point, we were told they were going to be shut down imminently, but the date for that shut-down continues to be pushed back. The USGS data itself should be identical between NWIS and WQP (apart from the formatting); however, we've recently discovered a few sites whose WQP records are missing some censoring limits, which are very important for EGRET analysis. The mapping from our legacy NWIS system to the WQP format is being updated so those censoring values will be there in the (hopefully near) future. So in the meantime, I'd just recommend sticking with readNWISSample. We'll do a pretty heavy outreach campaign when we know for sure when the NWIS qw services will be shut down. The other difference between WQP and NWIS is that WQP doesn't sort the data by measurement date. We've added that sorting to EGRET. To get that update, though, you'd need to be using the GitHub version of EGRET (not what's currently on CRAN).

You can use the whatWQPdata function as you described. For any USGS site, you can strip the "USGS-" from the monitoring location and get the sampling data from readNWISSample. For any site (USGS or otherwise), you'll need to think about where to get the discharge record. USGS discharge will continue to be available from the readNWISDaily function.
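
Stripping the prefix can be sketched in base R as follows. This is only an illustration: the IDs are the two sites mentioned earlier in this thread, and `wqp_ids`, `is_usgs`, and `nwis_ids` are made-up names. The anchored pattern removes only a leading "USGS-", and non-USGS IDs are filtered out first because they have no NWIS equivalent.

```r
# Monitoring location IDs in WQP format (the two sites from this thread)
wqp_ids <- c("USGS-01358000", "USGS-04213500")

# Keep only USGS sites, then drop the leading "USGS-" to get NWIS-style IDs
is_usgs <- grepl("^USGS-", wqp_ids)
nwis_ids <- sub("^USGS-", "", wqp_ids[is_usgs])

nwis_ids
```

Each element of `nwis_ids` could then be passed as the site number to readNWISSample or readNWISDaily.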

I'm leaving this Issue as active so I can see if I can make the error message in setupYears more clear for the situation you described. Thanks for reporting it!

I thought I'd expand on the answer for finding "all the data" (i.e., if you were looking for all the nitrogen in NY):

If you are looking for "all-the-data", pcode is not the way to go: you'll only find USGS sites if you specify the query based on a parameter code.

I've been working on another project where we create water quality reports based on a shapefile (in that specific package, it is a FWS Wildlife Refuge, but we could tweak it to handle more general shapefiles with a little work).

https://rconnect.usgs.gov/wqReport

The way we define "all-the-data" in that project was a parameter lookup file:
https://rconnect.usgs.gov/wqReport/articles/customize_report_data.html#custom-parameter-lookup-file

We define characteristic name, sample fraction, and unit to be one "set", and we offer unit conversions so that more data can be grouped together where appropriate (e.g. a simple unit conversion from mg/l to ug/l).
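
To make the idea of a "set" concrete, here is a rough base R sketch. It is not the wqReport implementation: the data frame, its column names, and the values are all hypothetical and do not follow the WQP schema. It converts ug/l records to mg/l and then labels each record by its characteristic / fraction / unit combination.

```r
# Hypothetical records; column names and values are illustrative only
wq <- data.frame(
  characteristic = c("Nitrate", "Nitrate", "Nitrate"),
  fraction = c("Dissolved", "Dissolved", "Total"),
  unit = c("mg/l", "ug/l", "mg/l"),
  value = c(1.2, 800, 1.5),
  stringsAsFactors = FALSE
)

# Convert ug/l to mg/l so equivalent measurements share one unit
is_ug <- wq$unit == "ug/l"
wq$value[is_ug] <- wq$value[is_ug] / 1000
wq$unit[is_ug] <- "mg/l"

# One "set" per characteristic / fraction / unit combination
wq$set <- paste(wq$characteristic, wq$fraction, wq$unit, sep = " | ")

# Number of records in each set
table(wq$set)
```

After the conversion, the two dissolved records fall into the same set, while the total-fraction record stays in its own.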

I could see where you might then take this data and do WRTDS analysis... finding the appropriate discharge measurements might end up taking a lot of time/effort. I'm showing you these examples because they might be useful to see how others group water quality data.

FYI, if you use the GitHub version of EGRET (the change will soon be pushed to CRAN), you will see a more informative message:

Error in setupYears(mE$Daily) : 
  Daily dataframe cannot have gaps in the data.
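
One way to look for gaps before calling setupYears is sketched below in base R. This is an assumption about how one might check, not necessarily how EGRET tests for gaps internally. The toy `Daily` data frame stands in for a real one (which also has a `Date` column), and `all_dates` and `missing_dates` are illustrative names.

```r
# Toy Daily-like data frame with one missing day (2020-01-04)
Daily <- data.frame(
  Date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03", "2020-01-05")),
  Q = c(10, 11, 12, 13)
)

# Every date a gap-free record would contain
all_dates <- seq(min(Daily$Date), max(Daily$Date), by = "day")

# Dates absent from the record (indexing keeps the Date class intact)
missing_dates <- all_dates[!(all_dates %in% Daily$Date)]

missing_dates
```

If `missing_dates` is not empty, the record has gaps, and trimming the start and end dates to a continuous period (as was done earlier in this thread) is one way to avoid the error.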