USGS-R/Rainmaker

How to handle data using year-round central standard time?

Closed this issue · 2 comments

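Need to add a way to handle data recorded in CST year-round. Precipitation recorded during the 2:00 am hour on the day daylight saving time begins currently produces NAs, because that hour does not exist in a timezone that observes daylight saving time, and the NAs make RMevents fail.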

One idea is to convert the date/time to UTC (aka GMT) during RMprep, run the remaining functions (RMevents, RMintense, RMerosivity, RMarf), then convert back to CST (or whatever timezone) at the end while building the output tables.

Open to other ideas.
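
A minimal sketch of the UTC round trip described above, using base R only. The timestamps, the `cst_offset_hours` name, and the fixed 6-hour offset are assumptions for illustration; this is not how RMprep is actually implemented.

```r
# Hypothetical logger readings in year-round CST (UTC-6). The 02:xx readings
# fall in the hour that a daylight-saving timezone skips on 2016-03-13.
stamps <- c("2016-03-13 01:45", "2016-03-13 02:15",
            "2016-03-13 02:45", "2016-03-13 03:15")

cst_offset_hours <- 6 # assumption: the logger clock is a fixed UTC-6

# Step 1 (prep): read the clock values as UTC, where every reading is valid,
# then add the fixed offset to get the true UTC instant.
pdate_utc <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M", tz = "UTC") +
  cst_offset_hours * 3600

# Step 2 (events, intensity, ...): all interval arithmetic happens in UTC.
gaps_min <- as.numeric(diff(pdate_utc), units = "mins")

# Step 3 (output tables): shift back to the logger clock for display.
local_label <- format(pdate_utc - cst_offset_hours * 3600,
                      "%Y-%m-%d %H:%M", tz = "UTC")
```

Because the offset is constant, the gaps between records are the same in UTC and in the logger's clock, so only the labels in the output tables need converting back.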

tz_example.txt

> PrecipPrep <- RMprep(df = tz_example, prep.type = 3, date.type = 4, cnames.in = "VALUE", cnames.new = "rain") # this is for .csv
> PrecipEvents <- as.data.frame(RMevents_sko(df = PrecipPrep, ieHr = 2, rainthresh = 0.01, rain = "rain",time = "pdate")[1])
Error in if (dif_time[[i - 1]] >= ieMin) { : 
  missing value where TRUE/FALSE needed
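
The error above can be reproduced in isolation. This is a small sketch with made-up timestamps, where an explicit NA stands in for a date/time that could not be parsed; the 120-minute threshold mirrors ieHr = 2.

```r
# Hypothetical series: the middle timestamp failed to parse and is NA
pdate <- as.POSIXct(c("2016-03-13 01:45", NA, "2016-03-13 03:15"), tz = "UTC")

# Every gap that touches the NA timestamp is itself NA
dif_time <- diff(pdate)

# if() cannot branch on NA, which is what stops the event loop
res <- tryCatch(
  if (dif_time[[1]] >= 120) "new event" else "same event",
  error = function(e) conditionMessage(e)
)
print(res)
```

A single unparsed timestamp affects the gap before it and the gap after it, so the loop stops at the first of the two.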

It works if I assign tz="UTC", but the time would need to be converted back to 'local' at the end:

> PrecipPrep <- RMprep(df=CSV,prep.type = 3,date.type = 4,cnames.in="VALUE",cnames.new="rain",tz="UTC") #this is for .csv
> PrecipEvents <- as.data.frame(RMevents_sko(df = PrecipPrep,ieHr = 2,rainthresh = 0.01,rain = "rain",time = "pdate")[1])

Here is the RMevents function - updated to handle files without "0" in the rain value column:

#' RMevents_sko function:
#' Rainfall event determination
#' 
#' @description
#' Compute rainfall event variables based on time series of rain data with only one rain
#' gage or one mean radar rain column.
#'
#' @param df dataframe with rainfall
#' @param ieHr numeric Interevent period in hours, defaults to 6
#' @param rainthresh numeric Minimum event depth in units of the rain column, default is given as 5.1 assuming millimeters (0.2")
#' @param rain string Column name of rainfall unit values, defaults to "rain"
#' @param time string Column name of the POSIXct date-time values, defaults to "pdate"
#' @details The timezone is not an argument of this function; it is taken from the POSIXct values in the time column, so define the timezone the data were collected in when that column is built (e.g. the tz argument of RMprep). For more information on timezone codes see https://en.wikipedia.org/wiki/Coordinated_Universal_Time
#' @return list of all rain events that surpass rainthresh (storms2) and all rain events (storms)
#' @export
RMevents_sko <- function(df, ieHr = 6, rainthresh = 5.1, rain = "rain", time = "pdate") {

  ieMin <- ieHr * 60 # compute interevent period in minutes

  # keep only rows with measurable rain (this also drops explicit zeros)
  df <- df[!is.na(df[[rain]]) & df[[rain]] > 0.00001, ]

  # NA timestamps (e.g. local times skipped on a daylight saving transition)
  # would make the comparison with ieMin fail, so stop with a clear message
  if (anyNA(df[[time]])) {
    stop("NA values found in column '", time, "'; check the timezone used to build it")
  }

  # gaps between consecutive rain records, forced to minutes so they can be
  # compared with ieMin (diff() otherwise chooses its own units)
  dif_time <- diff(df[[time]])
  units(dif_time) <- "mins"
  timeInterval <- min(dif_time)

  # assign each row to an event number: a new event starts whenever the gap
  # since the previous record is at least ieMin
  df$event <- cumsum(c(TRUE, as.numeric(dif_time) >= ieMin))

  rain.events <- aggregate(x = df[[rain]], by = list(df$event), sum) # find sum of rain in each event
  start.dates <- aggregate(x = df[[time]], by = list(df$event), min)[, 2] # find minimum date for each event
  start.dates <- start.dates - timeInterval
  end.dates <- aggregate(x = df[[time]], by = list(df$event), max)[, 2]

  out <- data.frame(stormnum = rain.events[, 1],
                    StartDate = start.dates,
                    EndDate = end.dates,
                    rain = rain.events[, 2])
  out2 <- out[out$rain >= rainthresh, ] # events that meet the depth threshold
  return(list(storms2 = out2, storms = out))
}
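
One thing to watch when comparing time gaps with ieMin: diff() on POSIXct values picks its units from the smallest gap in the series, so the numbers are not always minutes. The sketch below uses made-up timestamps to show the effect and one way to pin the units.

```r
# Hypothetical 5-minute data: the gap comes back in minutes
t_fine <- as.POSIXct(c("2016-03-13 00:00", "2016-03-13 00:05"), tz = "UTC")
u_fine <- units(diff(t_fine))

# Hypothetical 3-hourly data: the same call now reports hours, so a
# 3-hour gap shows up as the number 3 rather than 180
t_coarse <- as.POSIXct(c("2016-03-13 00:00", "2016-03-13 03:00"), tz = "UTC")
u_coarse <- units(diff(t_coarse))
gap_auto <- as.numeric(diff(t_coarse))

# Asking for minutes explicitly gives a value comparable with ieMin
gap_mins <- as.numeric(diff(t_coarse), units = "mins")
```

With ieHr = 2 (ieMin = 120), the automatic value 3 would not count as a new event, while the explicit 180 minutes would.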
  

I think this needs to be addressed and made exceptionally obvious in the vignette.

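I am pretty sure this issue has been resolved by the updates to RMprep and RMevents, together with using an appropriate timezone code such as "Etc/GMT+6". That code is a fixed UTC-6 offset with no daylight saving time; note that the sign in the Etc/GMT names is inverted, and that the name is spelled "Etc", not "ETC", in the list returned by OlsonNames().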
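
A quick check of the fixed-offset timezone code, again with made-up timestamps. It only uses base R; how RMprep passes tz through internally is not shown here and is not assumed.

```r
# Hypothetical readings that include the hour skipped by daylight saving
# timezones on 2016-03-13
stamps <- c("2016-03-13 01:45", "2016-03-13 02:15", "2016-03-13 03:15")

# "Etc/GMT+6" is UTC-6 all year, so 02:15 is an ordinary valid time
pdate <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M", tz = "Etc/GMT+6")

any_missing <- anyNA(pdate)
gaps_min <- as.numeric(diff(pdate), units = "mins")

# The same instants expressed in UTC are six hours later on the clock
as_utc <- format(pdate, "%Y-%m-%d %H:%M", tz = "UTC")
```

Parsing the same strings with a daylight-saving timezone such as "America/Chicago" may return NA or a shifted time for 02:15, depending on the platform, which is why the fixed-offset code is the safer choice for loggers that stay on standard time.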