bluegreen-labs/ecmwfr

Failed HTTP request (500) in update_status() breaks wf_request() for CDS

mpaulacaldas opened this issue ยท 8 comments

Similar to the situation described in #105, I continue to get an error from a NULL private$status, despite using the latest dev version of the package (which should have fixed the issue via #116)

No encoding supplied: defaulting to UTF-8.
Error in if (private$status == "completed") { : 
  argument is of length zero

Exploring a bit on my end, I find the problem comes from update_status(). It seems like, at times, the CDS API returns a 500 (perhaps the requests are coming in a bit too fast to the server). When this happens, the resulting ct$state object is therefore NULL, which then raises an error on the subsequent if statements.

ecmwfr/R/service-cds.R

Lines 81 to 92 in 2e38b59

response <- httr::GET(
private$url,
httr::authenticate(private$user, key),
httr::add_headers(
"Accept" = "application/json",
"Content-Type" = "application/json"
),
encode = "json"
)
ct <- httr::content(response)
private$status <- ct$state

I have fixed this for myself by returning the self object early if the response fails (and therefore, prompting a new request, which tends to succeed)

      response <- httr::GET(
        private$url,
        httr::authenticate(private$user, key),
        httr::add_headers(
          "Accept" = "application/json",
          "Content-Type" = "application/json"
        ),
        encode = "json"
      )

      if (httr::http_error(response)) {
        return(self)
      }

      ct <- httr::content(response)

I don't submit a PR since this is likely a "band-aid" fix, but hope it helps pinpoint the larger issue ๐Ÿ˜„

Having this same issue :(
So, how do I fix this? Do I download the code, make the fix above and compile it? I don't have experience with that. Is it easy?

I am also experiencing the same error as of this week. The above fix from @mpaulacaldas did not work for me.

No encoding supplied: defaulting to UTF-8.
Error in if (private$status == "completed") { :
argument is of length zero

Having this same issue :( So, how do I fix this? Do I download the code, make the fix above and compile it? I don't have experience with that. Is it easy?

@mirizarry-ortiz if you want to give it a try, you can install the package from a branch where I implemented my fix:

remotes::install_github("mpaulacaldas/ecmwfr", ref = "fix-update-status")

As I mentioned, I didn't submit a PR because this fix isn't super robust. If I have a bit more time, I'll do some further exploration and report back here.

@mpaulacaldas thanks for raising the issue and reporting here - much appreciated

I'm currently moving positions / countries, so I will have little time to look into this. If you manage to create a more robust solution feel free to do a PR!

Fixes

I'm pretty sure the retry frequency is too high (in polling the service for the status), which "spams" the service during the CDS upgrade (especially). I have a hunch they might be rate limiting requests (max number of calls per hour). Reducing the retry frequency, and/or making this dynamic would give more granular control on this.

private$retry <- 5

Adjust this to fail gracefully on a 500 error. This would mean getting kicked out of the download routine (but not cancelling the request). This would be the most clean setup if also fixing a potential parameter to set the retry rate.

if (private$status != "completed" || is.null(private$status)) {

This should then return the following message:

Download failed due to service interruption (http 500 error). Please lower the retry rate on the download routine.

Hi @mpaulacaldas, @mirizarry-ortiz, @jonnyrbomber , the above fixes are implemented in #127 (basically trapping the 500 error more cleanly and allowing for setting your retry rate for a download). The retry rate is set at a default of 30s. So after submission of a query your first try at downloading it will be 30s later. This might address some of the html 500 errors due to server errors. However, the communication of ECMWF is lacking on when and how the services are unstable. The issues you describe may persist regardless, and most likely will not resolve until the roll-out of the new API (which I hope will allow for easy adoption of my current framework). Any help on this later this spring is always appreciated.

Thanks so much for this fix @khufkens! I had to step away from the project where I am using {ecmwf}, but I'll give this a try when I get back to it and of course, report anything else that seems off and attempt a fix if useful.