ropensci/rinat

request limits information

Opened this issue · 4 comments

It would be helpful to have information about rate limits in the documentation.

"We throttle API usage to a max of 100 requests per minute, though we ask that you try to keep it to 60 requests per minute or lower. If we notice usage that has serious impact on our performance we may institute blocks without notification."

Not all functions have a `maxresults` argument to help control this, so users may unknowingly trigger a block.

I had the same question. I'm trying to use this for a classroom assignment, but I have so many students that I can't download their data without triggering the limit. Is there a straightforward way around it?

Two strategies I've used as a workaround:

  1. Since the limit is 60 requests per minute, I break my requests into chunks of 60 and put a `Sys.sleep(10)` in between every chunk. I usually do this in a simple for loop (the horror, I know); see the first sketch below.
  2. If the observations are research grade, I just download the full data from GBIF and then subset what I need. The download is big, so I use `data.table::fread` to read it in quickly. Then I ditch most of the data I'm not interested in, which makes the size manageable enough to keep working with an ordinary data.frame; see the second sketch below.
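
For strategy 1, here's a minimal sketch of the chunked loop, assuming `obs_ids` is a vector of observation IDs and using `rinat::get_inat_obs_id()` as the example request (swap in whichever rinat call you actually need):

```r
# Strategy 1 sketch: send requests in chunks of 60 with a pause in between.
library(rinat)

obs_ids <- c(12345, 67890)  # placeholder IDs; use your own
chunks  <- split(obs_ids, ceiling(seq_along(obs_ids) / 60))

results <- list()
for (i in seq_along(chunks)) {
  # one request per ID in this chunk
  results[[i]] <- lapply(chunks[[i]], get_inat_obs_id)
  # pause between chunks; lengthen the sleep if you still hit the limit
  if (i < length(chunks)) Sys.sleep(10)
}
metadata <- unlist(results, recursive = FALSE)
```

For strategy 2, once you have a GBIF occurrence download on disk (the file name and column selection below are placeholders; adjust them to match your export):

```r
# Strategy 2 sketch: read the large GBIF export quickly, keep only the
# columns of interest, then drop back to a plain data.frame.
library(data.table)

obs <- fread("gbif_occurrences.csv",
             select = c("gbifID", "species", "decimalLatitude",
                        "decimalLongitude", "eventDate"))
obs <- as.data.frame(obs)  # continue with ordinary data.frame tools
```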

Hope that helps in the meantime.

Just curious, what is the use case that triggers this?

I had a bunch of observation IDs and wanted to grab all of the metadata (such as the date/time the observation was uploaded to iNaturalist, the date/time it was first identified, etc.). I was looping through each ID one at a time using `get_inat_obs_id()`, which only takes one ID per call. This is one of the functions that doesn't enforce a rate limit internally, so the loop ran faster than the recommended rate. There is probably a better way to do this, so feel free to share ideas!
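
For concreteness, here's what that loop looks like with a `Sys.sleep()` added so it stays at or under the suggested 60 requests per minute (assuming `obs_ids` holds the observation IDs):

```r
# Fetch metadata for each observation ID, roughly one request per second.
library(rinat)

obs_ids  <- c(12345, 67890)  # placeholder IDs; use your own
metadata <- vector("list", length(obs_ids))

for (i in seq_along(obs_ids)) {
  metadata[[i]] <- get_inat_obs_id(obs_ids[i])
  Sys.sleep(1)  # throttle: at most ~60 requests per minute
}

# inspect names(metadata[[1]]) to see which fields (e.g. upload and
# identification timestamps) are available in the returned list
```

Sleeping a full second per request is on the conservative side, but it keeps the loop well clear of the block threshold.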