sboysel/fredr

Getting fredr back on CRAN

DavisVaughan opened this issue · 24 comments

Related to #75 - but I didn't want everyone in that thread to get notified.

@sboysel I'd be willing to take another look at this to get this back on CRAN

Honestly I now think the easiest thing to do is to back out your commit with the mocking features. Instead, to keep this on CRAN I would do the following:

  1. Skip all tests that require hitting the API if there is no auth token by using a custom skip_if_no_auth() helper in the tests. This would skip tests on CRAN so we never get kicked off again due to spurious failures. I'm pretty sure this is what we were doing before, with the exception of the fredr_docs() one that broke.

  2. Nuke appveyor and travis testing and update to github actions

  3. Most builds on github actions would run without an auth token. This would mimic running on CRAN.

  4. A single build on github actions would run with an auth token, and would run the full test suite. We could set this build up to run on each push to master / pull request, and also add a weekly cron job that would run the test suite just to make sure things are working. Yes this would take awhile, but 1 build is probably okay and it beats the pain of the mocking infrastructure.

  5. All examples would be updated from \donttest{} to \dontrun{}. This is what googledrive uses. I'm fairly certain CRAN will sometimes test your examples even if wrapped in \donttest{}. I think \dontrun{} also plays nicely with pkgdown if I'm remembering right, but we would make it so that the examples don't run on pkgdown since they could spuriously fail and break the site.

This is basically the setup that the googledrive package uses, and seems to work fairly well.

This would be a large amount of changes, so if this sounds good to you then I'd like to request that you give me permissions to push to the repo directly (I don't think I have that now)

Good to hear from you @DavisVaughan. Was just clearing the cobwebs from my R development environment before reading this issue. I think this all sounds great and I'm happy to follow the advice of someone more familiar with this whole process than I. I will give you push permissions to master and I sincerely appreciate your help. I can handle any additional changes and the CRAN resubmission process. Let me know what else I can do.

I've added you as a collaborator to the repository and I don't seem to have any special branch restrictions set up. Let me know if that's not enough access.

Thanks! Happy to help. I'll also need you to go into the Settings part of the repo and add a fredr token to the Secrets section so it can be used in github actions (I dont have access to Settings, but I think that is expected). You would create a Secret with the name FRED_API_KEY and then you'd add your key, and then we can set it as an environment variable in github actions with env: FRED_API_KEY = {secrets.FRED_API_KEY} or something similar to that

Done. FRED_API_KEY was already set so I just verified it's the same test key we've always used.

@sboysel could you please go into Settings -> scroll down to GitHub Pages -> change Source to the gh-pages branch (the folder should just be root, I think that is the default)

Could you also please turn off the AppVeyor Webhook? I think you would need to go into Setting -> Webhooks -> Delete anything that looks related to AppVeyor (or travis if you see it)

@sboysel could you please go into Settings -> scroll down to GitHub Pages -> change Source to the gh-pages branch (the folder should just be root, I think that is the default)

Done.

Could you also please turn off the AppVeyor Webhook? I think you would need to go into Setting -> Webhooks -> Delete anything that looks related to AppVeyor (or travis if you see it)

Done. Should we keep the Codecov webhook?

Should we keep the Codecov webhook?

Yes!

Thanks!

It seems I've hit the rate limit for the stored api key. I'm asking the people at FRED if they can up our limits (they specifically say that you can ask them to raise it here https://fred.stlouisfed.org/docs/api/fred/errors.html)

Once the dust settles, the cron jobs that run weekly shouldn't exceed that limit, but in the meantime its annoying.

I've learned that the rate limit is 120 requests per minute. This is pretty generous, and is hard to hit unless you are either:

  • Constantly requesting short series from a fast responding endpoint
  • Running multiple requests in parallel using the same api key

The second is what happens when I run the test coverage build. It runs test coverage with the api key + the R CMD Check build that uses the api key.

Now that we know what the rate limit is, I've added in some handling for it. We now sleep for 20 seconds when we get rate limited (the http code is 429 so we can detect this), then try again. We try for a max of 6 times which should be enough to unlimit ourselves (2 full minutes is bound to be enough time...)

I can test that it works pretty well with this, which does rate limit me at around request 120

for (i in 1:500) {
  fredr("UNRATE", observation_start = Sys.Date() - 50L)
  message("request ", i)
}

@sboysel I think this is ready for a CRAN release. I've got a release candidate PR ready to go #84.

If you approve, then I can release it with devtools::release() and then you just have to accept the email (since you are the Maintainer).

Overall changes:

GitHub Actions

Moved away from Travis and Appveyor to GitHub Actions. The "workflows" are in .github/workflows/

  • R-CMD-check.yaml runs check() across many versions of R and many OSs (Mac, Windows, Linux). However it does not run with an API key. This mocks how we will run tests and examples on CRAN.

  • R-CMD-check-auth.yaml runs check() on the latest version of R on a Mac and does run with the API key. This build runs on all pushes to master and all PRs. The total check time is around 4 minutes, so it seems reasonable to do this on 1 build at all times.

  • test-coverage.yaml runs covr() on the package. It always runs with an API key to get accurate coverage.

    • It runs on a cron job every monday at 1am UTC

    • It runs whenever you push to master and have [covr] in the commit message

  • pkgdown.yaml builds the pkgdown site and automatically pushes it to the gh-pages branch. It always runs with an API key to have nice example output.

    • It runs on a cron job every monday at 12am UTC

    • It runs whenever you push to master and have [pkgdown] in the commit message

Examples

Examples have been changed from donttest{} to using the new fredr_has_key() function. This makes the examples run whenever a key is available (like with pkgdown), but otherwise they are skipped (like on CRAN). This is what googlesheets4 does. Eventually roxygen2 will support this more cleanly with the @examplesIf tag, but this PR hasn't been merged yet r-lib/roxygen2#993

Tests / Check

All tests that depend on a key use skip_if_no_key() to avoid failures on CRAN. Our github checks that run without an API key keep us honest and ensure that everything will work when we submit to CRAN without a key.

The entire check takes around 30 seconds without a key, and 4 minutes with a key.

Each R/<name>.R file now has a corresponding tests/testthat/test-<name>.R file. This makes it easier to follow as devtools::test() runs. It has the side benefit of making usethis::use_test() jump directly to the corresponding test file when you call it without any args when you have the R/<name>.R file open.

Changes

  • New fredr_has_key() and fredr_get_key() which are mainly just utility functions

  • Refreshed all vignettes, the readme, and many function docs

  • A number of FRED URLs had to be updated since they redirect. It is a new-ish thing in R's check process where they don't want URLs to redirect. I just replaced them with what they redirect to.

  • All functions can now retry a request if we hit a rate limit. See my previous github comment for more information.

  • Overhauled the infrastructure behind capture_args() to be a bit safer. We now validate the type/size of every arg that comes through.

  • I removed the default of NULL for required arguments. Generally the best practice is to leave these without an argument, and only give defaults to optional arguments. This especially makes it less confusing for fredr_category_related_tags() where both category_id and tag_names are required.

  • I have also placed ... between the required arguments and the optional arguments in a function signature. Internally I check that these ... are empty. Placing the ... makes it so that optional arguments have to be specified by name, resulting in more self documenting code. Checking that ... are empty prevents the user from making a typo and having it get swallowed silently. This is something that we are using more and more in the tidyverse to help users write better code. It also means we can add extra optional arguments whenever we want without fear of breaking any code (since they had to be specified by name).

This is amazing work, @DavisVaughan ! A HUGE thanks for helping out and doing a much better job than if I handled it alone. Great to learn all these tricks for R packaging.

I've looked over your changes and it all looks great to me. I'll gladly defer to your expertise on areas I'm less familiar with.

Yes, please do submit via devtools::release() and I will monitor my email. I will make announcements on the remaining issues (#75) and Twitter. I will also close #82 once the package has been accepted.

I'm happy to help!

I have released fredr 2.0.0, you should have received an email.

I will also close #82 once the package has been accepted.

Just let me know when you hear back from CRAN (acceptance or otherwise). If accepted, I'll merge in #84 and that will take care of closing all the linked issues. Then I'll update the version number again to bump it to the next development version and that should be it.

Hmm, I don't see fredr in the CRAN ftp incoming site anymore, but it isn't on CRAN. Did we get rejected?

Just got this email from CRAN last night:

Thanks,

Please always make sure to reset to user's options(), working directory
or par() after you changed it in examples and vignettes and demos.
e.g.:

old <- options(digits = 3)
...
options(old)

Additionally:
Have the issues why your package was archived been fixed?
Please explain this in the submission comments.

Please fix and resubmit.

Best,
Gregor Seyer

  1. Seems I was using options(digits = 4) in each of the vignettes (just checked and does not appear anywhere else)
  2. I didn't get to see the actual errors CRAN was citing before the package was dropped. Naturally, the web check results page they directed me to is no longer active. Should it just suffice to say the summary of changes in 2.0.0 address issues with our testing suite that led our package to be dropped? Sorry I cannot provide more useful details.

@sboysel - I've resubmit fredr to CRAN

Can you please check your email for another CRAN submission email and accept the submission?

Additionally, can you please respond to Gregor with something like:

Gregor,

Thanks for the review! We have resubmit fredr.

- We have removed the calls to `options()` in the vignettes.

- We added comments about why the package was archived, and how we fixed it. We were hitting a URL that is no longer valid, and this broke a test.

Thanks,
Sam

I find that when I reply nicely saying that we resubmit and fixed their issues, I get a faster turnaround time.

Thanks @DavisVaughan. Will do, good suggestion on the email followup. Will update you when I hear back.

Just received word that the latest submission of fredr is heading to CRAN!

Alright. All merged and cleaned up!

Thanks for all the help, @DavisVaughan!

On the home page of the github repo, you link to the pkgdown site with HTTP, you may want to switch to HTTPS! (Both work, you should just have to change the url on the repo page)

Will do. Also making a note here to change the Shields.io badges in README.md / README.Rmd so that the proper CRAN version appears.