rstudio/renv

renv failing on github actions with recent package update

majazaloznik opened this issue · 12 comments

I have an R CMD check workflow with renv that is suddenly failing on GitHub Actions with the error failed to retrieve package 'xxx' and the warning Warning: failed to find source for 'xxx' in package repositories, or 1: curl: (22) The requested URL returned error: 404 2: failed to find binary for 'xxx' in package repositories.

It is conspicuously not failing on the mac runner but is failing on the windows and all three ubuntu ones.

I am posting this because the errors were not particularly googlable, so hopefully this will be helpful to someone else. It turns out the reason for the failures was that the package in question had been updated just a few days earlier. I used renv::install("package@1.2.3") to go back one version, and that solved the problem.
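For anyone landing here with the same error, a minimal sketch of that pin follows. The remote spec has to be passed as a quoted string. `somepkg` and `1.2.3` are placeholders, and `pin_spec()` is a helper name made up for this example, not part of renv.

```r
# Build a "package@version" spec of the kind renv::install() accepts.
# `pin_spec()` is a hypothetical helper; "somepkg" / "1.2.3" are placeholders.
pin_spec <- function(pkg, version) {
  paste0(pkg, "@", version)
}

spec <- pin_spec("somepkg", "1.2.3")
print(spec)

# In the project, install the pinned version and record it in renv.lock:
# renv::install(spec)
# renv::snapshot()
```

Committing the updated renv.lock afterwards is what makes the pin visible to CI.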

I am also posting this in case there is another less kludgy solution to this issue, because I only half understand what I am doing here :) . My workflow is below.

workflow.yaml
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]

name: R-CMD-check

jobs:
  R-CMD-check:
    runs-on: ${{ matrix.config.os }}

    name: ${{ matrix.config.os }} (${{ matrix.config.r }})

    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: macos-latest,   r: 'release'}
          - {os: windows-latest, r: 'release'}
          - {os: ubuntu-latest,   r: 'devel', http-user-agent: 'release'}
          - {os: ubuntu-latest,   r: 'release'}
          - {os: ubuntu-latest,   r: 'oldrel-1'}

    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
      R_KEEP_PKG_SOURCE: yes

    steps:
      - uses: actions/checkout@v3

      - uses: r-lib/actions/setup-pandoc@v2

      - uses: r-lib/actions/setup-r@v2
        with:
          http-user-agent: ${{ matrix.config.http-user-agent }}
          use-public-rspm: true
          r-version: 4.1.2

      - name: Remove `.Rprofile`
        shell: bash
        run: |
          rm .Rprofile
      - uses: r-lib/actions/setup-renv@v2

      - uses: r-lib/actions/check-r-package@v2
        with:
          upload-snapshots: true

I've run into a similar issue (carpentries/actions#69). In my case, renv::install() detected that it was on Linux and was looking at the PPM (aka RSPM), which is ~1 week delayed from CRAN.

I'm wondering: could options(install.packages.check.source = TRUE) allow {renv} to detect the correct version or is there a different fix?

Based on the information in carpentries/actions#69, it seems like the package was originally installed from CRAN, and then later re-installed from PPM (which was behind CRAN in this case, unfortunately).

If a source version of data.table 1.14.8 had been available on PPM at that time, renv would have chosen to install it; however, because PPM was behind it ended up behaving like a downgrade.

I feel like part of the problem is that the existing GitHub actions using renv set this:

2023-02-20T11:39:17.8458383Z RENV_CONFIG_REPOS_OVERRIDE: https://packagemanager.posit.co/cran/__linux__/jammy/latest

So effectively, renv doesn't even look at CRAN; only PPM. And different parts of the workflow appear to be using these different repositories, leading to this confusion.

I don't have a great answer beyond recommending a workflow with something like the following:

  1. Call renv::snapshot() locally to capture the requisite packages + their versions for your project,
  2. On CI, call renv::restore(), then take the project actions you need to.
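The two steps above can be sketched as a pair of small wrappers. The function names are made up for this example. `prompt = FALSE` is used on the assumption that neither call should wait for interactive confirmation, which matters on CI.

```r
# Run locally, in the project directory: record the packages the project
# uses, with their versions, in renv.lock. Commit renv.lock afterwards.
# (`snapshot_locally` is a hypothetical wrapper name.)
snapshot_locally <- function() {
  renv::snapshot(prompt = FALSE)
}

# Run on CI: install what renv.lock records, without prompting.
# (`restore_on_ci` is a hypothetical wrapper name.)
restore_on_ci <- function() {
  renv::restore(prompt = FALSE)
}
```

Whether the restore succeeds still depends on the locked versions being available from the repositories that CI ends up using.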

I'm not sure how to put this together best with the existing r-lib GitHub Actions, though. 😞

See r-lib/actions#702 (comment): the fact that the action is setting RENV_CONFIG_REPOS_OVERRIDE is driven by use-public-rspm being true; just set it to false and the action would (should) honor what is in the lockfile.

I think this is somewhat related to #1001 because {renv} is relying on the definition of the repo: lockfile parameter for guidance.

The implementation of RENV_CONFIG_REPOS_OVERRIDE allowed us to mask the CRAN repository with our own, but it came at the unfortunate cost of entirely ignoring CRAN as a potential source.

One thing I notice is that {renv} sets a FALLBACK repository in case there are no other repositories available:

renv/R/bootstrap.R

Lines 53 to 59 in 094c01d

default <- c(FALLBACK = "https://cloud.r-project.org")
extra <- getOption("renv.bootstrap.repos", default = default)
repos <- c(repos, extra)
# remove duplicates that might've snuck in
dupes <- duplicated(repos) | duplicated(names(repos))
repos[!dupes]

If somehow the user has options(repos = NULL), then the value for the repos would then end up as FALLBACK = "https://cloud.r-project.org" and then somewhere in the mechanism for updating the lockfile, {renv} knows that it needs to replace FALLBACK with CRAN.
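To make the behaviour of that excerpt easier to inspect, here is a stand-alone sketch that wraps the same lines in a function. The name `bootstrap_repos()` is made up for this example, and the sketch only shows what the excerpt computes, not how renv later rewrites the lockfile.

```r
# Stand-alone version of the bootstrap excerpt, wrapped in a function.
# `bootstrap_repos()` is a hypothetical name used only for this sketch.
bootstrap_repos <- function(repos) {
  default <- c(FALLBACK = "https://cloud.r-project.org")
  extra <- getOption("renv.bootstrap.repos", default = default)
  repos <- c(repos, extra)
  # remove duplicates that might've snuck in
  dupes <- duplicated(repos) | duplicated(names(repos))
  repos[!dupes]
}

# No repositories configured: only the fallback remains.
no_repos <- bootstrap_repos(NULL)
print(no_repos)

# A repository configured under the name CRAN: the fallback is appended.
with_ppm <- bootstrap_repos(c(CRAN = "https://packagemanager.posit.co/cran/latest"))
print(with_ppm)
```

The results assume the `renv.bootstrap.repos` option is unset, so the default fallback is used.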

I think a solution might be to either:

  1. detect RSPM in the CRAN override and set CRAN as a fallback OR
  2. allow users to additionally specify a FALLBACK repo along with REPOS_OVERRIDE OR
  3. allow users to set a REPO_ALIAS var that will allow a list of repositories to be treated identically.

I would be happy to submit a PR implementing one of these if they are the right options.

Some other thoughts:

  • Support multiple repositories declared in RENV_CONFIG_REPOS_OVERRIDE, perhaps using ; as a separator, or detecting and parsing JSON like { "RSPM": "...", "CRAN": "..." }.
  • Make it easier to set repositories via the R option; e.g. I'm guessing there's a way in GitHub actions to write options(renv.config.repos.override = <...>) into a .Rprofile?
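As a rough illustration of the `;`-separated idea, the sketch below parses a string of `NAME=URL` entries into a named character vector. The `NAME=URL` entry format and the function name `parse_repos_override()` are assumptions made for this example; this is not something renv is known to accept.

```r
# Hypothetical: parse "RSPM=https://...;CRAN=https://..." into a named
# character vector of the shape options(repos = ...) expects.
# Illustrates the proposal only; not an existing renv feature.
parse_repos_override <- function(value) {
  entries <- strsplit(value, ";", fixed = TRUE)[[1]]
  entries <- trimws(entries)
  entries <- entries[nzchar(entries)]
  # Split each entry at the first "=" only, since URLs may contain "=".
  eq <- as.integer(regexpr("=", entries, fixed = TRUE))
  if (any(eq < 0)) stop("each entry must look like NAME=URL")
  urls <- substring(entries, eq + 1)
  names(urls) <- substring(entries, 1, eq - 1)
  urls
}

value <- "RSPM=https://packagemanager.posit.co/cran/latest;CRAN=https://cloud.r-project.org"
parsed <- parse_repos_override(value)
print(parsed)
```

Keeping the order of entries means the first repository listed would be tried first, with later ones acting as fallbacks.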

I'm wondering how this would interact with something like #1001, and whether it might be better to introduce something like RENV_CONFIG_REPOS_EQUIV where you can declare repositories as equivalent: {"name": "CRAN", "RSPM": "...", "CRAN": "...", "siloed-backup": "..."}. This way, you could list all of the repositories that are equivalent to each other and still have a separate repository (e.g. drat or r-universe) that contains packages that are explicitly NOT on CRAN, which could look something like this in the .Rprofile:

options(repos = c(RSPM = ...,
  "siloed-backup" = ...,
  CRAN = ...,
  universe = "https://zkamvar.r-universe.dev"))
options(renv.config.repos.equiv = c(name = "CRAN", 
  RSPM = ...,
  "siloed-backup" = ...,
  CRAN = ...))

and this would have the effect of renv understanding that anything coming from RSPM, siloed-backup, or CRAN would be labelled as "Repository" : "CRAN" in the lockfile, but something coming from the universe would be "Repository" : "universe".
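The labelling rule described here can be sketched in a few lines. Everything in this block is hypothetical, including the function name `lockfile_label()`; it only illustrates the proposed mapping and does not correspond to an existing renv option.

```r
# Hypothetical sketch of the "equivalent repositories" idea: any repository
# named in `equiv` is recorded under one canonical label, and any other
# repository keeps its own name.
lockfile_label <- function(repo_name, equiv, canonical = "CRAN") {
  ifelse(repo_name %in% equiv, canonical, repo_name)
}

equiv <- c("RSPM", "siloed-backup", "CRAN")
labels <- lockfile_label(c("RSPM", "siloed-backup", "CRAN", "universe"), equiv)
print(labels)
```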

@kevinushey, @zkamvar, @majazaloznik, I am just curious if all this discussion is still relevant in the context of GHA workflows using setup-r with use-public-rspm: false. This should in principle let renv honor what it locked, which should work provided the lockfile is somewhat consistent, i.e. all the locked package versions are installable from the locked repos (and also for the locked R version I suppose). If no such "consistency" is guaranteed (I understand the OS can play a big role here), then I guess there is room for improvement in renv itself.

As for arbitrarily not honoring the repos in the lockfile in GH Actions, I think this is in general very specific and should be an explicit choice, beyond the use of use-public-rspm: true (which, if you ask me, should never be used with renv by default). For this, maybe a good solution is to expose the repos argument of renv::restore() ("When set, this will override any repositories declared in the lockfile.") as input to r-lib/actions/setup-renv, something like

- uses: r-lib/actions/setup-renv@v2
  with:
    repos: 'c("siloed-backup" = ..., CRAN = ..., universe = "https://zkamvar.r-universe.dev")'

to be then used in https://github.com/r-lib/actions/blob/33f03a860e4659235eb60a4d87ebc0b2ea65f722/setup-renv/action.yaml#L44

Otherwise, support for multi-repo RENV_CONFIG_REPOS_OVERRIDE as suggested would also be a valuable alternative, with the advantage of allowing this to be set for the whole workflow, perhaps even done by r-lib/actions/setup-renv based on an exposed input similar to the above.

One small remark on the FALLBACK (#1147 (comment)): This only affects where renv is installed from at bootstrap/activation time.

Not sure if this is related but my GH Actions often fail to install renv and I don't really see why:

Run install.packages("renv")
# Bootstrapping renv 0.17.2 --------------------------------------------------
* Downloading renv 0.17.2 ... FAILED
Error: Error in bootstrap(version, libpath) : failed to download renv 0.17.2
Calls: source ... eval.parent -> eval -> eval -> eval -> eval -> bootstrap
Execution halted
Error: Process completed with exit code 1.

@black-snow I think this is because you are using RSPM (line 34 of your workflow file has use-public-rspm: true) and the Posit/RStudio Package Manager (as far as I can work out) sometimes falls a few days behind CRAN, i.e., version 0.17.2 isn't there yet, so that's why your workflow fails.

According to https://packagemanager.posit.co/client/#/repos/2/overview there was a snapshot taken on 17th March, but I guess at a time on that day just before 0.17.2 landed on CRAN, so RSPM has 0.17.1 at the moment.

If you comment out use-public-rspm: true or set use-public-rspm: false then you'll obtain the latest renv from CRAN. You can re-enable RSPM in your workflow once RSPM has been updated.
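One way to confirm this kind of lag is to compare the version each repository currently offers. The sketch below is only an illustration: `repo_version()` and `is_behind()` are helper names made up for this example, and the repository lookups need network access, so they are left commented out.

```r
# Look up the version of a package that a repository currently offers.
# Needs network access. `repo_version()` is a hypothetical helper name.
repo_version <- function(pkg, repo) {
  ap <- utils::available.packages(repos = repo)
  if (pkg %in% rownames(ap)) unname(ap[pkg, "Version"]) else NA_character_
}

# TRUE when version `have` is older than version `want`.
# `is_behind()` is a hypothetical helper name.
is_behind <- function(have, want) {
  utils::compareVersion(have, want) < 0
}

# The situation described above: the repository has 0.17.1, CI wants 0.17.2.
print(is_behind("0.17.1", "0.17.2"))

# On a machine with network access:
# repo_version("renv", "https://packagemanager.posit.co/cran/latest")
# repo_version("renv", "https://cloud.r-project.org")
```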

@remlapmot thanks! I didn't know that / notice. I'll give it a spin.

RSPM was updated yesterday, so I think it should have renv 0.17.2 now.

[Screenshot 2023-03-21 at 09 26 41]

I think the issues in this thread appear to have been largely resolved; if not, please file a new issue.