Better handling for cached 'node_modules' folder

Question

Better handling for cached 'node_modules' folder

leethree opened this issue 8 years ago · 8 comments

Use case

We are using patch-package to fix some nasty dependency issues, it works great during development. But it sometimes causes CircleCI failures when a partially patched node_modules folder is cached by CI.

The issue

Install patch-package and add it as prepare script, as instructed by README.
Patch module foo-bar using patch-package foo-bar. Commit the resulted patches/foo-bar+X.X.X.patch to repo.
When CI is run, CircleCI installs and patches the node_modules folder and saves the result to cache.
Modify module foo-bar and regenerate patch file.
CircleCI restores the previous cached node_modules folder (which has incorporated previous version of the patch).
CircleCI fails when applying new patch to node_modules:

**ERROR** Failed to apply patch for package foo-bar

  This error was caused because Git cannot apply the following patch file:

    patches/foo-bar+X.X.X.patch

  This is usually caused by inconsistent whitespace in the patch file.

Proposed solutions

Add a checksum file for the patches folder so we can easily check if patches has been changed before restoring cache from CI.
Or, add a operation that can revert applied patches, such as patch-package revert. So we can revert the patches and save the clean node_modules folder to cache.

Answer 1 · 2018-01-04T13:44:26.000Z

Hey, thanks for the solid issue report! I see the problem, and so far I like the second solution you proposed. Not sure I understand exactly how the first would work, could you detail that for me?

In the second case, would the revert command be called after everything else in the CI sequence, as a kind of teardown operation? I'm not familiar with how CircleCI does caching.

Answer 2 · 2018-01-04T15:32:59.000Z

Here's a snippet of our current CircleCI steps:

      - restore_cache:
          key: node-modules-cache-{{ checksum "yarn.lock" }}
      - run: yarn install
      - run: yarn test  # Run tests
      - save_cache:
          key: node-modules-cache-{{ checksum "yarn.lock" }}
          paths:
            - ./node_modules

Note the checksum part. The cache key includes the checksum of yarn.lock file. So CI will restore cache from previous builds only when yarn.lock is unchanged. But there's no easy way to check if the patches have been changed.

In the first proposed solution, we could change the cache key to something like:

node-modules-cache-{{ checksum "yarn.lock" }}-{{ checksum "patches/lock" }}

Then the cache will miss and CI will rebuild node_modules when patches/lock file has been changed.

In the second proposed solution, we could simply add a step to revert the patches before saving cache:

      - restore_cache:
          key: node-modules-cache-{{ checksum "yarn.lock" }}
      - run: yarn install
      - run: yarn test  # Run tests
      - run: yarn patch-package revert  # <-- Add this line
      - save_cache:
          key: node-modules-cache-{{ checksum "yarn.lock" }}
          paths:
            - ./node_modules

So the cache does not contains patches.

Answer 3 · 2018-01-25T21:18:12.000Z

Apologies for the delay in getting back to this.

For the checksum thing, I see how that works now and it seems like a good solution to me, but I'm not sure that it makes sense for patch-package to provide as a feature, since it's specifically related to working around expressivity issues with CircleCI's cache configuration. It seems like it should be possible to find a solution within the circle build by generating the lock file before restoring the cache with a command like cat patches/**/*.patch | md5 > patches/lock

If that's not possible for some reason then the second solution is definitely doable. Without patch-package's help you could apply the patches in reverse using the unix patch --reverse command. I think it makes sense to add this to patch-package's api though. It would be quite cheap to implement, test, and maintain, so if it helps just one or two people out it'll be worth it.

Answer 4 · 2018-02-01T09:49:29.000Z

Thanks for you reply. reverse would be very helpful.

Answer 5 · 2018-02-02T18:54:18.000Z

+1, same issue here with CircleCI. reverse seems like a neat workaround, although a lockfile would be more natural imo.

Answer 6 · 2018-02-09T11:38:45.000Z

Hi peeps. I just published v5.1.0 of patch-package which has a --reverse option. Thanks for your patience. 🙇

Answer 7 · 2019-10-16T14:33:14.000Z

I just ran into the exact same issue. Unfortunately I don't think --reverse can help in our case—we don't want to revert the patches because the version of node_modules at the end of the build will be cached but also it is what will be used to run the application, so the patches need to be applied.

I'm not sure there's any solution to this problem, but at the very least I hope we can improve the error message, so it's clear the patch failed because it's being applied to something that has already been modified. The patch could carry a hash of the original file, and then when the patch is applied, if the hash of the target file doesn't match the original, the patch would fail with an error along the lines of "target has unexpected hash". /cc @ds300

Answer 8 · 2020-03-17T16:38:13.000Z

Unfortunately I don't think --reverse can help in our case—we don't want to revert the patches because the version of node_modules at the end of the build will be cached but also it is what will be used to run the application, so the patches need to be applied.

I was able to use --reverse in the end. I just have to re-apply the patches after the build has finished but before the application is started.