JamieMason/shrinkpack

Running npm install against a shrinkpacked project pushes the local file resolution into the npm cache.

marxian opened this issue ยท 31 comments

Expected behaviour

It should be possible to rerun shrinkpack if new dependencies have been added to an npm-shrinkwrap file.
...

Actual behaviour

shrinkpack detects that some of the dependencies resolve to tarballs in npm_shrinkwrap/ and exits explaining that it has nothing to do.

I think this is the commit containing the faulty logic: 409062c
...

Steps to reproduce

shrinkpack a project
add a new dependency
try to shrinkpack it again
...

Software versions used

OS         :  Linux
node.js    :  v4.2.4
npm        :  v3.8.6
shrinkpack : 0.10.0

Log output when running shrinkpack

! npm-shrinkwrap.json is already shrinkpacked, update it using npm shrinkwrap --dev then try again

Contents of your npm-shrinkwrap.json or package.json

N/A

Thanks a lot for this, will pick it up on the weekend.

I just want to check I understand correctly, are these the steps you're following?

  1. git clone your/project
  2. npm install
  3. npm shrinkwrap --dev
  4. shrinkpack
  5. npm install something-new
  6. shrinkpack

If so, I think the error is correct and these steps should work.

  1. git clone your/project
  2. npm install
  3. npm shrinkwrap --dev
  4. shrinkpack
  5. npm install something-new
  6. npm shrinkwrap --dev this is the missing step I think
  7. shrinkpack

I'll update the docs to highlight this gotcha but, for now at least, I think it will be necessary that we regenerate npm-shrinkwrap.json before each run of shrinkpack.

This is because having an npm-shrinkwrap.json affects how npm resolves packages (it's this that makes shrinkpack work), but in order to resolve the original tarballs from the registry, any previous runs of shrinkpack need to be replaced.

lime commented

We may have stumbled upon a situation where running npm shrinkwrap still leaves you with a npm-shrinkwrap.json that contains node_shrinkwrap/ (causing shrinkpack to exit):

In a shrinkpacked project, if I empty out my npm cache and install, any later shrinkwrapping will reference file:node_shrinkwrap/... instead of https://registry.npmjs.org/...:

{
  "dependencies": {
    "accounting": {
      "resolved": "./node_shrinkwrap/accounting-0.4.1.tgz",
      "version": "0.4.1"
    }
}
npm cache clean
npm install
npm shrinkwrap --dev
{
  "dependencies": {
    "accounting": {
      "resolved": "file:node_shrinkwrap/accounting-0.4.1.tgz",
      "version": "0.4.1"
    }
}

This is on node 4.4.0 and npm 2.14.20.

This was actually the backstory behind #39, as we were unable to run shrinkpack without first manually removing the offending resolved fields first.

Great, seems that maybe the first time something is resolved by npm, it keeps the reference - including file paths rather than just http.

Nice spot, will need to have a think how best to tackle that and at what times it's needed.

@lime has it.

Running npm install against a shrinkpacked project pushes the local file resolution into the npm cache. This affects later resolution of the same module, even in different, non-shrinkpacked projects.

Here's an attempt at demonstrating the issue, sorry if it is hard to follow:

npm cache clean
mkdir test && cd test && npm init -f
npm install --save leftpad

grep -oP \"_resolved\":\".*?\" ~/.npm/leftpad/0.0.0/package/package.json
"_resolved":"https://registry.npmjs.org/leftpad/-/leftpad-0.0.0.tgz"

npm shinkwrap
shrinkpack
npm install

grep -oP \"_resolved\":\".*?\" ~/.npm/leftpad/0.0.0/package/package.json
"_resolved":"file:node_shrinkwrap/leftpad-0.0.0.tgz"

cd .. && mkdir test2 && npm init -f
npm install --save leftpad
npm shrinkwrap
cat npm-shrinkwrap.json
{
  "name": "test2",
  "version": "1.0.0",
  "dependencies": {
    "leftpad": {
      "version": "0.0.0",
      "from": "leftpad@latest",
      "resolved": "file:node_shrinkwrap/leftpad-0.0.0.tgz"
    }
  }
}

@lime, @marxian โ€“ this is really useful. I think the next steps would be to try and find out more about where npm keeps this information so we can protect/repair it when this happens.

I'll see what I can find, thanks so much for digging out all this info.

I just got bit this myself today. I'd like to help if I can, but I'm not sure I understand exactly what is going on. If someone could provide another explanation I will look into this myself as well. Thanks.

I just ran into this as well. I had a roundabout way of getting around it:

rm -r node_modules npm-shrinkwrap.json
npm cache clear
npm install
npm shrinkwrap
npm install --save new-package
shrinkpack

If @lime's change produces the same end result as those steps, that'd be a much better workaround until this is fixed.

@lime, @marxian โ€“ can you see what I might be doing wrong here? I'm trying to reproduce the issue where a shrinkpacked tarball path leaks into the npm cache. I've seen it happen before, but can't get it to happen. The following is based on @marxian's example, but am I missing something?

#!/usr/bin/env bash

npm cache clean
mkdir test && cd test && npm init -f
npm install --save leftpad
npm shrinkwrap
cat npm-shrinkwrap.json
shrinkpack
cat npm-shrinkwrap.json
npm install
cd ..
mkdir test2 && cd test2 && npm init -f
npm install --save leftpad
npm shrinkwrap
cat npm-shrinkwrap.json

I saved this as reproduce.sh and ran chmod +x reproduce.sh, then ran it with ./reproduce.sh.

lime commented

Interesting. My first thought was that you might need to do something like rm -r node_modules and npm cache clean before the npm install on line 10, to simulate e.g. someone else checking out a shrinkpacked repo and installing it the first time.

However, I ran your script verbatim, and got the expected result:

[...]
leftpad@0.0.0 node_modules/leftpad
wrote npm-shrinkwrap.json
{
  "name": "test2",
  "version": "1.0.0",
  "dependencies": {
    "leftpad": {
      "version": "0.0.0",
      "from": "leftpad@latest",
      "resolved": "file:node_shrinkwrap/leftpad-0.0.0.tgz"
    }
  }
}

Did you get the registry URL in the final output? If so, there must be some differences in our setups. I'm on npm 2.15.3, node 4.4.0 and shrinkpack 0.12.3.

lime commented

Well, with npm 3.8.7 I get "resolved": "https://registry.npmjs.org/leftpad/-/leftpad-0.0.0.tgz", so that probably explains it.

I'll see if I can reproduce the issue on npm 3.

lime commented

Looks like I was on the right track; if you clean the cache and remove node_modules in-between, you should get the file: version on npm 3 as well.

#!/usr/bin/env bash -ex

npm --version
npm cache clean
mkdir test && cd test && npm init -f
npm install --save leftpad
npm shrinkwrap
cat npm-shrinkwrap.json
shrinkpack
cat npm-shrinkwrap.json
npm cache clean
rm -r node_modules
npm install
cat npm-shrinkwrap.json
cd ..
mkdir test2 && cd test2 && npm init -f
npm install --save leftpad
npm shrinkwrap
cat npm-shrinkwrap.json

The script @lime provides above reproduces the issue for me under both:

node 4.4.3 + npm 2.5.1

and

node 5.4.0 + npm 3.8.8

Thanks @marxian, @lime โ€“ I've been on holiday but will pick this up again soon.

I had an idea recently which might help.

You can configure where npm keeps it's cache, so it might be possible to avoid this issue by having shrinkpack set a different cache location while it's running, then restore it afterwards.

The theory is that this could keep the general use npm cache from being polluted.

What do you think @marxian, @lime, @rmacklin, @jonahx?

lime commented

That sounds like it could help. However, as far as I can see, the cache gets polluted at the time of running npm install against the shrinkpacked repo. That's outside the lifetime of the shrinkpack command, so we're not in a good position to change which cache dir gets used. Hm.

lime commented

Of course, we could instruct users to set a project-specific cache. It might even be enough to use the same, separate cache for all shrinkpack projects. It feels really clunky, but could be an ok workaround for those who face this issue.

Similarly, shrinkpacked projects might be ok with not using the cache at all. Maybe that can be achieved by setting cache-max to 0, or cache to /dev/null...? ๐Ÿ˜›

Still, it would be great if we could workaround this issue without users having to add shrinkpack-specific configuration to their package.json.

Any progress on this? I was testing this project out and immediately ran into this situation.

npm install
npm shrinkwrap
# npm-shrinkwrap.json 'from' points to 'package@version'
shrinkpack
rm -rf node_modules
npm install
npm shrinkwrap
# npm-shrinkwrap.json 'from' points to 'node_shrinkwrap/package.version.tgz'
lime commented

@maxkostow I believe all the progress we have made is visible in this issue. As mentioned above, a potential workaround might be found by fiddling with the cache & cache-max settings in package.json, but I haven't tried it out myself.

I'll add a section to the README once I confirm the following is 100% correct, but as @lime said, the fix should be to override the npm cache location and expiry time for project's using shrinkpack. This is so that the main npm cache isn't updated to contain an entry from your own file system.

This isn't a shrinkpack issue per se, but an example of what happens if you install a package from somewhere on your file system using npm, deleting that file, then installing the same package/version pair on some other project. npm will remember that it saw that package/version pair at that path on your file system and blindly trust it will be there, in the same way it should expect it to be if it were coming from a registry.

An example .npmrc in the root of your project;

save-exact=true
cache=node_cache
cache-max=10
cache-min=10

With node_cache being added to your .gitignore.

/dev/null could also work in theory, but I noticed npm cache ls threw errors when I tried that.

Unless I misunderstand the solution, that moves the problem from the global npm cache to a local npm cache. npm installs within the project will still be affected by the issue.

If the project is using shrinkpack, the file paths in that cache should resolve. If you were to delete the tarballs and the npm-shrinkwrap.json as well, as part of a kind of hard reset of your dependencies, you're right, it could happen then.

As part of that reset process you would also need to delete the local project npm cache.

It's still an npm gotcha we have to live with, but is hopefully more predictable than unrelated projects being affected.

I don't know, this is the best I can think of so far. Hopefully we can figure out a better workaround but it is an npmism ultimately, so is tricky to encode into the shrinkpack tool.

Any thoughts all?

I did misunderstand, my apologies. I was conflating the contents of the npm-shrinkwrap.json with the contents of the cache. It's not a problem that on subsequent npm shrinkwraps the node_shrinkwrap files get written to the npm-shrinkwrap.json and if you are using a local cache, node_shrinkwrap will not be written to the global cache.

lime commented

I tried this out using the reproduction script above. Setting a project-specific cache does indeed seem to help, while setting both cache-min and cache-max to zero had no effect at all.

I can confirm that the minimum setup required is to add an .npmrc with cache set to a project-specific directory, and .gitignore that directory.

# .npmrc
cache=node_cache
# .gitignore
/node_cache

Personally, I think I'm going to stuff it in tmp/npm_cache. :)

And yes, it seems like this is the best we can do at the moment. It's frustrating that there's a need for this kind of manual setup, but it's difficult to do anything about it on the shrinkpack side.

I tried creating a preinstall script that would generate the .npmrc before install (and delete during postinstall) but I think npm loads the config at startup and does not reread the new .npmrc on subsequent commands so couldn't have a transient "cache".

I also ran into issues with the "callback called more than once" npm error when running on CI so I will be sticking with npm_lazy on CI for now.

Hopefully we will be able to reliably commit deps with npm@4 someday...

I also ran into issues with the "callback called more than once" npm error when running on CI

Please can you raise an issue for this? i can try and fix it, thanks,

Please can you raise an issue for this? i can try and fix it, thanks,

It's during npm install so I don't believe it has anything to do with the shrinkpack itself.

OK no worries, thanks.

@JamieMason Would you accept a PR (from myself or someone else) to the readme for the time being with a warning about the npm client's cache bleeding between projects, and the current work-around?

of course, thanks @DrewML.

I'm closing this as there's some detail added to the README now, but please feel free to reopen and comment if more should be done.