npm/cli

[FEATURE] Do not remove node_modules on npm ci

Zenuka opened this issue · 88 comments

What / Why

I would really like to see a flag like npm ci --keep to do an incremental update on our build server, as this would speed up our deployments a lot. This has been suggested before on GitHub and in the community. The last update, on the 7th of October, said that this was being reviewed by the CLI team. Could someone post an update on this? :-)

This is not what ci / cleaninstall is meant to do. The current behaviour is correct. What you want to use is npm shrinkwrap.

We added an update so that npm ci avoids deleting the node_modules folder itself and removes only its contents (as originally requested on that post). The purpose of the npm ci command is to delete everything and start from a clean slate. If you want to keep your old node_modules, what you need is npm i.

Thanks for your replies! Sorry for my late reply. I've looked at npm shrinkwrap but is this intended to run on our build server for continuous integration? When running this command it renames my package-lock.json to npm-shrinkwrap.json but what should I then run during CI? Just npm install to have an incremental update? Or should I run npm ci but that will delete all packages again :-( What I'm looking for is a command that does an incremental update but will install exactly what is in our package-lock.json

@claudiahdz; My understanding is that running npm install during CI will update the package-lock.json and that could mean that running the same build a couple of weeks later would install different packages. Is that incorrect?

P.s. I thought npm ci was short for Continuous Integration

As referenced here: npm/npm#20104 (comment)

The current behavior is problematic if you're using npm ci inside of a Docker container (which is quite common for Continuous Integration) and you have a bind mount on node_modules

It causes the following error:

webpack_1   | npm ERR! path /var/www/project/docker-config/webpack-dev-devmode/node_modules
webpack_1   | npm ERR! code EBUSY
webpack_1   | npm ERR! errno -16
webpack_1   | npm ERR! syscall rmdir
webpack_1   | npm ERR! EBUSY: resource busy or locked, rmdir '/var/www/project/docker-config/webpack-dev-devmode/node_modules'

which then results in aborting the Docker container.

It'd be lovely to have a --no-delete flag or if npm ci could delete the contents of node_modules but not the directory itself.

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

It'd be lovely to have a --no-delete flag or if npm ci could delete the contents of node_modules but not the directory itself.

rm -rf node_modules/* && npm i
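
One caveat with the rm -rf node_modules/* workaround: in most shells the * glob does not match dotfiles, so entries such as node_modules/.bin can be left behind. A minimal sketch of a helper that empties the directory while keeping the directory itself (and therefore any bind mount on it) might look like the following. The name empty_dir is hypothetical, and the sketch assumes a find that supports -mindepth and -delete (GNU and BSD find both do).

```shell
#!/bin/sh
# Remove everything inside a directory while leaving the directory itself
# in place, so a bind mount (or a sync-ignore rule) attached to it survives.
# Unlike `rm -rf dir/*`, this also removes dotfiles such as node_modules/.bin.
empty_dir() {
    dir=$1
    # Refuse to run on an empty argument or on something that is not a directory.
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # -mindepth 1 skips the directory itself; -delete removes depth-first.
    find "$dir" -mindepth 1 -delete
}
```

It could then be used as empty_dir node_modules && npm i.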

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

...because: https://docs.npmjs.com/cli/ci.html

This command is similar to npm-install, except it’s meant to be used in automated environments such as test platforms, continuous integration, and deployment – or any situation where you want to make sure you’re doing a clean install of your dependencies. It can be significantly faster than a regular npm install by skipping certain user-oriented features. It is also more strict than a regular install, which can help catch errors or inconsistencies caused by the incrementally-installed local environments of most npm users.

The faster installs and clean slate approach make this ideal for CI environments such as the one I mentioned above.

rm -rf node_modules/* && npm i

This is what I do now, but see above for the desire to use npm ci

It seems reasonable to me to file an RFC asking for a config flag that makes npm ci remove the contents of node_modules and not the dir itself. This is also an issue for me, in that i've set up Dropbox to selectively ignore a node_modules dir, but if i delete it, that selective setting goes away, and the next time node_modules is created, it syncs.

It seems reasonable to me to file an RFC asking for a config flag that makes npm ci remove the contents of node_modules and not the dir itself. This is also an issue for me, in that i've set up Dropbox to selectively ignore a node_modules dir, but if i delete it, that selective setting goes away, and the next time node_modules is created, it syncs.

Isn't this also what another issue described, to allow npm to create files to ignore the dir (for OSX spotlight and others)? I think there were also others who need this feature.

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

npm i would be great, but only if it didn't change the lock file. I've seen package-lock.json being updated during an npm i; or should that not happen?

I support this feature. As stated, npm i modifies package-lock.json. A flag would be the ideal solution.

same, a flag would be great

I support this feature. As stated, npm i modifies package-lock.json. A flag would be the ideal solution.

Why not add a flag for npm i then? Because this would not make much sense for ci = clean install, in my view.

What part of "clean install" is incompatible with keeping the parent node_modules/ directory intact (while doing a clean install of the actual contents)?

I realize that CI doesn't stand for Continuous Integration in this case; but a Clean Install is often quite useful in a Continuous Integration environment, as the documentation makes clear.

This command is similar to npm-install, except it’s meant to be used in automated environments such as test platforms, continuous integration, and deployment – or any situation where you want to make sure you’re doing a clean install of your dependencies. It can be significantly faster than a regular npm install by skipping certain user-oriented features. It is also more strict than a regular install, which can help catch errors or inconsistencies caused by the incrementally-installed local environments of most npm users.

npm ci is specifically meant to be used in automated environments; many times this means a Docker-based setup.

The behavior of deleting the node_modules/ directory is troublesome in a Docker-based setup, for the reasons mentioned in this thread.

So we're asking for an option that will make this command useful for its intended purpose and environment.

I support this feature. As stated, npm i modifies package-lock.json. A flag would be the ideal solution.

Why not add a flag for npm i then? Because this would not make much sense for ci = clean install, in my view.

I have to ask: are there any other differences between npm install and npm ci? If not, why aren't both options available in npm install? Maybe ci needs to become some alias like npm install --no-update-package-lock --clean-node-modules

The behavior of deleting the node_modules/ directory is troublesome in a Docker-based setup, for the reasons mentioned in this thread.

In my opinion this should only happen once when the image is built. After that npm i should be used during development.

maybe ci needs to become some alias like npm install --no-update-package-lock --clean-node-modules

Personally that makes more sense to me, additional flags for the normal npm i command.

I'm indifferent to which, and honestly too much of a n00b with js land to have a concrete argument that it must be ci, all I know is that it should not update the package-lock.json and should not remove node_modules

npm ci does not update the lockfile; it installs from the lockfile. It was introduced to do a clean install, because before that people were advised to rm -rf node_modules and run npm i again. And afaik people wanted it to install from the lockfile without changing it.

So npm ci was born. And it also skips some things like the list of the installed packages and the tree and a few more things.

See https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable

It covers a specific use case.

For other use cases we should add new flags to npm i, with which we can also emulate npm ci. That is a more flexible and better solution than flags for npm ci, which should still cover only the current use case imho. What users request here is a bit similar to yarn install --frozen-lockfile or yarn --frozen-lockfile.

Otherwise flags are spread over npm ci, npm i and so on, which makes things a bit more difficult (documentation, code, ...). At least this is what I think. Let's put it in npm i to have more powerful and flexible ways to configure its behavior.

For other use cases we should add new flags to npm i, with which we can also emulate npm ci. That is a more flexible and better solution than flags for npm ci, which should still cover only the current use case imho. What users request here is a bit similar to yarn install --frozen-lockfile or yarn --frozen-lockfile.

I'd be very happy if the feature was added to npm i. Should I update the original post?

npm ci does not update the lockfile; it installs from the lockfile. It was introduced to do a clean install, because before that people were advised to rm -rf node_modules and run npm i again. And afaik people wanted it to install from the lockfile without changing it.

So npm ci was born. And it also skips some things like the list of the installed packages and the tree and a few more things.

See https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable

It covers a specific use case.

For other use cases we should add new flags to npm i, with which we can also emulate npm ci. That is a more flexible and better solution than flags for npm ci, which should still cover only the current use case imho. What users request here is a bit similar to yarn install --frozen-lockfile or yarn --frozen-lockfile.

How does rm -rf node_modules/* not qualify as "cleaning"? The feature asked for here is very similar to the one present in npm ci. In my opinion it makes more sense to add a flag to npm ci so it uses rm -rf node_modules/* instead of rm -rf node_modules, rather than importing the entire behavior of npm ci into npm i.

BTW this issue should get more attention, and maintainers should voice their opinions and plans about it. Docker is used in basically every CI (continuous integration) setup, which is one of the main use cases of npm ci!

Please open an RFC for this change, rather than an issue in this repo.

To avoid confusion, I’d rename this issue as “empty instead of remove node_modules dir in npm CI”

My intention with this issue was never to delete the node_modules folder or only its contents. It was always to preserve the contents of node_modules but make sure they are up to date and in sync with package-lock.json. So: an incremental update which adheres to the package-lock.json.

Maybe I'm wrong but I feel there are two issues here. Maybe someone could start another issue or RFC about deleting only contents of node_modules instead of deleting the folder completely? Or am I missing something?

@Zenuka the entire reason npm CI is fast, and exists, is because it ignores the existing node_modules dir, so it’s pretty unlikely that will change.

In our use case, I think it would be faster just to check if the node_modules folder is up to date or not. And if it's not, only update the packages that should be updated (like npm i does). I have some dedicated VMs running as build agents, so running a build and keeping the node_modules folder and all its contents should be faster than deleting everything and re-installing it. We run our build and tests for code changes a lot more often than for changes to our package.json or package-lock.json.

In our use case, I think it would be faster just to check if the node_modules folder is up to date or not.

Well, this (the calculation of the package tree) is what takes the most time. This check would make npm ci really slow.

running a build and keeping the node_modules folder and all its contents should be faster than deleting everything and re-installing it.

Probably not, that's why npm ci was introduced which skips what npm i does (check the package tree).

@Zenuka npm install already is the fastest possible way to do what you want. npm ci has only one purpose: do it faster, by deleting node_modules so it doesn't have to compute a diff.

Probably not, that's why npm ci was introduced which skips what npm i does (check the package tree).

I've tested this only on my machine (which is of course not a good measure) but running npm install on an up-to-date node_modules folder finishes within 10 seconds. Running npm ci takes minutes. Would you expect different results?

I'm a fan of your suggestion to add a flag to freeze the lock file with npm install.

Verifying that what’s in package-lock.json is actually present is super fast, even on Windows. See https://github.com/fuzzykiller/verify-node-modules.

Verifying that nothing else is present in node_modules would certainly take a little longer but probably still less than a second.

On this basis, an incremental version of npm ci could easily be created. The tree is already calculated and saved to package-lock.json, after all.

Also, basically the only reason npm ci exists is to install what’s in package-lock.json. Without sneaking in surprise upgrades, like npm install does.

Just my 2 cents: I personally switched our infra over to npm ci, as I was also sick of npm i not adhering to the lock file when deploying an old tag. So if it's seriously that big of an issue to add the flag at the npm ci level (which I get; it's a clean install, it's doing what it's told), then npm i REALLY needs this flag. But I remember researching this, and there was also an issue thread on npm i that was over 2 years old (and still open) where the npm team suggested people use npm ci, lol. This is kind of why people have given up on npm in the past and just gone to yarn.

Again, just another dev's perspective.

I put my vote for adding the possibility to keep the modules ➕.

+1 here - as @phyzical and @fuzzykiller said, there's no "sweet spot" between npm install and npm ci that will KEEP node_modules, but still respect package-lock.json and run faster.
Just run as fast as possible - look for dependencies from package-lock that already exist in node_modules, and then install everything else missing.. no upgrades, no deleting.

Personally I don't care which one it is (install or ci) that would have this, but all of this sounds like npm install should just have flags for everything and npm ci doesn't need to be a separate command.

This is somewhat frustrating, given that npm ci was originally touted as the solution to the same problem this issue is raising.

The original behavior that a number of people wanted for npm install was to look at the package-lock.json instead of package.json. We wanted a flag on npm install to turn that behavior on. What we got instead was npm ci, because:

the package.json describes the required dependencies of your project. If the current lock file cannot satisfy those, the lock file has to yield. The purpose of the lockfile is to create a repeatable installation across different machines, not to obsolete the package.json.

So, fine. npm install isn't the right place for that option, npm ci is. Except npm ci adds additional behaviors (clearing out the node_modules folder) that keep it from being a useful solution to the original problem. And the reason there can't be a flag on npm ci now is because:

ci = clean install

This is expected. Why don't you use the normal npm i with a lockfile?

Which... fine. I don't really care where the flag gets added. I don't have any stake in the underlying philosophy behind the interface. But could the flag please be added somewhere?

Heck, I wouldn't raise objections even if people wanted an entirely separate 3rd command, I couldn't care less. The only thing I care about is that 3 years after this conversation about respecting package-lock.json for normal installs got started, there's still no way to get the behavior that we were originally asking for.

At my workplace we've seen bugs from minor and bugfix version updates for packages. We really only want to be looking for those bugs during purposeful package upgrades, we don't want our dev environments to be using different package versions than our production environments. Consistency there is very important. Whatever anybody wants to call it or wherever anybody wants to put it, we want a fast way to get packages from the lockfile that also won't require us to sit through node-gyp builds for already-installed modules every time we run the command.

This is how I would like it to work in a perfect world :

  • npm install - same behavior as today
  • npm install --from-lockfile - install from the lockfile (like ci does)
  • npm install --clean - same behavior as npm install but delete the node_modules content
  • npm ci - an alias to npm install --from-lockfile --clean

@jdussouillez This is exactly what should happen. Very well said! I'd love to see this solution put in place.

It is consistently frustrating to run into this issue where we have to decide between speed and consistency for a CI pipeline. I've run into it 3 or 4 times for different reasons in the last 2 months alone.

This feature would be great for Azure Pipelines and other cloud architectures.

https://docs.microsoft.com/en-us/azure/devops/pipelines/release/caching?view=azure-devops#tip

Because npm ci deletes the node_modules folder to ensure that a consistent, repeatable set of modules is used, you should avoid caching node_modules when calling npm ci.

Closing: As @claudiahdz mentioned, we shipped a fix to this behavior where npm ci does not remove the node_modules folder itself anymore but only its contents (ref. https://github.com/npm/libcipm/blob/latest/CHANGELOG.md#407-2019-10-09). This was shipped in npm@6.14.7 back on July 21st (ref. https://github.com/npm/cli/blob/v6/CHANGELOG.md#6147-2020-07-21) & we've maintained the same experience in npm@7.

If you have a separate issue with npm ci or any other command, please use one of our issue templates to file a bug: https://github.com/npm/cli/issues/new/choose


Side notes...

@jdussouillez appreciate the feedback; In terms of installing directly from a lockfile - you can do that today with the flag --package-lock-only (ex. npm install --package-lock-only). In terms of adding a --clean flag to install, I don't feel like this adds much value but I could be wrong. If you feel strongly about it, we'd love to have you submit an RFC over at https://github.com/npm/rfcs

The comment made by @claudiahdz almost a year ago seems to be related with making sure the npm ci behavior is to delete the node_modules content, instead of the folder itself. Which is handy when mounting it into a docker container (for example), but still doesn't change the end result - npm ci will download all the dependencies from scratch.

Using npm install --package-lock-only seems to be doing the exact opposite of what the original issue is about (If I understand correctly) - it will only update the package-lock.json file, and will not download any dependencies.

What I understand from the original issue, is the need to have an option that gets a current state of the node_modules folder, and a package-lock.json file, and downloads only the required packages to get the node_modules versions to match the package-lock.json. So it will be much faster than downloading everything every time, with the same net result at the end.

Isn't that what npm install already always does?

Isn't that what npm install already always does?

AFAIK -
npm install will resolve all the dependencies according to the package.json file (ignoring the package-lock.json), compare them with what is currently in the node_modules folder, and download the dependencies that need to be downloaded to match the requirements. It will also update the package-lock.json accordingly.

It definitely does not ignore the lockfile - it just takes into account the existing tree, which npm ci does not.

You are correct, I am sorry.
I remembered incorrectly (maybe that was the behavior in the past?). I just did some testing with a simple dep tree, and when the package-lock.json file is present, npm i installs exactly the versions it specifies, and does not change anything. This was just the behavior I was looking for, so I'm happy with it. 👍
I apologize for posting on a closed issue.

My original request was indeed what ATGardner describes:

What I understand from the original issue, is the need to have an option that gets a current state of the node_modules folder, and a package-lock.json file, and downloads only the required packages to get the node_modules versions to match the package-lock.json. So it will be much faster than downloading everything every time, with the same net result at the end.

My experience with npm install is that it sometimes updates the package-lock.json file. I tested this again this morning with a repository which I hadn't updated in a while and ran git pull and npm i. It didn't actually update any versions this time, just added some dependencies and extra packages.
Unfortunately this is a private repository, but maybe someone else has a reproducible public repository, where there are multiple commits and switching between them causes npm install to update the package-lock.json?

I realize there could be some user error involved when not committing the package-lock.json when updating the package.json, but my colleagues know that they should update the package-lock.json as well. I'll look into this.

I couldn't get my simple example to have npm i change the package-lock.json file. But I will try it out some more.

If npm i always ends up downloading the exact same versions specified in the package-lock.json, while keeping as much as it can from the current node_modules, why would I ever need to run npm ci? what would be the benefit of deleting everything before downloading again?

I apologize again for this not being the place for this discussion. Is there anywhere else more preferable?

I still don't understand. If the state of node_modules after running npm i exactly matches the package-lock.json, and the state of node_modules after running npm ci has the exact same end result, then in almost all scenarios, assuming the computer you are building on already has some/most dependencies in the folder, wouldn't npm i be faster? It would simply not download what is already present locally and matches the required version.

Why would I rather delete and download everything from scratch?

No, npm ci is still faster, as it does not check the dependency tree again and skips some console output.

Why would I rather delete and download everything from scratch?

To prevent issues and ci is for specific environments like deployments.
I think the docs already mention the differences.

It can be significantly faster than a regular npm install by skipping certain user-oriented features. It is also more strict than a regular install, which can help catch errors or inconsistencies caused by the incrementally-installed local environments of most npm users.

See also https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable

npm ci is still faster.

So when using npm i, the time it takes to read the current node_modules, and figure out which packages should be downloaded, is significantly larger than the time it takes to actually download all the packages from npm's servers? I'd love to see an actual experiment that measures it.

And I also don't understand this paragraph -

npm ci bypasses a package’s package.json to install modules from a package’s lockfile. This ensures reproducible builds – you are getting exactly what you expect on every install.

Haven't we just concluded right here that running npm i uses the exact versions in the package-lock.json file, and the state of node_modules after the run is identical to the state it would be after running npm ci? So the builds will be just as reproducible.

UPDATE:

I have made the following test -
I created a new create-react-app project. After it completed its initialization, it had a package.json with 7 direct dependencies, and a package-lock.json that contained 1982 packages.
At this state (node_modules contains all dependencies) - running npm i takes

real    0m2.548s
user    0m2.659s
sys     0m0.182s

When I deleted a single package folder (node_modules/babel-eslint), and then ran npm i again, it took

real    0m3.295s
user    0m3.543s
sys     0m0.434s

to re-download the missing dependency

When I deleted the entire node_modules folder, and ran npm i again, it took

real    0m16.701s
user    0m19.251s
sys     0m10.379s

When I ran npm ci, it took

real    0m20.997s
user    0m23.844s
sys     0m14.857s

This did not differ by much when I tried removing a single package, or even deleting the entire node_modules folder manually before the call. It wasn't surprising, since npm ci starts by deleting the content of node_modules anyway.

After every run, I ran diff -q -r node_modules_orig/ node_modules/ to make sure the result is identical to the original dependencies. It always was.

So to conclude - it seems that using npm ci takes ~21 seconds on my machine, regardless of the current state of node_modules. Using npm i on a recently cloned project (no node_modules) takes ~17 seconds, and running it on a project that has no changed dependencies (the current node_modules matches the required dependencies) takes ~3 seconds.

So when would using npm ci be preferable? It doesn't seem faster (though of course, this is just a single test), and the end result is identical to npm i, so the subsequent build would be just as reliable.

npm ci is preferable when you need exactly what is in package-lock.json and nothing but. npm i does not guarantee that it will install exactly what is in package-lock.json. This is by design. While package-lock.json is an input to npm i, it is also an output.

I believe there are only a few cases left where npm i would install something different (and thus modify package-lock.json), like maybe package versions that were soft-deleted.

Back when npm ci was first introduced, npm i either ignored package-lock.json outright or at least was a lot more proactive at installing different versions.

Either way, it doesn’t really matter. npm ci is only OK when the node_modules folder doesn’t exist yet. Otherwise it is prohibitively slow, especially on Windows. So npm i simply needs a flag that guarantees it will not modify package-lock.json and install exactly what is inside package-lock.json.

I don’t see any point in further discussing the why and how. Either we’ll get it or we won’t. As is, npm ci sucks.

/update:
Here’s a repo where running npm i will change package-lock.json: https://github.com/fuzzykiller/npm-install-demo

Though the changes are only technical in nature, they’re still not acceptable.
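
Until such a flag exists, one way to at least detect the problem in CI is to snapshot the lockfile, run the install, and fail if the file changed. This is only a sketch: run_without_touching is a hypothetical helper, and it detects a modified lockfile after the fact rather than preventing the change.

```shell
#!/bin/sh
# Run a command and fail if it modified the given file.
# Usage: run_without_touching <file> <command> [args...]
# The body runs in a subshell so its variables and exit stay local.
run_without_touching() (
    file=$1
    shift
    snapshot=$(mktemp) || exit 1
    # Keep a copy of the file as it was before the command ran.
    cp "$file" "$snapshot" || { rm -f "$snapshot"; exit 1; }
    if "$@"; then
        status=0
    else
        status=$?
    fi
    # Only compare when the command itself succeeded.
    if [ "$status" -eq 0 ] && ! cmp -s "$file" "$snapshot"; then
        echo "error: $file was modified by: $*" >&2
        status=1
    fi
    rm -f "$snapshot"
    exit "$status"
)
```

For example: run_without_touching package-lock.json npm install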

Just to quickly reiterate:

  • npm ci always deletes the content of node_modules by design, which is undesirable for non-production builds because it's slow. However, it uses exact versions of packages found in package-lock.json, which is desirable for multiple situations.

  • npm install just updates the contents of node_modules, which is very performant, but by design it ignores the contents of package-lock.json if package.json version numbers differ, which is undesirable for multiple situations.

  • npm install --package-lock-only is described in the docs:

    The --package-lock-only argument will only update the package-lock.json, instead of checking node_modules and downloading dependencies.

    This does not seem useful for any of the scenarios described above.

What people have been asking for during the past 3 years:

  1. A command (anywhere) that will ignore package.json and only respect package-lock.json as the definitive source of what packages will be installed.

  2. That will not delete the entire contents of node_modules and re-download everything from scratch.

As far as I can see from both the docs and local testing, npm install satisfies point 2, but not 1. npm ci satisfies point 1, but not 2. npm install --package-lock-only satisfies none of those requirements.

I'm not completely sure why this issue has been closed, there's still no way to get the desired behavior.


Edit: To extend off of @fuzzykiller's example, it's not just that package-lock.json gets updated. That would be annoying, but it wouldn't break any of my builds. But if package.json has fuzzy dependencies listed anywhere, and a bugfix version of one of those dependencies gets released, it will get changed when I run npm install on a new machine. Suddenly I have install differences between two machines. We've run into bugs at my company from exactly this behavior; it's not just that the package-lock.json needs to be checked into Git again.

It is desirable in that situation to have a command that behaves like npm ci -- that makes a reproducible install based only on the contents of package-lock.json. However, deleting the contents of the node_modules folder slows down builds too much for some environments and situations, even though it's appropriate behavior for a final production build.

There could be a flag anywhere to address this problem. It could be npm install --from-lockfile. It could be npm ci --preserve-existing. But right now it seems like we're in a circle where anyone who asks for a flag to get added to npm install gets pointed at npm ci as the solution, and anyone who asks for a flag on npm ci gets pointed at npm install as the solution. This issue was closed pointing at npm install --package-lock-only, but that flag is almost the opposite of what people are asking for. It doesn't respect package-lock.json as the authoritative source, and it also doesn't update or install any of the dependencies in the node_modules folder :)

This issue should be reopened.

Let's imagine the following use case where I want to:

  1. build the application on CI,
  2. have fixed deps versions to keep things stable,
  3. cache node_modules between particular steps of CI pipeline (build/test/deploy) with package-lock.json as a cache key.

Given that particular CI jobs can be run on different runners, the CI cannot guarantee that cache will always be available.

The scenario above is quite popular, but npm doesn't support it: 1 is achievable with both npm i and npm ci (the latter is advised), 2 is achievable only with npm ci, but 3 can be done only with npm i.

@darcyclarke considering the above I believe this feature request shouldn't be closed as it makes sense to utilize existing node_modules AND have the npm ci features available.
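
A rough sketch of step 3 on a runner that keeps its workspace: store a checksum of package-lock.json inside node_modules after each install, and skip the install when the stored value still matches. The stamp file name .lockfile-checksum and the function names are assumptions for illustration, not an npm feature. The check only shows that the lockfile has not changed since the last stamp, not that node_modules is intact.

```shell
#!/bin/sh
# Print a checksum of a file, using whichever tool is available.
file_checksum() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d ' ' -f 1
    elif command -v shasum >/dev/null 2>&1; then
        shasum -a 256 "$1" | cut -d ' ' -f 1
    else
        cksum "$1" | cut -d ' ' -f 1
    fi
}

# Succeed only if node_modules carries a stamp matching the current lockfile.
# Must be run from the project root.
modules_match_lockfile() {
    stamp=node_modules/.lockfile-checksum
    [ -f package-lock.json ] && [ -f "$stamp" ] || return 1
    [ "$(cat "$stamp")" = "$(file_checksum package-lock.json)" ]
}

# Record the lockfile checksum; call this after a successful install.
stamp_modules() {
    mkdir -p node_modules
    file_checksum package-lock.json > node_modules/.lockfile-checksum
}
```

A CI step could then run: modules_match_lockfile || { npm ci && stamp_modules; }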

npm install changes package-lock.json, doesn't delete node_modules
npm ci doesn't change package-lock.json, deletes node_modules

There is NO way to do an install that doesn't change package-lock.json and doesn't delete node_modules.
This is yarn's default behavior.

See how many people are asking for this feature or are confused by this behavior: https://stackoverflow.com/questions/45022048/why-does-npm-install-rewrite-package-lock-json

It's been three years and there is still no solution, very disappointed.

@darcyclarke

There should be a way to run npm ci -flag or other npm magic that installs only missing packages or re-installs packages whose version does not match package-lock.json: "to do an incremental update".

npm install --package-lock-only does not install packages, unlike what you mentioned; it only updates package-lock.json. While I am sure you know this, modifying package-lock.json is out of scope for this feature request and not really helpful to bring up. And "I would like to see a flag like npm ci --keep to do an incremental update" is not related to "behavior where npm ci does not remove the node_modules folder itself anymore but only its contents".

Is it possible to reopen this to fulfill the original request? npm/feedback#110

I'm surprised nobody appears to have mentioned caching ~/.npm in this thread; it really seems to be the best solution to this problem.

It's shown as an example in the npm docs, but it's not called out as a best practice; I think it should be. If the desired outcome is speed and reliability, rebuilding node_modules from a local NPM cache would surely be the fastest solution, wouldn't it?

  • Only new packages/versions need to be downloaded.
  • The existing node_modules directories do not need to be analyzed.
  • No files need to be copied; NPM just hard links each package in the cache.

So if a CI environment caches ~/.npm and uses npm ci, it seems to provide everything everyone is asking for. I only learned this today; I wish I had learned it before.

I'm now using this solution in a monorepo, and the results (via GitHub Actions) are great:

lerna info ci enabled
lerna info Bootstrapping 3 packages
lerna info Installing external dependencies
lerna info Symlinking packages and binaries
lerna success Bootstrapped 3 packages
added 1040 packages in 10.332s
  • No files need to be copied; NPM just hard links each package in the cache.

This would be new to me, because so far npm still copies files into node_modules - it's far from what pnpm does.

In your case lerna does the symlinking stuff, not npm ;-)

See:
https://www.npmjs.com/package/@lerna/symlink-binary
https://github.com/lerna/lerna/blob/main/utils/symlink-binary/symlink-binary.js

Lerna itself is a monorepo tool which can use npm or yarn as npm client.

@aaronadamsCA In my use case, sadly, the angular-cli produces artifacts that are stored in node_modules at project level. So the naive solution would be to cache that folder, which leads to the problem described in the thread. So yeah, ~/.npm is not a solution for all cases.

I'm surprised nobody appears to have mentioned caching ~/.npm in this thread

This would be new to me, because so far npm still copies files into node_modules

If it does actually symlink, then I would still consider that to be a suboptimal solution. On a local computer it would have the side-effect of tying together the installs of multiple projects, and it means that I'm no longer thinking about my project code in a self-contained way. I don't like the idea that I might temporarily mess with a dependency while debugging one project on a work computer and have that code show up in another project because they're all symlinking from the same files. I want my separate npm projects to actually be separate.

But more to the point, regardless of whether files are symlinked or copied -- if we're comfortable reusing data from the home directory during an npm ci command, then why are we not comfortable reusing data from the node_modules folder? What's special about a project repo's node_modules folder that means it has to be cleared out on ci command if the ~/.npm folder doesn't have that same restriction?

Until reading these comments I assumed that npm ci ignored any system-wide caches, or at least it only used them as a verifiable cache for specific downloaded files and not for anything else. If that's not the case, then I don't understand why earlier discussions were so vehement that ci had to be a clean install. If @aaronadamsCA is correct, then it's already not a clean install, so why not go a step further and have a flag that preserves node_modules?

This would be new to me, because so far npm still copies files into node_modules - it's far from what pnpm does.

In your case lerna does the symlinking stuff, not npm ;-)

My mistake, I blame late night brain! I was looking at the ls column that counts hard links - for files. For directories, obviously it just counts the number of directory entries. 🙈

If @aaronadamsCA is correct

@nmm-shumway, I'm not! (Although npm ci is plenty fast for me now that I cache ~/.npm in CI.)

It's surprising to see that many of us are waiting for the same need, that has been very clearly expressed in this thread, and still no solution after all that long.
For now I chose to use npm ci to honor the versions of npm packages defined in package-lock.json, but as npm ci deletes node_modules every time, this step is by far the most IO-intensive in my CI pipeline.
I'm considering updating the pipeline to compute a checksum of package-lock.json and run npm ci only when the checksum has changed since the last build, but this looks like a hack on top of something that should be handled natively by npm.

Can we please re-open this issue @darcyclarke ? Or would you prefer I create a different issue for the functionality described above by nmm-shumway: #564 (comment)

It's definitely needed,
I have a dev dependency that uses chromium, and every time I run npm ci it re-downloads it again (~118 MB)

What worked for me:
1- Run npm ci initially
2- Link a module (normal dependency)
3- Run my app to check a behaviour
4- Run npm ci to stop using the linked module -- removes node_modules and downloads all modules from npm registry
4- Run npm i --production which will NOT download dev-dependencies

I imagine the new flag being able to:

  • Detect the needed modules
  • Check the already installed modules in node_modules
  • Only install/remove a package if its version in node_modules and the main package.json are different, OR if the package is completely missing from node_modules
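
The three steps above could be sketched roughly as follows. This is not how npm works internally and no such flag exists; the helper names are hypothetical. The sketch compares against package-lock.json rather than package.json (package.json only holds ranges, not exact versions) and assumes a lockfile with a "packages" map (lockfileVersion 2 or 3).

```javascript
const fs = require('fs');
const path = require('path');

// Read the installed version from <location>/package.json,
// returning null when the package is not installed.
function readInstalledVersion(projectDir, location) {
  const manifest = path.join(projectDir, location, 'package.json');
  if (!fs.existsSync(manifest)) return null;
  return JSON.parse(fs.readFileSync(manifest, 'utf8')).version;
}

// Hypothetical helper: list lockfile entries whose installed version is
// missing or differs from the locked one. Keys of lock.packages are paths
// such as "node_modules/lodash"; the "" key is the project itself.
function findOutOfSync(lock, getInstalled) {
  const result = [];
  for (const [location, entry] of Object.entries(lock.packages || {})) {
    if (location === '' || entry.link) continue; // skip root and symlinked entries
    const installed = getInstalled(location);
    if (installed !== entry.version) {
      result.push({ location, wanted: entry.version, installed });
    }
  }
  return result;
}
```

A caller would pass the parsed package-lock.json and a lookup such as (location) => readInstalledVersion('.', location); only the returned entries would need to be installed or replaced. Removing packages that are installed but absent from the lockfile is left out of this sketch.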

Here's the code i integrated in my jenkins pipeline to only run npm ci when package-lock.json has changed since last build

def packageLockFile = 'package-lock.json'

// package-lock.json is mandatory
if (! script.fileExists(packageLockFile)){
    throw new JenkinsException(status: 'FAILURE', stage: 'npm install', "Could not find ${packageLockFile}")
}

def packageLockChecksumFile = '.package-lock.sha'
def canSkipNpmCi = false

// if the checksum file and node_modules exist, check that the checksum of package-lock.json has not changed since the last build
// if so, npm ci can be skipped
if (script.fileExists(packageLockChecksumFile) && script.fileExists('node_modules/')) {
    def res = script.sh(script: "sha512sum --status --check ${packageLockChecksumFile}", returnStatus: true)
    if (res == 0) {
        canSkipNpmCi = true
    }
}

if (! canSkipNpmCi) {
    script.nodejs(pipelineConfig.nodejsVersion) {
        script.sh("npm ${pipelineConfig.npmParameters} ci")
    }

    // update the checksum file
    script.sh("sha512sum ${packageLockFile} > ${packageLockChecksumFile}")
}
else {
    script.echo('The checksum of package-lock.json has not changed since last build, so skipping npm ci')
}

I'd add that npm ci is significantly slower even when run repeatedly with a local NPM cache, so it can't simply be used in its current form to prevent package-lock changes (and installing potentially different dependencies each time). For example, npm install takes about 2s and npm ci 16s when run for the same project of ours. After a cache clear they're the same. We desperately need something that would install from package-lock but doesn't start by deleting node_modules.

I am in a dev environment and need to install, but I want to install exactly what is in the package-lock (because this is a stable product) and I have to do this multiple times a day as I switch branches/pull in changes from our monorepo.

In my case I am not worried about junk in node_modules, I just want to get all the npm packages that will make things work and I want them on the exact version as in the lockfile. If I have to do ci, it is throwing away existing stuff and re-downloading it. As @Piedone points out, that's a significant time hit.

So what is the correct workflow here? If it is not ci then what is it? npm install no longer has an option for installing from package lock, so what are you supposed to do?

It doesn't re-download anything - it just re-extracts it, which takes very little time. npm ci is indeed what you should use.

npm install does install from package-lock, it just doesn't remove node_modules first, and takes "what's installed" into account.

My quick test explained under #564 (comment) shows that this re-extraction takes a lot of time, making npm ci 8x slower than npm install (on a high-end PC with a 500 Mb/s connection).

npm install does install from package-lock

npm install does not install only from the package-lock.json. It takes it into account, but also bases itself off of package.json dependencies. You can see this explained in more detail at npm/npm#18103.

You can also try it out yourself to verify. Run npm install with an outdated package in your package.json. If you have a "fuzzy" version range specified (~ or ^), your package-lock.json will be overwritten and a newer version of the package will be installed.

That's what this bug is about: adding some kind of flag somewhere to allow respecting package-lock.json without deleting the entire contents of the node_modules folder. There seems to be some confusion in general over whether or not npm install already respects the package-lock.json as the final decider of what packages/versions to install; I'm seeing multiple comments across multiple issues making this mistake. But it doesn't, or at least it hasn't since 2017.

I'll also add onto @Piedone's comment that I agree that npm ci is, in practice, much slower than npm install. I don't know what the reason for that is behind the scenes: if it's that extracting from the cache takes more time, or if it's (I suspect) because many modules have extra compilation steps with node_gyp that need to be run every time they're updated, or if it's some kind of default configuration issue that needs to be solved or better documented... I can't speak to that part.

But whatever the reason, internally at my org we dropped using npm ci for routine non-production builds specifically because it was so slow that it was not feasible for us to use it. Instead, we hard-pin every single dependency in our package.json file because there is no other way to get the behavior we want. This obviously makes upgrading packages much more cumbersome, which is why I'm commenting here on this issue.

I vaguely suspect (but am not sure) that when people say that npm ci is faster than npm install they're talking about comparing the install times for an empty node_modules. But that's not really relevant, because the whole point of using npm install instead of npm ci is that you don't need to reinstall everything. Whatever the cause, in practice, in the real world, it's inaccurate to say that npm ci will be faster than npm install for most people if they're running the command on a repo that's already set up and that already has some or most of its dependencies installed at the proper version.

I'm not 100% certain what we're debating about, it seems to me that it's trivial to verify that reinstalling every one of your packages (many with pre-install or post-install scripts) will be slower than installing 2 or 3 of those packages and leaving the rest untouched. If anyone has a flag or option that they think will change that, I'm happy to test on my system to see what the result is.

Hmm, is that so though @ljharb ?

➜  myproject_client git:(develop) npm ci
...

added 1100 packages, and audited 1101 packages in 8s

67 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities


➜  myproject_client git:(develop) git status
On branch develop
Your branch is up to date with 'origin/develop'.

nothing to commit, working tree clean


➜  myproject_client git:(develop) npm install

up to date, audited 1101 packages in 2s

67 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities


➜  myproject_client git:(develop) ✗ git status
On branch develop
Your branch is up to date with 'origin/develop'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   package-lock.json

no changes added to commit (use "git add" and/or "git commit -a")

Looks like this is not the case.

This is npm version 7.9.0

Added a new feature proposal in the feedback repo, and wrote up a quick implementation of a --from-lockfile option for npm install.

It's very possible I'm missing something about npm's internals or that I messed something up somewhere, but anecdotally I'm seeing about a 4x speedup between npm install --from-lockfile and npm ci. Again, that's just on my system though, and it's assuming my implementation is correct, which might not be the case. I'd appreciate someone else taking a look to see if anything jumps out as obviously wrong. node lib/npm install --from-lockfile on a local clone is what I'm using to test.

This is loosely based on @jdussouillez's comment and @DanielRuf's comment, which both encouraged adding new flags to npm install instead of npm ci. I didn't go all the way and add additional flags beyond --from-lockfile or try to alias npm ci because I'm just looking for the smallest possible change that will solve the problem.

I'm pretty new to the node / js world, so I'll just say that I'm surprised this is still a multi-year discussion.

What I expected when I started using projects in node was this:

  • npm i - This should install exactly what's in the package-lock.json if present. If not, it should build from package.json, and create a package-lock.json.
  • If I wanted to update the locks, I would delete the package-lock.json, or use an optional parameter to force it to update the locks.
  • If I wanted a fresh install, I would delete the node_modules directory myself. On a CI system, I would expect the default scenario to not have a node_modules present, so therefore it would build everything in the lockfile fresh without any risk of a "dirty" install.

This would give me one command that behaves exactly like we want in multiple situations, and if you do NOT want a consistent environment, you could eliminate the package-lock.json or not commit it to version control. Maybe even have a parameter in your package.json:

  • "lock-behavior": "nocreate" - This would never create a lock and build only from package.json
  • "lock-behavior": "update" - This would give the current npm i behavior and update the lock file
  • "lock-behavior": "fixed" - This would be what we're asking for, only build what is in the lock file
  • "lock-behavior": "delete" - Similar to nocreate but would delete lock files if present as well

Further, we could then have a command line parameter that overrides the lock-behavior config in package.json, e.g. --lock-behavior update

There are multiple problems with the current state, since some combinations of things are just unavailable, but also developers need to think about what their intentions are when they run the command. Having it configurable in the package.json would make it so every developer in the project gets the same behavior when they just type npm i. Different teams and companies and projects could all make different decisions, and wouldn't have to hope that people scour the README.md to make sure they know which commands to use to properly build the project.
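
To make the proposal concrete, here is a rough sketch of how such a setting might be resolved. Neither the "lock-behavior" field nor a --lock-behavior flag exists in npm; both names come from the proposal above, and the function and the choice of "update" as the fallback are assumptions for illustration.

```javascript
// The four values suggested in the proposal above.
const BEHAVIORS = ['nocreate', 'update', 'fixed', 'delete'];

// Hypothetical resolver: the command line wins over package.json, and
// "update" (today's npm i behavior) is used when neither is set.
function resolveLockBehavior(pkg, argv) {
  const i = argv.indexOf('--lock-behavior');
  const fromCli = i !== -1 ? argv[i + 1] : undefined;
  const value = fromCli || pkg['lock-behavior'] || 'update';
  if (!BEHAVIORS.includes(value)) {
    throw new Error(`Unknown lock-behavior: ${value}`);
  }
  return value;
}
```

With "lock-behavior": "fixed" committed in package.json, every developer typing npm i would get the lockfile-only install, while a one-off --lock-behavior update on the command line would still allow refreshing the lock.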

Here's the code i integrated in my jenkins pipeline to only run npm ci when package-lock.json has changed since last build

For those who are looking for a NodeJS port:

var fs = require('fs');
var crypto = require('crypto');
var childProcess = require('child_process');

var checksumFile = './.package-lock.sha512';
var packageLockFile = './package-lock.json';

// Read the checksum stored by the previous run, if any
var contents = fs.existsSync(checksumFile)
    ? fs.readFileSync(checksumFile, 'utf8').trim()
    : '';

// Compute the sha512 of the current package-lock.json
var hash = crypto.createHash('sha512');
hash.update(fs.readFileSync(packageLockFile));
var sha512sum = hash.digest('hex');

if (sha512sum === contents && fs.existsSync('./node_modules')) {
    // Checksums are equal and node_modules exists: skip npm ci
    process.exit(0);
}

// Otherwise run npm ci and show its output. execSync throws if npm ci fails,
// so the checksum is only stored after a successful install.
childProcess.execSync('npm ci', { stdio: 'inherit' });
fs.writeFileSync(checksumFile, sha512sum);

Put this under scripts/conditional-install.js, add "conditional-install": "node scripts/conditional-install.js" to the scripts section of package.json, and then run npm run conditional-install.

Make sure .package-lock.sha512 is in .gitignore

As for the problem with npm install modifying package-lock.json, which was raised several times in this thread, it seems that the --no-save option is already meant to prevent this. It has been suggested here: https://github.blog/2021-02-02-npm-7-is-now-generally-available/#changes-to-the-lockfile.

Anyway, as described in the documentation, and also mentioned in the comments here and here, package-lock.json should not get updated just because a new version of some dependency gets released, unless we specify that newer version in package.json.

From my experience, whenever the two files are consistent, package-lock.json works as expected and npm install can be used for deterministic incremental installs instead of the slower npm ci.

It's not really about npm install modifying the package-lock.json, it's about npm install ignoring the package-lock.json and deferring to package.json as the final authority on what needs to be installed.

From my experience, whenever the two files are consistent, package-lock.json works as expected

What we're talking about here is what happens when those two files aren't consistent. The current behavior of NPM still doesn't match the desired behavior that people have expressed in this thread as far as I can tell. I have a patch I've been working on; it needs some unit tests, and I'm waiting for feedback on it, but it's closer to the desired behavior.

I feel like there's some disconnect between what people think is being asked for and what is actually being asked for. npm install --from-lockfile should act like npm ci (treating the package-lock.json as gospel truth regardless of what package.json says) except for deleting the node_modules folder and reinstalling everything from scratch. Currently, there is no way to get NPM to do that. I'll try to sit down at some point over the next week or so and make a demo to showcase exactly what's happening.

What we're talking about here is what happens when those two files aren't consistent.

I may be missing something, but I can't think of a situation when they could end up being inconsistent in a repository, unless you manually modify and commit one of them without running npm install, which sounds like asking for trouble.
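
Whether the two files have drifted apart can be checked before installing. A minimal sketch, assuming lockfileVersion 2 or 3, where the lockfile's root entry (packages[""]) records a copy of the dependency ranges from package.json; the helper name is hypothetical and this is not a description of npm's own check.

```javascript
// Hypothetical helper: report dependencies whose range in package.json
// differs from the copy recorded in the lockfile's root entry.
function findManifestDrift(pkg, lock) {
  const root = (lock.packages && lock.packages['']) || {};
  const fields = ['dependencies', 'devDependencies', 'optionalDependencies'];
  const drift = [];
  for (const field of fields) {
    const declared = pkg[field] || {};
    const locked = root[field] || {};
    const names = new Set([...Object.keys(declared), ...Object.keys(locked)]);
    for (const name of names) {
      if (declared[name] !== locked[name]) {
        drift.push({
          field,
          name,
          declared: declared[name] || null, // null: not in package.json
          locked: locked[name] || null, // null: not in the lockfile root
        });
      }
    }
  }
  return drift;
}
```

An empty result means the ranges in both files agree; any entry points at a dependency that was added, removed, or changed in only one of the two files.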

The current behavior of NPM still doesn't match the desired behavior that people have expressed in this thread as far as I can tell.

Some of the desired behaviour expressed in this thread:

npm i would be great, but only if it wouldn't change the lock file. I've seen the package-lock.json being updated during an npm i, or should that not happen?

I support this feature. As stated, npm i modifies package-lock.json. A flag would be the ideal solution.

[...] it should not update the package-lock.json and should not remove node_modules

There is NO way to do an install that doesn't change package-lock.json and doesn't delete node_modules.

There should be a way to run npm ci -flag or other npm magic that installs only missing packages or re-installs packages whose version does not match package-lock.json. "to do an incremental update"

For me, this is solved by using npm install --no-save.

But still, of course, a flag to completely ignore package.json could make the installation even faster.

Yes and no. My reading of this is that people are using words like "change" and "update" as synonymous with installing the new dependencies, they're not treating this as two separate problems. The disconnect I have is that, sure, --no-save doesn't modify the package-lock.json, but it's still effectively treating it like it was modified. The end result is the same as running npm install && git checkout package-lock.json.

Maybe I'm misreading the situation, but I don't take these comments as literally meaning that it's specifically the persistence to disk of the changes that's the problem with updating package-lock.json. npm install --no-save does still update the tree, it just doesn't persist those changes by writing them to the file on disk after it finishes installing the new modules.

The --no-save flag is actually only for compatibility between different versions of npm. For me, npm install itself respects the dependency versions from package-lock.json in production environment. See the links in my previous comment. If it doesn't work like that for you, maybe you have an older version of npm or there is some other bug.

Oh, this is somewhat fixed on more recent versions! Okay, this actually explains a lot, I couldn't figure out why people were saying that they couldn't reproduce. It's just not fixed in the current Node LTS release, and I wonder if that's the disconnect.

Here are the differing results I'm seeing:


Package.json:

{
  "name": "npm-test",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {},
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "lodash": "latest"
  }
}

Lodash 4.17.20 installed (one version behind).


With the current LTS release of Node:

$ nvm install lts/*
Now using Node v14.17.6 (npm v6.14.15)

npm install will upgrade you to Lodash 4.17.21; it prefers obeying the "latest" tag in package.json over the contents of package-lock.json. I don't know if this is expected behavior or a bug, but from a user perspective the current LTS does not respect package-lock.json.

It's hard to reproduce this quickly without manually modifying a package.json to force the issue, but hopefully with the above it's clear how this issue could arise naturally without ever manually modifying your package.json: you just need to install lodash with "latest" before 4.17.21 is released. That's the kind of behavior that I at least (and I assume other people on this thread as well) have been wrestling with for a while.


However, if I'm not using the LTS, if I'm on NPM 7:

$ nvm install node
Now using Node v16.9.1 (npm v7.21.1)

Now if I run npm install with the above, I still get Lodash version 4.17.20 (correct behavior!). Note that this happens regardless of whether I'm upgrading from a lockfile version 1 or whether I already have a lockfile version 2. That suggests to me that this isn't just a change in behavior between formats, it suggests to me that NPM v7 treats even version 1 lockfiles differently.


So here's the situation as far as I can see it. Either:

  • This is an expected breaking change between lockfile version 1 and version 2, in which case I'm not sure why dependency versions are preserved during npm install when upgrading from lockfile v1 to v2.

or

  • This is a bug with the LTS and even lockfile v1 shouldn't be exhibiting this behavior.

The problem is that the LTS is what the vast majority of people are going to be using (particularly in production environments), so if this is a bug, it seems like it would be a good idea to backport the fix of respecting package-lock.json to NPM v6. If it's not a bug and that's expected behavior, it seems like it would be a good idea to better document this somewhere. Maybe this won't be an issue because the next LTS for Node begins in October? But the maintenance window for Node v14 is still going to last until 2023, so... I don't have the stats on who is using what. But it's not just outdated versions of NPM/Node that have this problem, it is the default-selected download link on the main Node site that will exhibit this behavior.


My take on this is that I still kind of want a --from-lockfile that ignores package.json. The NPM 7 behavior fixes the vast majority of problems, but it doesn't fix all of them, and it seems like it could still be more predictable.

If I swap out my dependency in package.json to be "lodash": "3" and don't edit my package-lock.json, NPM 7 will downgrade that package when I run install. Yes, that's a kind of contrived example, I don't think that's something people will run into very often. But it's not respecting the lockfile, it's still treating the package-lock.json as if it's a suggestion to be taken into account only after making sure that it's compatible. In contrast, npm ci handles that situation fine (except for the part where it deletes the node_modules folder and recreates it).

I think that issue is more minor though. If the LTS had its bug fixed (or if there was some clarification that the LTS behavior is intended, and it's intended to be a breaking change between NPM 6 and 7 how it treats dependencies) then my guess is that would probably resolve the majority of people's issues. Or at least letting people know that the LTS exhibits different behavior and that they can get more consistent installs by switching off of it.

I would love some kind of exhaustive list of when npm install is going to mess with the lockfile, because even NPM 6 sometimes respected it. And NPM 7 is respecting it more, and still occasionally respecting it even when the package.json changes, but still isn't always respecting it. And it's not clear to me exactly what triggers NPM to decide it's going to fetch a bunch of extra dependencies.

If I manually edit my V2 package-lock.json in NPM 7 so that the cached reference to the pinned dependency is different, and then I run npm ci, it still downloads and installs the correct version based on the lockfile's specified URL/version/integrity check. That's the kind of consistency I would love to be getting out of npm install.

@nmm-shumway you can manually upgrade npm and this is also recommended in most cases: npm i -g npm@latest

Yeah, honestly that's probably what I'll recommend doing at work. We would always have needed to update our internal npm version for any patch anyway, so there's really no reason for us not to be using v7 and migrating off of lockfile version 1 already.

This advice does raise a question though: if it's recommended to use v7 with the current LTS version of Node, is there a reason why the website and nvm still ship the LTS with v6? I guess it's a different conversation to have with different people on a different issue tracker, but would it be possible to get those updated so they ship with the correct dependency out of the box?

It still feels odd to me that the default version of npm that people are downloading won't have correct behavior. Or at least I assume it's a bug and not intended behavior. Just purely for clarification -- in the above scenario I bring up, am I correct in assuming that the v7 result is the intended behavior, not the v6 result?

@nmm-shumway v6 to v7 is a breaking change, so a node major that ships v6 will always do so. LTS status has nothing to do with it.

Should this bugfix be backported to v6 then? I still don't understand whether v6 ignoring the package-lock.json when new versions of libraries get released is expected behavior or not.

@nmm-shumway I'd hope so, but my guess is that it's unlikely since v7 is completely rewritten. However, users are expected to upgrade npm to the latest that works on their node version anyway, so if you're on node 10+ you should be on npm 7+.

Meh, this is fine. My org can upgrade the npm version we're using, so this doesn't really matter to me beyond curiosity.

It seems a little bit weird to me that users are expected to upgrade across versions in an LTS if those versions have breaking changes, and if the breaking changes aren't a problem, it seems a little weird to me that they're being treated as a blocking issue preventing shipping inside the LTS distributions. But, again, I don't actually need a resolution on that, from what I can see in my own testing, v7 fixes the majority of the issues I'm having, so at least for my org the answer is definitely just to upgrade.

I appreciate the responses, swapping to v7 should make our dev/testing builds a bit more reliable, so I'm pretty happy about that.

This issue is a mess. Can somebody please link to the single RFC for the requested feature: "npm install based on package.json, with awareness of existing node_modules dependency tree, without updating package-lock.json"