throttle npm http requests
zenwork opened this issue · 20 comments
Hi,
Is there a way to throttle npm when calling `install` or `update` so that dependency resolution does not create so many concurrent HTTP calls?
npm is causing some trouble with one of our corporate firewalls. We have our own internal npm repo, and the issue is occurring between the npm client and our repo.
At times too many concurrent requests go through at once, causing the firewall to throw HTTP 503 errors back. This does not kill the npm install, but it results in builds taking 4 to 5 times longer than usual. Sometimes it also seems to cause fatal timeouts or the dreaded 'cb() never called'.
I have already tried to get the firewall changed to a more lenient setting, but that is not possible, as this behaviour is a wanted feature of this firewall.
cheers,
Florian
I will give this a try. But I don't think this will scale, as I have a lot of developers who would all need to use `crapify` to avoid encountering this firewall issue.
My thought was that you could install it sitewide and see if it helps (that's what it was written to do). `crapify` exists because you're not the first person to run into this issue. It's not common, per se, but there are a few companies out there with heavily-managed internal networks where Node's style of concurrently making requests in very short order trips up something. Unfortunately, we've spent a fair amount of time trying to tweak the various `request` and other settings we have control over, and none of the easy stuff has addressed this problem in a few cases.
To make npm throttle its connections in an orderly way would mean two things:

- We'd at least have to write our own `http.Agent` implementation to enforce the concurrency limits, because trying to do it at a higher level (i.e. by setting `maxSockets` in `request`) hasn't worked when we've tried it. It's possible we'd have to move off of `request` entirely to get an adequate level of control, which would be a huge commitment, because we rely very heavily on `request`'s features (for example, almost all of npm's proxy support comes from `request`). There's a rough sketch of the agent mechanism after this list.
- Installs will run much more slowly, because one of the secrets to npm's speed is that it just grabs everything it needs from the network at once. This argues against making this change everywhere at once.
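To make the agent idea concrete, here is a minimal sketch (not npm's actual code; the registry host and path are placeholders) of how an `http.Agent` can cap concurrency: once `maxSockets` sockets are in use, Node queues further requests on the agent.

```js
var http = require('http');

// One shared agent with a hard cap: at most 5 sockets to a given host.
// Requests beyond that wait in the agent's queue until a socket frees up.
var agent = new http.Agent();
agent.maxSockets = 5;

for (var i = 0; i < 20; i++) {
  http.get(
    { host: 'registry.example.com', path: '/-/ping', agent: agent },
    function (res) {
      res.resume(); // drain the response so the socket is released
    }
  );
}
```

As noted above, wiring the equivalent setting through `request` didn't reliably fix things when the team tried it, which is why a custom `Agent` implementation (or moving off `request` entirely) was on the table.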
I'm completely sympathetic to your need to come up with a solution to this problem, because all of the companies we've encountered running into this difficulty are large, powerful enterprises we'd very much like to make happy. The trick is finding a way to do it that doesn't require us to put what would probably be at least a month of effort into rewriting one of the core components of the product.
I am experimenting with `crapify`. I still don't know if it will be a viable solution for our problem, but at the moment it is quite good at helping me debug: by eliminating issues that can be caused by concurrency, I can focus on other issues that are also happening.
As a side note, it would be really nice to be able to enable timestamps in npm logging. That would make using the logs as part of internal support requests easier.
`crapify` solves one problem but leaves me with another. I need to use a proxy for external calls (e.g. a postinstall call to GitHub for certain libs). By defining a proxy in the npm config, all those calls now seem to go through the `crapify` proxy and never find GitHub.
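For reference, the setup described amounts to something like this, where the host and port are placeholders for wherever `crapify` is listening:

```sh
npm config set proxy http://localhost:8080/
npm config set https-proxy http://localhost:8080/
```

With both keys set, npm routes its outgoing requests through that proxy, which would explain why the GitHub-bound calls end up there too.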
@zenwork that's interesting that the GitHub requests aren't getting proxied properly. Could you share a package.json snippet with me that fails? Hopefully something will jump out at me that I can patch.
Here is the anonymized package.json I am using. Note that I am using Artifactory 3.4.1 as an npmjs cache.
```json
{
  "name": "project",
  "version": "0.2.11",
  "description": "My Project",
  "repository": "ssh://git.myco.com/project",
  "homepage": "http://confluence.myco.com/",
  "bugs": "http://myco.com/browse/PROJ",
  "author": "My Company",
  "license": "AAA",
  "main": "cli.js",
  "bin": "./cli.js",
  "scripts": {
    "test": "grunt jasmine_node",
    "jasmine": "jasmine-node --color ./src/lib"
  },
  "dependencies": {
    "JSONPath": "^0.10.0",
    "angular-gettext-tools": "^1.0.3",
    "archiver": "^0.11.0",
    "atomify": "^6.0.4",
    "fs-extra": "^0.10.0",
    "gift": "0.4.2",
    "glob": "^4.0.5",
    "http-proxy": "^1.4.3",
    "inquirer": "^0.5.1",
    "lodash": "^2.4.1",
    "mem-cache": "0.0.4",
    "nopt": "^3.0.1",
    "npm": "2.1.7",
    "npmlog": "0.1.1",
    "q": "^1.0.1",
    "request": "2.42.0",
    "semver": "^4.0.3",
    "unzip": "^0.1.11"
  },
  "devDependencies": {
    "my-release-tool": "~1.1.0",
    "grunt": "0.4.5",
    "grunt-jasmine-node-new": "~0.3.2",
    "jasmine-node": "~1.14.5",
    "load-grunt-config": "~0.14.0",
    "node-mocks-http": "^1.1.0"
  }
}
```
I got a piece of information from the firewall team which could help. It seems that too many sessions, rather than too many requests, are generating all the HTTP 503 errors from the firewall. Does that make any sense to you? I can't imagine that npm is trying to open an HTTP session with each call.
If you install a package that does not have git dependencies, does throttling the concurrent connections fix the issue?
The source has been found. Our firewall was creating an HTTP session (with a cookie) for each npm request, so we were hitting the max-sessions-per-IP limit and causing everything to go very badly. Things are back to normal now that they have removed this 'feature'.
Thanks for all the help. If nothing else it has been very instructive :-)
I'm going to close this as resolved (for now?!). Let us know if you run into further issues, @zenwork!
I've experienced it so many times... Please re-open this. It's especially painful on El Capitan, which can't handle too many concurrent connections and crashes the Wi-Fi...
I can confirm the issue. I'm also using El Capitan, and every time `npm install` runs it takes up the whole bandwidth, effectively blocking any other HTTP request.
This problem remains and really is an issue. A typical download manager only allows about three HTTP connections at a time; if you install a fresh set of npm modules from scratch, npm may open more than 200 HTTP connections at once, which I have seen block the network connectivity of a complete household several times. This happens reliably and reproducibly and is really annoying. Not only may this block npm itself, since it ends up scheduling some hundred stale connections and no longer progressing; it also blocks other users' internet connections if npm has used up the maximum number of connections available to that household. This is really a pain. Is there any progress on this issue?
Edit: The committers seem to be aware of the problem, but it is tracked in another ticket: #4553
Improving the performance and robustness of npm's networking code is a high-level roadmap item for the CLI team, which in all likelihood means replacing `request` with a purpose-built download manager. If you're interested, see this proposal for a description, in broad strokes, of what that replacement will look like. The team has a few other projects between now and then, but I anticipate that we'll start work on reworking the network code sometime early in the second half of 2016.
Same problem here, and I have gigabit Internet... Seriously, you should at least, in the meantime, add an option to throttle the number of simultaneous connections.
> You should at least, in the meantime, add an option to throttle the number of simultaneous connections.
The CLI has been doing this since `npm@3.8.0`. The default is a maximum of 50 simultaneous connections. See the link for details on how to configure this.
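For anyone landing here later: if the knob in question is the `maxsockets` config key, a lower cap can be set like this (the value 10 is just an example; per the comment above, the default is 50):

```sh
npm config set maxsockets 10
```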
Thanks for the info! This is very good to know. I was able to make it work by disabling the firewall and rebooting the router, but if the issue ever happens again I'll give it a try.
I have posted this info in a few places where people encountered the issue, and this option was never mentioned there.