[aws-lambda-nodejs] a single lambda creation takes 30s every time
vvo opened this issue · 23 comments
tl;dr;
Node.js lambdas are slow to build because CDK uses Docker, and mounting a complete project inside Docker is super slow on macOS.
There's work being done on the Docker side, but it will probably stay very slow for now.
There's work being done on CDK to allow using a local parcel instead of one inside docker, see #9632.
For an alternative solution, I created https://github.com/vvo/aws-lambda-nodejs-webpack which uses webpack locally and is a lot faster.
Hi there, when using aws-lambda-nodejs, every cdk synth takes up to 50s, even for a very simple setup. I understand aws-lambda-nodejs uses docker + parcel, but even so, performance should not be an issue on successive runs because of the Docker / parcel cache. So I guess something is weird here.
As discussed over email with you @jogold, I am creating a new issue that you can reproduce on your side.
Reproduction Steps
I created a repository: https://github.com/vvo/aws-lambda-nodejs-performance
The stack code is:
const cdk = require("@aws-cdk/core");
const lambda = require("@aws-cdk/aws-lambda-nodejs");

class HelloCdkStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);
    new lambda.NodejsFunction(this, "my-handler");
  }
}

module.exports = { HelloCdkStack };
Reproduction:
git clone https://github.com/vvo/aws-lambda-nodejs-performance.git
cd aws-lambda-nodejs-performance
npm install
time cdk synth
# [... cdk output]
cdk synth 1.82s user 0.63s system 8% cpu 27.730 total
Here it's 27 seconds, but I have seen it go up to 50s.
Error Log
No error message
Environment
- CLI Version : 1.51.0
- Framework Version: 1.51.0
- Node.js Version: 12.18.2
- OS: macOS 10.15.5
- Language (Version): JavaScript
Other
I'd like to debug further and understand why, maybe, the cache is not used fully but not sure how to do that.
This is 🐛 Bug Report
@jogold Just updated the repository (https://github.com/vvo/aws-lambda-nodejs-performance), same results, it always takes at least 20s on successive launches of cdk synth.
Is there a way to see the output of the docker build to see what's going on?
I am not the author of this wonderful package (aws-lambda-nodejs) but it really seems like maybe the yarn/npm cache is not kept in between docker builds maybe?
Thanks!
Had a discussion with @vvo, the issue seems to be linked to Docker performance issues on his machine. Still investigating at this stage.
Quick tl;dr; on the status of this issue: it's a macos issue. macos filesystem performance through docker containers has been "bad" for some time already (docker/for-mac#1592), but a lot of progress was made.
The latest update is coming on the docker for mac edge version https://docs.docker.com/docker-for-mac/edge-release-notes/ where by default they use a new syncing mechanism called "mutagen" (https://mutagen.io/) which makes things faster.
Still, on a real-size Node.js project (~ 600MB of node_modules resulting in A LOT of files), mounting the filesystem to docker takes 20s with mutagen (100s without mutagen).
The only issue right now is that for mutagen to work well, for now, you have to "flush" data to the host (otherwise asset output is empty when we check it). See docker/for-mac#1592 (comment).
Let's wait for the Docker team to reply, but the macOS filesystem under docker is heading in the right direction.
Alternatively, a community-supported nodejs-lambda that would directly use Parcel instead of using Parcel inside docker would make it way way faster, as discussed with @jogold. But I understand this is not the direction the cdk team has headed towards, as @jogold explained to me and that's fine!
Thanks all :)
@eladb Docker performance is really bad on macOS and especially for the NodejsFunction where we need to mount the project root, which can be "heavy". On Windows and/or Linux (so also in CI) performance is acceptable.
what do you suggest here?
Didn't we configure the volume mapping to use some sort of special mac sauce?
The volume mount uses :delegated but with the Edge version of Docker4Mac this made it worse. It takes forever or crashes. And if it works by chance it fails afterwards because the sync is not yet done and CDK does not find the assets.
> The volume mount uses :delegated but with the Edge version of Docker4Mac this made it worse. It takes forever or crashes. And if it works by chance it fails afterwards because the sync is not yet done and CDK does not find the assets.
What I would like to understand is where all the time is "lost": when mounting the volumes? when Parcel runs? when content is written to /asset-output? What about using cached and/or read only for /asset-input? What is the influence of the size of /asset-input?
> when mounting the volumes?
> when Parcel runs? (+when content is written to /asset-output?)
As you know, I did a lot of testing on this, results:
On docker stable version: both actions are extremely slow. Mounting is slow (node_modules, lots of files) and parcel is slow (lots of I/O). On my current project, cdk synth can take up to two minutes, at every run. 30s was for an almost empty Node.js project (no dependencies).
On docker edge/beta version: only the mounting is slow (still 5x faster than stable), afterwards parcel is as fast as native/without docker. On docker edge, :delegated currently uses mutagen by default (may or may not be the end of the story once it reaches docker stable, it might be a different flag)
:delegated is always the fastest for any operation on docker macOS, as seen here:
- https://www.amazee.io/blog/post/docker-on-mac-performance-docker-machine-vs-docker-for-mac
- https://www.jeffgeerling.com/blog/2020/revisiting-docker-macs-performance-nfs-volumes
Docker edge makes the mounting "faster". But as far as my testing goes, 20s to mount a real world Node.js project will never be acceptable... By real world I mean my project, approx 600MB node_modules, which is not even big compared to more "enterprise" projects I have seen.
We can wait for docker to be faster (but it has been "slow" on macOS filesystems mounting for 3 years already) OR we can make it so it does not use docker on mac. I understand this is not an area you want to go but unfortunately, the road you took will, for now, lead to bad performance on mac for anyone trying to build and deploy Node.js lambdas on AWS using this tool.
This will impact A LOT of people. Data I found is from the 2018 Node.js developer survey: MacOS is leading the Node.js developer environment with 41% of users (Windows at 24%), https://nodejs.org/en/user-survey-report/#Primary-OS-Distro. This means that for most people, the experience is not good for now. And those are the people that will try cdk and abandon it if they have the choice.
For comparison, I tried https://github.com/pulumi/pulumi and it was almost instant for every command. I still want to use CDK because I do not want/need multi-cloud, and pulumi has some weird defaults for file organization and how it builds lambdas.
Other areas to test:
- yarn v2 performance. What might be impacting performance may not be the SIZE of the project but rather the number of files of the project. And unfortunately, node_modules are a LOOOOOT of files to mount. While new versions of yarn have fewer files in node_modules if I remember well ("pnp"?)
- as for running parcel outside of docker, npx is GREAT! https://github.com/npm/npx#readme. It just means that on first launch it would install it, without even installing it globally and then executes it. npx is bundled with Node.js since 8.2.0
> What about using cached
We tried, I tried, results are worse.
Other tests done:
- read only + delegated on BUNDLING_INPUT_DIR: no change versus just delegated
- read only + cached on BUNDLING_INPUT_DIR: no change versus just delegated
- cached on BUNDLING_INPUT_DIR: no change versus just delegated
- again, when using delegated or cached, the slow part comes from mounting, parcel is fast as seen in the docker output: parcel takes always 5s while the whole docker run takes 20s
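For anyone who wants to repeat the mount tests listed above, here is a minimal sketch of how the variants differ on the `docker run` command line. It only builds the argument list; the image name and the bundler command are hypothetical placeholders, not what the construct actually passes.

```javascript
// Sketch only: builds `docker run` arguments for one bind-mount variant.
// `image` and `command` are hypothetical placeholders for illustration.
function dockerRunArgs({
  projectRoot,
  outputDir,
  inputOptions = ["delegated"], // e.g. ["ro", "delegated"], ["ro", "cached"], ["cached"]
  image = "my-bundling-image",
  command = ["true"],
}) {
  // Volume syntax is host:container[:comma,separated,options]
  const inputMount = [projectRoot, "/asset-input", inputOptions.join(",")]
    .filter(Boolean) // drop the options part when no option is given
    .join(":");
  const outputMount = `${outputDir}:/asset-output:delegated`;
  return [
    "run", "--rm",
    "-v", inputMount,
    "-v", outputMount,
    "-w", "/asset-input",
    image,
    ...command,
  ];
}

// Print the variants tested in this thread so each one can be timed by hand.
for (const inputOptions of [["delegated"], ["ro", "delegated"], ["ro", "cached"], ["cached"]]) {
  const args = dockerRunArgs({ projectRoot: process.cwd(), outputDir: "/tmp/asset-output", inputOptions });
  console.log(["docker", ...args].join(" "));
}
```

Running each printed command under `time` with a no-op command separates the cost of mounting from the cost of the bundler itself.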
Naive question: does the node_modules have to be mounted into the docker container? Could we do the package install inside of the container instead?
> Naive question: does the node_modules have to be mounted into the docker container? Could we do the package install inside of the container instead?
The node modules are bundled so Parcel needs to be able to find them in the container. But we also mount at the project root to be sure to include everything that is referenced in the Lambda code (it could be a util in another package in a monorepo config for example).
For people having performance issues, could you report timings for synth in those two repos:
- A monorepo example with NodejsFunction: https://github.com/jogold/cdk-lambda-nodejs-monorepo
- An example with a module using native dependencies (this is the Using AWS Lambda with Amazon S3 tutorial from the AWS documentation done with CDK): https://github.com/jogold/cdk-s3-thumbnail
(Note: edited July 28th 2020 since I moved from rollup to webpack)
Hey there, I finally gave a try at creating an alternative aws-lambda-nodejs. It's here: https://github.com/vvo/aws-lambda-nodejs-webpack.
Now cdk synth|deploy takes 10s, including the build time of webpack, instead of 200s using the regular aws-lambda-nodejs (again, those 200s are 99% due to mounting the project in Docker, a docker performance issue).
Features:
- fast, no-docker CDK construct
- lambda output only contains the necessary files, no README, tests, ... (just like current construct)
- bundling happens in temporary directories, it never writes in your project directory
- source map support
- Babel with preset-env support
- TypeScript support
As said in the README:
I want to be clear: I respect a LOT the work of the CDK team (and especially @jogold, author of aws-lambda-nodejs), who helped me a lot in debugging performance issues and explaining to me how everything works.
I hope this will help people having performance issues on macOS or if they're just looking for an alternative lambda builder.
I have yet to test it in more detail in production, but as of now it just works for simple use cases. And we can always add more (I added a roadmap).
Let me know what you think!
> For people having performance issues, could you report timings for synth in those two repos:
> - A monorepo example with NodejsFunction: https://github.com/jogold/cdk-lambda-nodejs-monorepo
> - An example with a module using native dependencies (this is the Using AWS Lambda with Amazon S3 tutorial from the AWS documentation done with CDK): https://github.com/jogold/cdk-s3-thumbnail
So I tested it with the monorepo and the latest Docker4Mac Edge version. The first run took 4 minutes and then failed with no output as the sync was not done in time. Subsequent runs then took about 20 seconds and succeeded.
> The first run took 4 minutes
I assume this is because the Dockerfile changed in 1.54.0 and the image had to be rebuilt? Doesn't explain the failure at the end though.
> Subsequent runs then took about 20 seconds and succeeded.
This is still a lot... I'm around 7 seconds on average. If you have a minute (or more), could you try with the other repo where a npm install is run inside the container?
A temporary solution for macOS users could be to use the CDK_DOCKER env var to provide a custom script that parses the args and runs Parcel locally.
For the other repo the first run was about 3 minutes and subsequent runs about 40 - 50 seconds.
The first-run delay seems to be the sync that Docker does for the folder. It copies the whole folder to the hyperkit vm.
The asset-output error did not happen this time. It seems related to what @vvo described earlier.
First the requested info...
https://github.com/jogold/cdk-lambda-nodejs-monorepo
yarn cdk synth 3.85s user 0.82s system 9% cpu 48.052 total
https://github.com/jogold/cdk-s3-thumbnail
npm run cdk synth 4.77s user 0.75s system 12% cpu 44.762 total
This one is the project I'm working on right now:
npx cdk synth 4.30s user 0.48s system 7% cpu 1:00.98 total
And here's one of my sample projects: https://github.com/elthrasher/cdk-step-functions-example
npx cdk synth 5.09s user 0.94s system 3% cpu 2:39.58 total
A minute is a long time to wait - especially since I must synth to run unit tests. The numbers above are after running several times, so Docker layers should be cached as well as possible.
- I get the reasons for building with Docker, but building the functions one at a time, as opposed to having a single parcel/webpack/whatever build with multiple entrypoints, isn't going to scale well. My project has five functions, which is why it takes more than two minutes. I tried to figure out CDK_DOCKER, but it seems like I have to provide a docker-compatible API? It would be great if I could just bail out of the whole thing and agree to supply my own parcel (or webpack, which is more complex, but also more mature) build.
- I tried putting my stack into my build system, which is Docker-based, and it's nearly impossible to get it to work. I must share my docker.sock (which is fine), but doing that means the volume share will go back to the HOST, not the container I'm trying to build in, so paths get crazy since this module attempts to take the path from the container, not host, but the volume will be shared to the host! (see also: https://serverfault.com/a/819371) For this reason, I basically flat out can't use this module without switching to another build system.
- Even though it looks like I need a non-Docker option, if I were to build in Docker, I'd really try to figure out a way to COPY the source into the initial build, then just share the output (perhaps via docker cp). That would give me the advantage of Docker caching on the build, which is a great thing to have. Very often my team has made a change to the stack that has nothing to do with the functions, yet we must always wait for the functions to build from scratch every time.
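The COPY plus docker cp idea from the last point could look roughly like the sketch below. It assumes a Dockerfile in the build context that COPYs the source and writes the bundle to /asset-output during `docker build`; that Dockerfile, the tag, and the function name are hypothetical and not part of the current construct.

```javascript
// Sketch: build with the source COPY'd into the image (no bind mount),
// then copy the bundle out of a stopped container with `docker cp`.
const { execFileSync } = require("child_process");

// Runs one docker command and returns its trimmed stdout.
function defaultRun(args) {
  return execFileSync("docker", args, { encoding: "utf8" }).trim();
}

function bundleViaCopy({ contextDir, tag, outputDir }, run = defaultRun) {
  // Layer caching applies here: unchanged sources mean no rebuild.
  run(["build", "-t", tag, contextDir]);
  // `docker create` prints the id of a container that is never started.
  const containerId = run(["create", tag]);
  try {
    // The trailing "/." copies the directory contents, not the directory.
    run(["cp", `${containerId}:/asset-output/.`, outputDir]);
  } finally {
    run(["rm", containerId]);
  }
}
```

The trade-off is that the whole build context is sent to the Docker daemon on each build, so a `.dockerignore` would matter for large projects.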
I'm at the point where I either have to do a two-step build (build with webpack, then use lambda.Function) or come up with my own construct, but I thought I'd see where you guys were. I can contribute here or go my own way (or maybe there's some sorcery with CDK_DOCKER you can point me to).
Thanks for listening :)
@jogold feels like we may need to support bundling outside of docker in certain environments. Let's kick off a GitHub issue to discuss.
Perhaps the direction should be to check if the runtime environment has the required dependencies (e.g. parcel in the right version) and only if it doesn't, fall back to docker. This way we can get the docker portability without compromising in environments that support local bundling. Curious how much of this we can/should do in the bundling layer itself and how much is lambda specific.
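A minimal sketch of that check, assuming Parcel is resolved through `npx --no-install` (so nothing gets downloaded) and that an exact version match is required. The function names and the version used in the example are illustrative only, not the CDK's implementation.

```javascript
// Sketch: prefer local bundling when the expected Parcel version is
// installed, otherwise fall back to Docker.
const { spawnSync } = require("child_process");

// Returns the locally installed Parcel version, or null if unavailable.
function localParcelVersion() {
  const result = spawnSync("npx", ["--no-install", "parcel", "--version"], {
    encoding: "utf8",
  });
  return result.status === 0 ? result.stdout.trim() : null;
}

// `getVersion` is injectable so the decision can be tested without Parcel.
function chooseBundling(requiredVersion, getVersion = localParcelVersion) {
  return getVersion() === requiredVersion ? "local" : "docker";
}

// Example with an illustrative version number:
// chooseBundling("1.12.4") returns "local" or "docker" depending on the machine.
```

An exact match is the conservative choice; a semver range would allow more machines to bundle locally at the cost of less reproducible output.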
@elthrasher, side note, if you want to try a non-docker lambda builder, have a look at the one I created after having performance issues with the default one, see here: https://github.com/vvo/aws-lambda-nodejs-rollup
@vvo Thanks, I saw that, but I'm using TypeScript. Would be glad to give it a try and give you feedback after you introduce support for TypeScript.
@elthrasher Just moved from rollup to webpack (rollup could not handle well some npm modules bundling) and added TypeScript + Babel preset-env support. Give it a try and let me know if this works for you: https://github.com/vvo/aws-lambda-nodejs-webpack
Small status update here: we are going to offer a way out of Docker. I expect this to be available before the end of August.