goldbergyoni/nodebestpractices

Docker best practices - call for ideas

goldbergyoni opened this issue Β· 22 comments

Given the immense popularity of Docker and the need to harden it differently per platform (see ideas below) - we'd like to start writing a Docker best practices section.

You're welcome to contribute ideas and write best practices - writing and brainstorming with people is an amazing way to deepen your Docker understanding.

First, we want to collect ideas for best practices, solidify a list of 10-15 bullets, and then assign the ideas to writers.

@BretFisher @lirantal @giltayar @js-kyle @BrunoScheufler @kevynb

Idea: Start the production process with the 'node' command, avoid 'npm start'

What is this about: 'npm start' won't pass the kill signal (SIGTERM) on to the Node process. Termination is very frequent in environments with dynamic scheduling like k8s, and a swallowed signal prevents a graceful shutdown.

Why is this important: Highly (only) related to Node.js
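A minimal sketch of the difference (server.js is a placeholder entry point):

```Dockerfile
# npm spawns node as a child process and does not forward signals to it,
# so SIGTERM from the orchestrator never reaches your code.
# Avoid:
# CMD ["npm", "start"]

# Prefer invoking node directly so the process receives SIGTERM/SIGINT itself:
CMD ["node", "server.js"]
```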

Idea: Set Docker memory limits that are on par with the v8 memory limit

What is this about: In recent versions, v8 memory allocation is configurable, and Docker & k8s also allow setting quotas; these should be kept in sync (I have to research this better, for now it's just an idea)

Why is this important: Highly related to Node.js/v8
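If the idea pans out, the alignment could look roughly like this (a sketch only - the image name and numbers are purely illustrative):

```bash
# Cap the container at 512 MB and tell v8 to keep its old space well below that,
# leaving headroom for buffers, stack and the rest of the process
docker run --memory=512m my-node-app \
  node --max-old-space-size=400 server.js
```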

Many things I can think of :-)

  1. Don't use "latest", use a digest
  2. Use multistage builds
  3. Scan your image for vulnerabilities
  4. Prefer smaller images

I have many more of these here: https://snyk.io/blog/10-docker-image-security-best-practices/ and don't let the title fool you, even though I have discussed them as the security angle, they are nonetheless all important practices to take on.
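For example, a sketch that combines items 1, 2 and 4 above (the digests, stage name and output paths are placeholders, not a definitive recipe):

```Dockerfile
# Build stage: full toolchain, devDependencies allowed
FROM node:12@sha256:<digest-you-verified> AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --production

# Runtime stage: smaller base image, only production artifacts
FROM node:12-slim@sha256:<digest-you-verified>
WORKDIR /app
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]
```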

Thank you, the almighty @lirantal

Adding one more:

Idea: Graceful shutdown

What is this about: When the dockerized runtime kills a container, exiting gracefully makes the difference between disappointing ~1000 users and zero errors

Why is this important: The implementation (code examples) is highly related to the webserver
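A minimal sketch for a plain Node http server (the handler, timeout and exit codes are illustrative; the details differ per webserver/framework):

```js
const http = require('http');

const handler = (req, res) => res.end('ok'); // placeholder request handler
const server = http.createServer(handler);
server.listen(3000);

process.on('SIGTERM', () => {
  // Stop accepting new connections and let in-flight requests finish
  server.close(() => process.exit(0));
  // Safety net: don't hang past the orchestrator's grace period
  setTimeout(() => process.exit(1), 10000).unref();
});
```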

I've got one more. Actually, I'm not sure whether to add it to the Docker best practices for Node or to the general Node best practices (I did not find it there with a quick read)

Idea: Install packages for production

What is this about: When running npm install in the docker image, we should make sure to only install needed packages by running with the --production flag.

Why is this important: Do not ship devDependencies in your docker image. They could contain vulnerabilities and they add weight to the image.
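Inside the Dockerfile this is just one flag (a sketch, shown with npm ci; plain npm install --production works too):

```Dockerfile
COPY package*.json ./
# Installs only "dependencies", skipping devDependencies entirely
RUN npm ci --production
COPY . .
```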

kikar commented

Idea: Use the node user

What is this about: Before the CMD, make sure you have a line with USER node.

Why is this important: It restricts the permissions of the app by not running it as root, which is better for security.
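A sketch, using the 'node' user that the official Node images already ship with (server.js is a placeholder entry point):

```Dockerfile
FROM node:12-alpine
WORKDIR /app
# Make sure the app files are readable by the unprivileged user
COPY --chown=node:node . .
# Drop root before starting the process
USER node
CMD ["node", "server.js"]
```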

@kevynb Sounds important, seems to me like a great BP for the production list, what do you think?

@kikar Yes, absolutely. Any example of a specific security issue it might prevent?

@goldbergyoni True, I'll try to write something about it this week πŸ˜ƒ

@goldbergyoni The best summary that I saw in 2019 about Node.js and Dockerfile best practices is from @BretFisher in "Docker and Node.js Best Practices from Bret Fisher at DockerCon". The source from that talk is here: https://github.com/BretFisher/dockercon19.

In most cases, I would write what he said in that talk.
However, I would like to emphasize here some interesting items:
1. A multistage Dockerfile combined with specifying a target build stage (--target) is very powerful (see the sketch after this list).
2. How do you use a private npm registry from Docker without exposing credentials in the docker image history? Use NPM_TOKEN (see pages one and two) in the build step and docker squash after it.
3. Properly handle HTTP/S connections. Why we need this is described here. Possible helpers: stoppable, http-shutdown and http-close
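For item 1, a sketch of the idea: one Dockerfile with named stages, and the stage is picked at build time (stage and image names are illustrative):

```Dockerfile
FROM node:12-alpine AS base
WORKDIR /app
COPY package*.json ./

# CI target: devDependencies and tests included
FROM base AS test
RUN npm ci
COPY . .
RUN npm test

# Production target: lean, production dependencies only
FROM base AS prod
RUN npm ci --production
COPY . .
CMD ["node", "server.js"]
```

```bash
docker build --target test -t myapp:test .
docker build --target prod -t myapp:prod .
```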

@bi0morph Thanks for bringing this gold stuff, I watched this video and would call it 'best Node video of 2019' :)

1. A multistage Dockerfile combined with specifying a target build stage (--target) is very powerful.

What title would you propose here? Keep test and production images as close as possible? Do you see a need for a dev stage? I haven't seen many developers who enjoy developing with Docker (for their own code)

2. How do you use a private npm registry from Docker without exposing credentials in the docker image history? Use NPM_TOKEN (see pages one and two) in the build step and docker squash after it.

Yes, great idea. What is the title here? Don't leave secrets within images?

3. Properly handle HTTP/S connections. Why we need this is described here. Possible helpers: stoppable, http-shutdown and http-close

Why is this related to Docker? Because of the reinforced need to shut down gracefully?

@kikar

By looking at nodejs/docker-node#1 they bring the following articles:
http://blog.zeltser.com/post/104976675349/security-risks-and-benefits-of-docker-application
http://thenewstack.io/docker-addresses-more-security-issues-and-outlines-plugin-approach
http://www.slideshare.net/jpetazzo/docker-linux-containers-lxc-and-security
blog.xenproject.org/2014/06/23/the-docker-exploit-and-the-security-of-containers

I couldn't spot an example of a specific attack there, did I miss it?

@bi0morph a few things. I've been following the thread, thanks for the plug and everything's looking good, and I have some ideas:

  1. On multistage, there are some good examples in the last Dockerfile example from my DockerCon19 repo.
  2. Using a private npm repo can likely be upgraded from the old-school squash method to a modern one (see the sketch after this list), like:
    A. Use multistage, where the env var is injected into one stage but not into a later stage. Copy the code forward and leave the env var behind.
    B. Use the newer BuildKit build method, which has secret and volume-mount support, as well as private SSH key support.
  3. On HTTP/S connections... this is two things:
    A. Making sure you handle SIGINT properly, ideally in code; use tini as a backup plan.
    B. For anything with a long-running connection, like HTTP/S, it's critical to track connections in-app and properly shut those down after SIGINT is received so things like user uploads, large downloads, and sessions are handled gracefully. This isn't docker specific, but often docker will let teams accelerate their deployment updates, so when I work with teams on adopting containers, this becomes a bigger issue as they go from "deploy monthly" to "deploy daily or weekly".
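For 2B, a BuildKit sketch (file names and paths are illustrative, and it requires BuildKit to be enabled): the .npmrc with the token is mounted only for the one RUN step and never ends up in a layer or in the image history.

```Dockerfile
# syntax=docker/dockerfile:1.2
FROM node:12-alpine
WORKDIR /app
COPY package*.json ./
# The secret is only visible to this RUN instruction
RUN --mount=type=secret,id=npmrc,target=/app/.npmrc npm ci --production
COPY . .
CMD ["node", "server.js"]
```

```bash
DOCKER_BUILDKIT=1 docker build --secret id=npmrc,src=$HOME/.npmrc -t myapp .
```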

One more idea - use a Dockerfile linter like:
https://github.com/hadolint/hadolint
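For reference, one common way to run it without installing anything locally:

```bash
docker run --rm -i hadolint/hadolint < Dockerfile
```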

Maybe something applicable from here http://docs.projectatomic.io/container-best-practices/

@BretFisher My two cents:

  • Multistage docker only for install and build. I don't see why I need a lint & test stage. That's part of a CI pipeline.
  • Is tini still important? The docs say: NOTE: If you are using Docker 1.13 or greater, Tini is included in Docker itself.
  • Anyway, I like your DockerCon examples :)
  • For many teams, part of CI is moving into the container build, just like security scanning (microscanner, etc.). You can always add comments that they're optional stages, but don't assume no one will want to see how it's done.
  • All that means is that Docker ships tini for optional use, via --init during a docker run. The advice is that if you need tini for proper node shutdown, then it should always be started with tini in the Dockerfile; you would need tini if you don't handle shutdown in code (sketches below).
  • Thanks :)
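To make the two options above concrete (sketches only, assuming an alpine-based image and a server.js entry point):

```bash
# Option A: rely on Docker's bundled tini at run time
docker run --init my-node-app
```

```Dockerfile
# Option B: bake tini into the image so it is PID 1 regardless of how
# the container is started (e.g. Kubernetes does not pass --init)
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]
```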

@bobaaaaa I kind of think the same about the purpose of multistage docker. I don't really like using a build step to run tests.
However, I couldn't find a good and easy alternative to that problem. How do you handle testing and linting in those cases?

Do you have a separate docker build to run tests and linting? Or do you run them outside of docker (which for me defeats the point of using docker)? Or maybe something else?

When your CI is called Jenkins, then multistage docker with lint and test makes sense (maybe). I prefer a container-based CI solution like TravisCI, CircleCI or, currently, GitHub Actions. I've used all of them in different projects and I'm quite happy.

From my current project:

  • CI is GitHub Actions with ubuntu images
  • PR: lint, test, build, coverage
  • Master commit: lint, test, build (prod), build docker-image, publish to ECR (AWS)

Because of the ubuntu images I don't use multistage docker at all (currently). Luckily I don't need docker-compose.

Those are some things I've seen in my company:

  • To download packages from a private repository, don't hardcode your credentials in the Dockerfile, use build args and https://www.npmjs.com/package/npm-cli-login
  • Use dotenv and dotenv-cli to manage your environment variables when you develop, this way your code is "environment variables ready" for docker
  • It seems obvious, but you should always npm install --production to avoid extra packages in your image
  • and please, please, don't run webpack-dev-server in a docker image to serve a react, vue or angular application

To download packages from a private repository, don't hardcode your credentials in the Dockerfile, use build args and https://www.npmjs.com/package/npm-cli-login

This is actually not good advice, since build args are cached in the store where the image is built. Instead you should use multi-stage builds (see number 9 here: https://snyk.io/blog/10-docker-image-security-best-practices/)
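A sketch of that multi-stage approach (the registry, token handling and paths are illustrative): the .npmrc and the NPM_TOKEN build arg only exist in the throwaway stage, so they don't show up in the final image's layers or history.

```Dockerfile
FROM node:12-alpine AS deps
ARG NPM_TOKEN
WORKDIR /app
COPY package*.json ./
# Write the token, install, then remove it - all within the disposable stage
RUN echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" > .npmrc \
 && npm ci --production \
 && rm -f .npmrc

FROM node:12-alpine
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
CMD ["node", "server.js"]
```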

Use dotenv and dotenv-cli to manage your environment variables when you develop, this way your code is "environment variables ready" for docker

Just please don't make it a habit of putting secrets there πŸ™

and please, please, don't run webpack-dev-server in a docker image to serve a react, vue or angular application

πŸ‘Œ

stale commented

Hello there! πŸ‘‹
This issue has gone silent. Eerily silent. ⏳
We currently close issues after 100 days of inactivity. It has been 90 days since the last update here.
If needed, you can keep it open by replying here.
Thanks for being a part of the Node.js Best Practices community! πŸ’š