multiarch/qemu-user-static

Where are the qemu-$to_arch-static binary files when a container whose arch differs from the host's is run?

agrexgh opened this issue · 8 comments

Is this a bug report, feature (enhancement) request or question? (leave only one on its own line)

/kind question

Description:

I understand that
the command docker run --rm --privileged multiarch/qemu-user-static --reset -p yes enables running containers whose arch differs from the host's,
and that binfmt_misc and qemu make this work.
But I could find qemu-$to_arch-static neither on the host PC nor in the container,
even though the program in the container is executed through /usr/bin/qemu-$to_arch-static.
It feels like a mystery...

Where are the qemu-$to_arch-static binary files when a container whose arch differs from the host's is run?

Steps to reproduce the issue:

  1. Run docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
  2. docker run an image whose arch differs from the host's.
  3. Run ps auwx in the container started at step 2.
  4. Run ls /usr/bin

Describe the results you received:

Step 3: the COMMAND column shows "/usr/bin/qemu-aarch64-static *"
Step 4: ls doesn't show qemu-*-static <- it's a mystery..

Environment:

  • Container application: Docker

Output of docker version, podman version or singularity version

$ docker version
Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:02:57 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:01:03 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:          1.1.2
  GitCommit:        v1.1.2-0-ga916309
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Additional information optionally:

I found this question on Stack Overflow,
which says qemu-$to_arch-static is in multiarch/qemu-user-static.
But the multiarch/qemu-user-static container isn't running at that time.
Can we execute binary files from a container that isn't running?

I edited my first post to follow the issue template.

zxdvd commented

When you run docker run --rm --privileged multiarch/qemu-user-static --reset -p yes, it will register and LOAD those binaries.

The -p option enables persistence (see here in the qemu repo), which sets the F flag of binfmt_misc. You can find details of the F flag in the kernel doc.

so the F mode opens the binary as soon as the emulation is installed and uses the opened
image to spawn the emulator, meaning it is always available once installed, regardless of how
the environment changes.

And the kernel source code is here:

	if (e->flags & MISC_FMT_OPEN_FILE) {
		f = open_exec(e->interpreter);
		if (IS_ERR(f)) {
			pr_notice("register: failed to install interpreter file %s\n",
				 e->interpreter);
			kfree(e);
			return PTR_ERR(f);
		}
		e->interp_file = f;
	}
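
To check from user space whether an interpreter was registered with the F flag, you can read its entry under /proc/sys/fs/binfmt_misc. Below is a minimal Python sketch, not part of multiarch/qemu-user-static. It assumes the entry is named qemu-aarch64 and that the status file has the usual "enabled / interpreter ... / flags: ..." layout; the SAMPLE text is illustrative only (magic and mask lines left out) and is used when the real file cannot be read.

```python
from pathlib import Path

# Illustrative sample of a binfmt_misc status file (an assumption, not
# captured from a real host). The offset/magic/mask lines are omitted.
SAMPLE = """\
enabled
interpreter /usr/bin/qemu-aarch64-static
flags: F
"""


def parse_binfmt_entry(text):
    """Parse the text of one /proc/sys/fs/binfmt_misc/<name> file into a dict."""
    entry = {"enabled": False, "interpreter": None, "flags": ""}
    for line in text.splitlines():
        line = line.strip()
        if line in ("enabled", "disabled"):
            entry["enabled"] = line == "enabled"
        elif line.startswith("interpreter "):
            entry["interpreter"] = line[len("interpreter "):]
        elif line.startswith("flags:"):
            entry["flags"] = line[len("flags:"):].strip()
    return entry


def uses_fix_binary(entry):
    """Return True if the entry was registered with the F (fix binary) flag."""
    return "F" in entry["flags"]


if __name__ == "__main__":
    # Hypothetical entry name; adjust to the arch you registered.
    path = Path("/proc/sys/fs/binfmt_misc/qemu-aarch64")
    try:
        text = path.read_text()
    except OSError:
        text = SAMPLE  # fall back to the illustrative sample
    entry = parse_binfmt_entry(text)
    print(entry["interpreter"], "F flag set:", uses_fix_binary(entry))
```

If the F flag is listed, the interpreter path shown there only had to exist when the entry was registered (inside the multiarch/qemu-user-static container), not in the container you run later.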

@zxdvd Thank you for your reply!
I'll read the documents you pointed me to.

I'm not familiar with the kernel, so it may take me a while to read the docs.

@zxdvd Excuse me, I have a question to clarify.

When you run docker run --rm --privileged multiarch/qemu-user-static --reset -p yes, it will register and LOAD those binaries.

Where are the binaries (= interpreters) loaded into?
My understanding is that
the interpreters are loaded into kernel memory for the binfmt function, and the kernel is shared between the host and containers,
so binaries whose arch differs from the host's can be executed in containers.
Please correct me if I'm wrong.

zxdvd commented

@agrexgh I think you are right. When the F flag is enabled, the interpreters are opened when they are registered, the returned file descriptor is recorded, and binfmt_misc uses that file descriptor directly instead of looking up the path and opening it.
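
A user-space analogy may help here: on a POSIX system, a file that is already open stays readable through its descriptor even after its path disappears. The Python sketch below is only an analogy under that assumption; it is not what binfmt_misc does internally, and read_after_unlink is a hypothetical helper name.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, remove its path, then read it back via the fd."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)  # the path is gone, like the interpreter path in another container
        assert not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)  # rewind; the open descriptor is still valid
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(read_after_unlink(b"still readable"))
```

In the same spirit, once the kernel holds the opened interpreter file, ls /usr/bin in a different container does not need to show qemu-*-static for the emulator to be started.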

Thank you, @zxdvd!
Now I understand how it works!
So I'll close this issue.