moby/buildkit

Network Issue with using RUN in dockerfile in Windows Builds

Closed this issue · 11 comments

When running a build with buildctl, the RUN command appears to have no network access.

nslookup and ping fail, and DNS resolution fails for anything that makes a network request.

The same Dockerfile works fine with docker, and other instructions such as ADD work too.

FROM mcr.microsoft.com/windows/servercore:ltsc2022
RUN C:/Windows/System32/WindowsPowershell/v1.0/powershell.exe -command "Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))"

This fails to resolve the DNS name community.chocolatey.org:

The remote name could not be resolved: community.chocolatey.org

This happens on the latest default Windows EKS AMIs (2019 and 2022), and also on the different variations of Windows 2019 and 2022 that we build in-house.

I'm using the latest containerd and buildkit versions:

buildkit 0.13.2 and containerd 1.7.17

I installed docker on the same host and it runs fine there.

Can you share the sample repro Dockerfile you are using, along with your platform and buildkit version details?

@profnandaa updated the issue description with the requested info.

Confirming the repro; I see DNS resolution issues too, even when using curl.exe. I suspect this has to do with the containerd CNI setup (which needs to be included in the documentation). Let me take a closer look at this tomorrow and get back to you.
/cc. @gabriel-samfira

Yes, the documentation is missing the section on setting up proper CNIs. @profnandaa, my old guide contains setup steps for the CNI.

In short, we need to add to the docs instructions on downloading the CNI binaries, writing the config for the CNI and setting up the service to use the CNIs.

After the CNIs are set up, you can register buildkitd as a service using:

& "C:\Program Files\buildkit\buildkitd.exe" `
    --register-service `
    --service-name buildkitd `
    --debug `
    --containerd-worker=true `
    --containerd-cni-config-path="C:\Program Files\containerd\cni\conf\0-containerd-nat.conf" `
    --containerd-cni-binary-dir="C:\Program Files\containerd\cni\bin" `
    --log-file="C:\Windows\Temp\containerd.log"

# Make buildkitd dependent on containerd
sc.exe config buildkitd depend= containerd

You can remove the --debug flag if you wish and set a proper path for the log. You will also need to adapt the paths to the CNI config and the CNI bin dir. The above command removes the need for nssm.

Edit: @profnandaa there are pre-built CNI binaries. We can add to the docs steps on downloading those. No need to build from source.

Thank you @gabriel-samfira and @profnandaa
I used your guide and made modifications where needed and I got it to work.

However, it works inconsistently.

One run gives me a working build; the next fails with failed to create shim task: hcs::CreateComputeSystem and The requested operation for attach namespace failed: unknown

The containerd logs don't show anything useful that I can see, even with debugging turned on. I'm not sure whether it's a containerd, CNI, or buildkit issue. But it's a work in progress!

@ehuizar1028 -- I've seen a similar issue reported previously in containerd/containerd#5729; it has to do with the CNI, and perhaps with the CIDR config. I'm currently testing out various setups to make sure we have one definitive guide for setting this up. The guide should most likely live close to the source at https://github.com/microsoft/windows-container-networking and be linked from buildkit, containerd, etc. Working on that.

@ehuizar1028 -- could you share the output of:

Get-HnsNetwork | where { $_.Name -eq 'nat' }

and the contents of your 'C:\Program Files\containerd\cni\conf\0-containerd-nat.conf'? Is there any mismatch between the CIDR (e.g. AddressPrefix=172.31.192.0/20) and what you have in your conf?

...
Subnets                : {@{AdditionalParams=; AddressPrefix=172.31.192.0/20; Flags=0; GatewayAddress=172.31.192.1; Health=;
                         ID=E018ED9C-E2EE-42E6-AB2A-81EBFED99002; IpSubnets=System.Object[]; ObjectType=5; Policies=System.Object[]; State=0}}
....

For the intermittent hcs::CreateComputeSystem failure, try:

stop-service buildkitd
stop-service containerd

Get-HnsEndpoint | Remove-HnsEndpoint

start-service containerd
start-service buildkitd

and see if that fixes it. Containerd will recreate the network in HNS automatically.

UPDATE:
From the buildkit end, it seems ipam configs are not even needed (perhaps because the network already created has the details?). I got it working with only a minimal config like this:

{
  "cniVersion": "0.3.0",
  "name": "nat",
  "type": "nat"
}

However, it fails when you add the wrong ipam details that don't match the nat network created.
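
For anyone comparing the two values by hand, here is a minimal sketch of the check discussed above: it reads a CNI conf and compares its ipam subnet against the AddressPrefix reported by Get-HnsNetwork. It is only an illustration, not part of buildkit or containerd. The function name check_cni_subnet is hypothetical, and it assumes the subnet is stored under ipam.subnet in the conf, which may not match every conf layout. Python is used purely for convenience; 10.0.0.0/16 is a made-up mismatching value.

```python
import ipaddress
import json


def check_cni_subnet(conf_text, hns_address_prefix):
    """Compare a CNI conf's ipam subnet with the HNS nat network's AddressPrefix.

    Returns "no-ipam" (nothing to mismatch), "match", or "mismatch".
    """
    conf = json.loads(conf_text)
    # Assumption: the subnet lives at conf["ipam"]["subnet"]; adjust for your conf.
    subnet = (conf.get("ipam") or {}).get("subnet")
    if subnet is None:
        return "no-ipam"
    conf_net = ipaddress.ip_network(subnet, strict=False)
    hns_net = ipaddress.ip_network(hns_address_prefix, strict=False)
    return "match" if conf_net == hns_net else "mismatch"


# The minimal config from this thread has no ipam section at all.
minimal = '{"cniVersion": "0.3.0", "name": "nat", "type": "nat"}'
# A conf whose subnet differs from the HNS network (hypothetical value).
wrong = '{"name": "nat", "type": "nat", "ipam": {"subnet": "10.0.0.0/16"}}'

hns_prefix = "172.31.192.0/20"  # AddressPrefix from the Get-HnsNetwork dump above
print(check_cni_subnet(minimal, hns_prefix))
print(check_cni_subnet(wrong, hns_prefix))
```

A "mismatch" result would correspond to the failing case described above, while "no-ipam" corresponds to the minimal config that worked.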

buildkit uses the containerd executor to spin up containers. We need to simply point it to the containerd CNI config and CNI dir and it should be fine.

@gabriel-samfira -- sounds good. Sending in the documentation update. I will also add the part on registering buildkitd as a service; I'd left that out of the initial documentation.