opensearch-project/opensearch-ci

[Bug]: Windows CI builds failing to find docker (Update to Run Windows inside docker containers)

mch2 opened this issue · 39 comments

mch2 commented

Describe the bug

Windows CI builds are failing, example: https://build.ci.opensearch.org/job/gradle-check/14914/console

+ docker logout
C:/Users/Administrator/jenkins/workspace/gradle-check@tmp/durable-392d4e2e/script.sh: line 1: docker: command not found
[Pipeline] }
[Pipeline] // script
Error when executing always post condition:
hudson.AbortException: script returned exit code 127
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.handleExit(DurableTaskStep.java:664)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.check(DurableTaskStep.java:610)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.run(DurableTaskStep.java:554)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

To reproduce

N/A

Expected behavior

Builds should pass and docker tests should run.

Hi @mch2, could you let me know how you triggered gradle_check in this case? If a PR triggered it, can you send me the PR link?
Thanks,

CC @peterzhuamazon

I will take care of this as I have talked to @mch2 offline.
Thanks.

Able to get docker running on Windows with hyperv.

Administrator@<> MINGW64 ~
$ docker version
Client:
 Version:           23.0.6
 API version:       1.42
 Go version:        go1.19.9
 Git commit:        ef23cbc
 Built:             Fri May  5 21:18:35 2023
 OS/Arch:           windows/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.6
  API version:      1.42 (minimum version 1.24)
  Go version:       go1.19.9
  Git commit:       9dbdbd4
  Built:            Fri May  5 21:17:32 2023
  OS/Arch:          windows/amd64
  Experimental:     false



Administrator@<> MINGW64 ~
$  docker pull mcr.microsoft.com/windows/nanoserver:ltsc2019
ltsc2019: Pulling from windows/nanoserver
aaaa081173ae: Pulling fs layer
aaaa081173ae: Verifying Checksum
aaaa081173ae: Download complete
aaaa081173ae: Pull complete
Digest: sha256:fb78bd84ac937f6b1453e19015ccce41636bbeca5fe5bc6dc5c7d55adb4a2bc5
Status: Downloaded newer image for mcr.microsoft.com/windows/nanoserver:ltsc2019
mcr.microsoft.com/windows/nanoserver:ltsc2019

Need @mch2 to confirm exactly which images Docker on Windows is expected to run.

On Windows, if you use Hyper-V, the Windows host can only run Windows containers.
If we need the Windows host to run Linux containers, we would have to enable WSL2 later on, which might cause issues.

Please let me know about this.
Thanks.
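Which kind of container a given host serves can be checked before deciding between Hyper-V and WSL2. A minimal sketch, assuming the docker CLI is on the PATH; `docker info --format "{{.OSType}}"` reports the daemon's container OS, while the helper names `daemon_os_type` and `can_run_image` are hypothetical and the matching rule is the simplified one described above:

```python
import subprocess


def daemon_os_type(run=subprocess.run):
    """Return the container OS the Docker daemon serves, e.g. 'windows' or 'linux'."""
    result = run(
        ["docker", "info", "--format", "{{.OSType}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().lower()


def can_run_image(image_os, daemon_os):
    """Simplified rule: an image runs only if its OS matches the daemon's container OS."""
    return image_os.strip().lower() == daemon_os.strip().lower()


# Example use on a host with Docker installed:
#   os_type = daemon_os_type()
#   print("can run linux images:", can_run_image("linux", os_type))
```

The `run` parameter exists only so the check can be exercised without a Docker daemon.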

Also, this can be a good start on these two issues, to bring Windows integTest onto a Docker host with containers, and even to build the artifacts in Windows Docker containers.

Here's a chart comparing some of the key differences between Windows Server with Server Core installation and Windows Nano Server:

| Feature | Windows Server with Server Core | Windows Nano Server |
| --- | --- | --- |
| Installation size | Larger (several GBs) | Smaller (a few hundred MBs) |
| Attack surface | Larger | Smaller |
| Support for GUI | Yes (minimal) | No |
| Support for 32-bit applications | Yes | No |
| Support for Windows Services | Yes | Limited |
| Support for .NET Framework | Yes | Limited |
| Support for Containers | Yes | Yes |
| Licensing | Standard, Datacenter | Standard, Datacenter |
| Available editions | All Windows Server editions | Standard and Datacenter only |

Will try to see if we can bring in nanoserver to make Windows lightweight for build, test, and check.

Thanks.

I eventually got a Docker container running nanoserver on Windows:


PS C:\Users\Administrator> docker images
REPOSITORY                             TAG        IMAGE ID       CREATED       SIZE
mcr.microsoft.com/windows/nanoserver   ltsc2019   82ef3885248c   2 weeks ago   252MB

PS C:\Users\Administrator> docker run 82ef3885248c
Microsoft Windows [Version 10.0.17763.4645]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\>

PS C:\Users\Administrator> docker ps -a
CONTAINER ID   IMAGE          COMMAND                    CREATED              STATUS                          PORTS     NAMES
4aced3bb72dd   82ef3885248c   "c:\\windows\\system32…"   About a minute ago   Exited (0) About a minute ago             blissful_liskov

PS C:\Users\Administrator> docker rm 4aced3bb72dd
4aced3bb72dd

PRs:

  • Updating

We will be better off with the servercore option rather than nanoserver, as the latter lacks several core components, while servercore is just a headless version of the normal Windows Server base.

https://techcommunity.microsoft.com/t5/containers/nano-server-x-server-core-x-server-which-base-image-is-the-right/ba-p/2835785

Issue in the Windows Docker container that I am currently not able to solve, which keeps it from behaving the same as the AMI:
Move-Item : Access to the path is denied.

moby/moby#38256
microsoft/Windows-Containers#147

Just confirmed that I am using --isolation=process, not --isolation=hyperv.

Able to resolve the move issue by using mingw and forcing the mv through bash.exe.

bash.exe -c "mv -v 'C:\\Windows\\System32\\find.exe' 'C:\\Windows\\System32\\find_windows.exe'"

renamed 'C:\Windows\System32\find.exe' -> 'C:\Windows\System32\find_windows.exe'

Seems like an issue with volta 1.1.1: volta-cli/volta#1435

Will revert to either the older 1.0.8 or 1.1.0 now.

Thanks.

Able to invoke bash.exe directly in the windows container and able to run test workflow:

ContainerAdministrator@44082dfc4844 MINGW64 /c
$ whoami
ContainerAdministrator


New issues:

windows [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate

Tried many methods, including installing pip-system-certs, scoop install cacerts, installing certifi, manually adding the Mozilla CA certs to the certifi certs, exporting REQUESTS_CA_BUNDLE, etc.

Right now the only method that seems to work is using curl to pull the zip once, such as curl https://ci.opensearch.org/ci/dbc/distribution-build-opensearch/2.9.0/8184/windows/x64/zip/dist/opensearch/opensearch-2.9.0-windows-x64.zip -o test.sh, so that the CloudFront public cert gets added once to the certifi certs or the system CA cert bundle. After that, the Python requests package within the Windows Docker container is able to do SSL verification correctly.

Very weird and probably I missed something here. Thanks.
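One way to narrow this down is to look at which CA sources Python consults and whether a verified handshake succeeds with a specific bundle. This is only a diagnostic sketch using the standard library, not a fix; the helper names `ca_sources` and `handshake_ok` are hypothetical. Note that requests uses the certifi bundle by default unless REQUESTS_CA_BUNDLE is set, so passing the certifi bundle path as `cafile` is the closest stdlib approximation of what requests verifies against.

```python
import os
import socket
import ssl


def ca_sources(env=None):
    """Collect the CA locations that OpenSSL and common env vars point at."""
    env = os.environ if env is None else env
    paths = ssl.get_default_verify_paths()
    return {
        "openssl_cafile": paths.cafile,
        "openssl_capath": paths.capath,
        "SSL_CERT_FILE": env.get("SSL_CERT_FILE"),
        "REQUESTS_CA_BUNDLE": env.get("REQUESTS_CA_BUNDLE"),
    }


def handshake_ok(host, port=443, cafile=None, timeout=10):
    """Return True if a TLS handshake with certificate verification succeeds.

    With cafile=None the platform default trust store is loaded;
    with a path, only that bundle is trusted.
    """
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLCertVerificationError:
        return False


# Example use inside the container (requires network access):
#   print(ca_sources())
#   print(handshake_ok("ci.opensearch.org"))
```

Comparing the result with the default trust store against the result with the certifi bundle would show which of the two stores is missing the issuer.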

A new way that is supposed to work correctly but still does not; just curl ci.opensearch.org for now, as it is stable:



ContainerAdministrator@0062c7841faa MINGW64 ~/opensearch-build-peterzhuamazon (windows-docker-setups-2)
$ openssl s_client -connect ci.opensearch.org:443 </dev/null | openssl x509 -outform PEM > certificate2.crt
depth=2 C = US, O = Amazon, CN = Amazon Root CA 1
verify return:1
depth=1 C = US, O = Amazon, CN = Amazon RSA 2048 M01
verify return:1
depth=0 CN = ci.opensearch.org
verify return:1
DONE

ContainerAdministrator@0062c7841faa MINGW64 ~/opensearch-build-peterzhuamazon (windows-docker-setups-2)
$ vi certificate2.crt

ContainerAdministrator@0062c7841faa MINGW64 ~/opensearch-build-peterzhuamazon (windows-docker-setups-2)
$ certutil -addstore CA certificate2.crt
CA "Intermediate Certification Authorities"
Certificate "ci.opensearch.org" added to store.
CertUtil: -addstore command completed successfully.

Suddenly seeing issues on the Windows integTest with Zelin's fix now:
win-integtest-issues20230807.txt

Reproduced on new windows ec2 server:
linux-windows-knn.log

Trying to push the image of windows docker to dockerhub:

Administrator@EC2AMAZ-6B9Q6PN MINGW64 ~/opensearch-build ((2.9.0))
$ docker images
REPOSITORY                             TAG                            IMAGE ID       CREATED       SIZE
opensearchstaging/ci-runner            testwindowsagain2              86fe8dda8a9f   2 days ago    8.9GB
opensearchstaging/ci-runner            windows2019-servercore-test1   86fe8dda8a9f   2 days ago    8.9GB
opensearchstaging/ci-runner            testwindows2019-user           244f0de9c472   12 days ago   10.5GB
opensearchstaging/ci-runner            testwindows2019                3c134658ffbb   13 days ago   10.6GB
mcr.microsoft.com/windows/servercore   ltsc2019                       67667e0b9c95   4 weeks ago   4.38GB
mcr.microsoft.com/windows/nanoserver   ltsc2019                       82ef3885248c   4 weeks ago   252MB

Administrator@EC2AMAZ-6B9Q6PN MINGW64 ~/opensearch-build ((2.9.0))
$ docker push opensearchstaging/ci-runner:windows2019-servercore-test1
The push refers to repository [docker.io/opensearchstaging/ci-runner]
10d36872fef9: Pushed
c7c5acd32d49: Pushed
33b0605bff63: Pushed
17a1ee0cab1d: Pushed
701fc89ba113: Pushed
d9b90de3477f: Pushing [=======>                                           ]    720MB/4.51GB
bd22b31d5d10: Pushed
325c8c82006f: Pushed
84079ad09eb0: Pushed
da2d874340bd: Pushing [===================>                               ]  1.438GB/3.666GB

I have noticed that the Windows permission error occurs when running through the build repo code, not when running directly within the k-NN repo.

Somehow the new Windows AMI is creating two user folders:
Administrator vs Administrator.EC2AMAZ<>.

The second user used to be available only when logging in through SSM or RDP, and did not affect the actual Administrator user's content.

However, the installation is now suddenly split across both accounts, and even if you log in as Administrator over RDP it defaults you to the Administrator.EC2AMAZ<> user.

This may be caused either by a new AMI provided by EC2 or by git bash(?), not sure. Testing a build of the old code now to confirm this behavior.

Seems like it is caused by either the docker pkg, hyperv, or the bcdedit setup.
Testing them one by one, building a new image for each.
Thanks.

The above issues are all caused by this command:

dockerd --register-service

Seems like this does no harm if I log in as Administrator over RDP.
But if I run it through a PowerShell script, it splits the docker part off to a secondary owner, moves everything else to that new owner, and keeps only dockerd itself under Administrator.

This can still be resolved by running the registration at startup time of the runner.

Resolved by using an init script and avoiding embedding the service registration in the packer scripts.

echo %USERNAME% && START /MIN dockerd && timeout 5 && docker ps
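The fixed `timeout 5` in the command above assumes dockerd is ready within five seconds. A polling alternative could look like the sketch below, assuming Python is available on the agent (an assumption; the init script in this thread is plain cmd). The helper names `docker_ready` and `wait_for_docker` and the default attempt count are hypothetical.

```python
import subprocess
import time


def docker_ready(run=subprocess.run):
    """True when `docker ps` can reach the daemon and exits with code 0."""
    try:
        return run(["docker", "ps"], capture_output=True).returncode == 0
    except FileNotFoundError:
        # The docker CLI itself is not on the PATH.
        return False


def wait_for_docker(check=docker_ready, attempts=12, delay=5.0, sleep=time.sleep):
    """Poll until the daemon answers; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"docker daemon not ready after {attempts} attempts")


# Example use in an init step:
#   wait_for_docker()
```

Polling trades a fixed wait for a bounded one: a fast daemon is detected on the first attempt, and a slow one gets up to `attempts * delay` seconds before the init step fails loudly.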

The docker host and build of Windows Runner is up and running on staging Jenkins now:


Executing init script

C:\Users\Administrator>echo Administrator   && dockerd --register-service   && net start docker   && echo "started docker deamon"   && docker ps 
Administrator 
The Docker Engine service is starting.
The Docker Engine service was started successfully.

"started docker deamon" 
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
init script ran successfully
remoting.jar sent remotely. Bootstrapping it
Launching via WinRM:java  -jar C:\Windows\Temp\remoting.jar -workDir C:/Users/Administrator/jenkins
<===[JENKINS REMOTING CAPACITY]===>Remoting version: 3107.v665000b_51092
Launcher: EC2WindowsLauncher
Communication Protocol: Standard in/out
This is a Windows agent

I have confirmed that, with docker copy, the Linux runner is able to copy Windows Docker images across registries:
https://build.ci.opensearch.org/job/docker-copy/664/console

Able to build windows container within windows container just like linux docker in docker:
"//./pipe/docker_engine://./pipe/docker_engine"


PS C:\Users\ContainerAdministrator\opensearch-build\docker\ci> bash

ContainerAdministrator@0d2f4f1f80e9 MINGW64 ~/opensearch-build/docker/ci (main)
$ ./build-image-single-arch.sh -r ci-runner -v windows2019-servercore-test2 -f dockerfiles/current/build.windows2019.servercore.x64.dockerfile
windows2019-servercore-test2 dockerfiles/current/build.windows2019.servercore.x64.dockerfile
Sending build context to Docker daemon  116.7kB
Step 1/10 : ARG ServerCoreRepo=mcr.microsoft.com/windows/servercore
Step 2/10 : FROM ${ServerCoreRepo}:ltsc2019
 ---> 67667e0b9c95
Step 3/10 : USER ContainerAdministrator
 ---> Using cache
 ---> 24b3e060da38
Step 4/10 : COPY config/windows-servercore-setup.ps1 ./
 ---> 8ded082f13d6
Step 5/10 : RUN powershell ./windows-servercore-setup.ps1
 ---> Running in 31535c4f165e
Initializing...
Downloading ...
Extracting...
Creating shim...
Adding ~\scoop\shims to your path.
Scoop was installed successfully!
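The "//./pipe/docker_engine://./pipe/docker_engine" string quoted before the transcript reads as a volume spec that maps the host's Docker named pipe into the container. A sketch of how such a `docker run` command line might be assembled, assuming the image already contains a docker CLI (not confirmed in this thread); the function name and the `--rm` flag are illustrative choices, not taken from the original setup:

```python
DOCKER_PIPE = "//./pipe/docker_engine"


def docker_in_docker_command(image, inner_command, isolation="process"):
    """Build a `docker run` argv that shares the host's Docker named pipe."""
    return [
        "docker",
        "run",
        "--rm",
        f"--isolation={isolation}",
        # Same path on both sides: host pipe -> container pipe.
        "-v",
        f"{DOCKER_PIPE}:{DOCKER_PIPE}",
        image,
        *inner_command,
    ]


if __name__ == "__main__":
    argv = docker_in_docker_command(
        "opensearchstaging/ci-runner:windows2019-servercore-test1",
        ["docker", "ps"],
    )
    print(" ".join(argv))
```

Because the pipe is shared, the docker CLI inside the container talks to the host's daemon, so images built "inside" land in the host's image store.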

Startup time of the windows agent is now reduced from 15-17 min or so to 5-7 min, see new vs old in the log:

windows startup timelog
EC2 (Amazon_ec2_cloud) - jenkinsAgentNode-Jenkins-Agent-Windows2019-X64-C54xlarge-Single-Host (i-0c08d63c7762f8c03) booted at 1692304351000
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Connecting to (10.0.104.151) with WinRM as Administrator
Connected with WinRM.
Creating tmp directory if it does not exist
Executing init script

C:\Users\Administrator>echo Administrator   && dockerd --register-service     && net start docker   && echo "started docker deamon"   && docker ps 
Administrator 
The Docker Engine service is starting.
The Docker Engine service was started successfully.

"started docker deamon" 
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
init script ran successfully
remoting.jar sent remotely. Bootstrapping it
Launching via WinRM:java  -jar C:\Windows\Temp\remoting.jar -workDir C:/Users/Administrator/jenkins
<===[JENKINS REMOTING CAPACITY]===>Remoting version: 3107.v665000b_51092
Launcher: EC2WindowsLauncher
Communication Protocol: Standard in/out
This is a Windows agent

EC2 (Amazon_ec2_cloud) - jenkinsAgentNode-Jenkins-Agent-Windows2019-X64-C524xlarge-Single-Host (i-07b6a8616ae04e880) booted at 1692299057000
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Connecting to (10.0.110.144) with WinRM as Administrator
Waiting for WinRM to come up. Sleeping 10s.
Waiting for WinRM to come up. Sleeping 10s.
Waiting for WinRM to come up. Sleeping 10s.
Waiting for WinRM to come up. Sleeping 10s.
Connected with WinRM.
Creating tmp directory if it does not exist
Executing init script

C:\Users\Administrator>echo
ECHO is on.
init script ran successfully
remoting.jar sent remotely. Bootstrapping it
Launching via WinRM:java -jar C:\Windows\Temp\remoting.jar -workDir C:/Users/Administrator/jenkins
<===[JENKINS REMOTING CAPACITY]===>Remoting version: 3107.v665000b_51092
Launcher: EC2WindowsLauncher
Communication Protocol: Standard in/out
This is a Windows agent
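The time spent waiting can be estimated from the logs themselves by summing the "Sleeping 10s" lines. A small sketch, with a hypothetical helper name; it counts only the logged sleeps, not the time of the init script or the remoting bootstrap:

```python
import re

# Matches both "Waiting for password to be available. Sleeping 10s."
# and "Waiting for WinRM to come up. Sleeping 10s."
WAIT_LINE = re.compile(r"^Waiting for .+ Sleeping (\d+)s\.$")


def waited_seconds(log_text):
    """Sum the sleep durations reported by 'Waiting for ...' log lines."""
    total = 0
    for line in log_text.splitlines():
        match = WAIT_LINE.match(line.strip())
        if match:
            total += int(match.group(1))
    return total


if __name__ == "__main__":
    sample = "\n".join(
        ["Waiting for password to be available. Sleeping 10s."] * 3
        + ["Waiting for WinRM to come up. Sleeping 10s.", "Connected with WinRM."]
    )
    print(waited_seconds(sample))  # 40
```

Applied to the two logs above, the first contains 41 wait lines (410 s, about 6.8 min) and the second 85 (81 for the password plus 4 for WinRM, 850 s, about 14.2 min).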



Runs well with docker build support on production now:
https://build.ci.opensearch.org/job/docker-build/3629/console

Windows image extraction is very slow; we need pigz to speed up the pull.

pigz is not available through scoop on Windows, so the binary needs to be installed directly.

Seems like MOBY_DISABLE_PIGZ is used to disable the unpigz behavior, so unpigz should be enabled by default.

pigz seems to run only if you put it in a directory at the root of C:, like C:\pigz, and add that to the machine env vars.
It seems pigz only saves time while the extraction is happening, but most of the time is spent after the extraction.
It does not seem that pigz will improve the time that much:

Without pigz:

$ time docker pull opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
ci-runner-windows2019-servercore-opensearch-build-v1: Pulling from opensearchstaging/ci-runner
c9226d61d3bd: Already exists
b95f433aa7d9: Pull complete
00e36bb1af6a: Pull complete
96b3ca42606a: Pull complete
eba42434ce94: Pull complete
69c589335db3: Pull complete
0ec633f2f60c: Pull complete
21200ab93e1b: Pull complete
bc161862b081: Pull complete
c65a5ac1ea31: Pull complete
Digest: sha256:b6ba005996340062f68137fe7cf3e17cd3d61bdb9a5df944f276905df795dd0e
Status: Downloaded newer image for opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
docker.io/opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1

real    12m39.993s
user    0m0.000s
sys     0m0.015s

With pigz:


$ time docker pull opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
ci-runner-windows2019-servercore-opensearch-build-v1: Pulling from opensearchstaging/ci-runner
c9226d61d3bd: Already exists
b95f433aa7d9: Pull complete
00e36bb1af6a: Pull complete
96b3ca42606a: Pull complete
eba42434ce94: Pull complete
69c589335db3: Pull complete
0ec633f2f60c: Pull complete
21200ab93e1b: Pull complete
bc161862b081: Pull complete
c65a5ac1ea31: Pull complete
Digest: sha256:b6ba005996340062f68137fe7cf3e17cd3d61bdb9a5df944f276905df795dd0e
Status: Downloaded newer image for opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
docker.io/opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1

real    12m29.576s
user    0m0.015s
sys     0m0.000s

Saved 10 seconds.

Some more tests:

Without pigz:

$ time docker pull opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
ci-runner-windows2019-servercore-opensearch-build-v1: Pulling from opensearchstaging/ci-runner
c9226d61d3bd: Already exists
b95f433aa7d9: Pull complete
00e36bb1af6a: Pull complete
96b3ca42606a: Pull complete
eba42434ce94: Pull complete
69c589335db3: Pull complete
0ec633f2f60c: Pull complete
21200ab93e1b: Pull complete
bc161862b081: Pull complete
c65a5ac1ea31: Pull complete
Digest: sha256:b6ba005996340062f68137fe7cf3e17cd3d61bdb9a5df944f276905df795dd0e
Status: Downloaded newer image for opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
docker.io/opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1

real    5m34.866s
user    0m0.000s
sys     0m0.015s

With pigz:


$ time  docker pull opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
ci-runner-windows2019-servercore-opensearch-build-v1: Pulling from opensearchstaging/ci-runner
c9226d61d3bd: Already exists
b95f433aa7d9: Pull complete
00e36bb1af6a: Pull complete
96b3ca42606a: Pull complete
eba42434ce94: Pull complete
69c589335db3: Pull complete
0ec633f2f60c: Pull complete
21200ab93e1b: Pull complete
bc161862b081: Pull complete
c65a5ac1ea31: Pull complete
Digest: sha256:b6ba005996340062f68137fe7cf3e17cd3d61bdb9a5df944f276905df795dd0e
Status: Downloaded newer image for opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1
docker.io/opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1

real    4m45.069s
user    0m0.000s
sys     0m0.016s
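The savings in the two test pairs above can be worked out from the `real` values. A small sketch, with hypothetical helper names, that parses the `time` output format and computes the absolute and relative difference:

```python
import re

REAL_TIME = re.compile(r"(\d+)m(\d+(?:\.\d+)?)s")


def to_seconds(real):
    """Convert a `time` value such as '12m39.993s' to seconds."""
    match = REAL_TIME.fullmatch(real.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {real!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def saving(without_pigz, with_pigz):
    """Return (seconds saved, percent saved) relative to the run without pigz."""
    before = to_seconds(without_pigz)
    after = to_seconds(with_pigz)
    saved = before - after
    return saved, 100.0 * saved / before


if __name__ == "__main__":
    pairs = [("12m39.993s", "12m29.576s"), ("5m34.866s", "4m45.069s")]
    for before, after in pairs:
        saved, percent = saving(before, after)
        print(f"{before} -> {after}: saved {saved:.1f}s ({percent:.1f}%)")
```

For the first pair this gives about 10.4 s (1.4%), and for the second about 49.8 s (14.9%). The two pairs differ by several minutes in total pull time, so run-to-run variation may be larger than the effect of pigz itself.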

git clone of the build repo on the Windows host is now instant.

There is a bug right now: every time we pull the image fresh, the run always fails once at the sh stage.
I suspect we need to pre-load the image on the runner beforehand.
It succeeds soon after on the second run; the first attempt fails with:

ERROR: script returned exit code 127

Add a docker image initialization step on Windows Docker Host to resolve above issues.
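An image initialization step of this kind could be as simple as pulling the needed images before the agent accepts jobs. The sketch below is an assumption about the shape of such a step, not the actual change; the function name is hypothetical, and which images the real step pulls is not stated in this thread.

```python
import subprocess


def preload_images(images, run=subprocess.run):
    """Pull each image up front; return the ones that failed to pull."""
    failed = []
    for image in images:
        result = run(["docker", "pull", image])
        if result.returncode != 0:
            failed.append(image)
    return failed


# Example use during agent initialization, with an image name taken from
# the pull tests earlier in this thread:
#   failed = preload_images([
#       "opensearchstaging/ci-runner:ci-runner-windows2019-servercore-opensearch-build-v1",
#   ])
#   if failed:
#       raise SystemExit(f"could not preload: {failed}")
```

With the image already extracted on the host, the first sh stage no longer has to wait for a multi-minute pull.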

Add new integTest support with Windows container now.