Jenkins Agents always install the latest docker instead of pinned version
timharsch opened this issue · 4 comments
TemplateID: jenkins/jenkins2-ha-agents
Region: us-east-2
TL;DR: skip to the last paragraph.
I've been running the Jenkins template for daily builds without issue for the last 6 months. One day I was surprised to see our builds start failing. When a build used Docker, the containers would build, but running a command in them would hang. I traced the problem to Amazon releasing a new version of Docker in amazon-linux-extras. I manually logged into a node and downgraded to the latest version from before the incident via sudo yum downgrade docker-19.03.13ce-1.amzn2
This fixed the issue, and builds work again as long as that node remains up. The next task is to figure out how to get the agents to use the pinned version, so that it survives restarts. I found this line in the template:
aws-cf-templates/jenkins/jenkins2-ha-agents.yaml
Line 1780 in 714d8b9
Changing that version, however, did not help. I also found that my agents get Docker version '20.10.4-1' installed despite the line declaring version 18, as can be seen in this session I ran locally:
sh-4.2$ sudo amazon-linux-extras enable docker=18.06.1 | grep docker
20 docker=18.06.1 enabled \
# yum install docker
sh-4.2$ sudo yum install docker
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2extra-corretto8 | 3.0 kB 00:00:00
amzn2extra-docker | 1.3 kB 00:00:00
Not using downloaded amzn2extra-docker/repomd.xml because it is older than what we have:
Current : Tue Apr 13 20:43:14 2021
Downloaded: Tue Jul 16 20:44:06 2019
Resolving Dependencies
--> Running transaction check
---> Package docker.x86_64 0:20.10.4-1.amzn2 will be installed
--> Processing Dependency: runc >= 1.0.0 for package: docker-20.10.4-1.amzn2.x86_64
--> Processing Dependency: libcgroup >= 0.40.rc1-5.15 for package: docker-20.10.4-1.amzn2.x86_64
--> Processing Dependency: containerd >= 1.3.2 for package: docker-20.10.4-1.amzn2.x86_64
--> Processing Dependency: pigz for package: docker-20.10.4-1.amzn2.x86_64
--> Running transaction check
---> Package containerd.x86_64 0:1.4.4-1.amzn2 will be installed
---> Package libcgroup.x86_64 0:0.41-21.amzn2 will be installed
---> Package pigz.x86_64 0:2.3.4-1.amzn2.0.1 will be installed
---> Package runc.x86_64 0:1.0.0-0.1.20210225.git12644e6.amzn2 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
===============================================================================================================================================================================
Package Arch Version Repository Size
===============================================================================================================================================================================
Installing:
docker x86_64 20.10.4-1.amzn2 amzn2extra-docker 32 M
Installing for dependencies:
containerd x86_64 1.4.4-1.amzn2 amzn2extra-docker 24 M
libcgroup x86_64 0.41-21.amzn2 amzn2-core 66 k
pigz x86_64 2.3.4-1.amzn2.0.1 amzn2-core 81 k
runc x86_64 1.0.0-0.1.20210225.git12644e6.amzn2 amzn2extra-docker 3.2 M
Transaction Summary
===============================================================================================================================================================================
Install 1 Package (+4 Dependent packages)
Total download size: 59 M
Installed size: 243 M
Is this ok [y/d/N]:
Note the last three lines of the output of the amazon-linux-extras enable docker=18.06.1 command:
...
Now you can install:
# yum clean metadata
# yum install docker
If I add the missing step prescribed in the final three lines of the output, it works:
sh-4.2$ sudo yum clean metadata
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
Cleaning repos: amzn2-core amzn2extra-corretto8 amzn2extra-docker nodesource yarn
17 metadata files removed
10 sqlite files removed
0 metadata files removed
sh-4.2$ sudo yum install docker
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2-core | 3.7 kB 00:00:00
amzn2extra-corretto8 | 3.0 kB 00:00:00
amzn2extra-docker | 1.3 kB 00:00:00
nodesource | 2.5 kB 00:00:00
yarn | 2.9 kB 00:00:00
(1/8): amzn2-core/2/x86_64/group_gz | 2.5 kB 00:00:00
(2/8): amzn2-core/2/x86_64/updateinfo | 367 kB 00:00:00
(3/8): amzn2extra-corretto8/2/x86_64/primary_db | 79 kB 00:00:00
(4/8): amzn2extra-corretto8/2/x86_64/updateinfo | 76 B 00:00:00
(5/8): nodesource/x86_64/primary_db | 39 kB 00:00:00
(6/8): yarn/primary_db | 22 kB 00:00:00
(7/8): amzn2extra-docker/2/x86_64/primary_db | 56 kB 00:00:00
(8/8): amzn2-core/2/x86_64/primary_db | 52 MB 00:00:00
Resolving Dependencies
--> Running transaction check
---> Package docker.x86_64 0:18.06.1ce-10.amzn2 will be installed
--> Processing Dependency: pigz for package: docker-18.06.1ce-10.amzn2.x86_64
--> Processing Dependency: libcgroup for package: docker-18.06.1ce-10.amzn2.x86_64
--> Processing Dependency: libltdl.so.7()(64bit) for package: docker-18.06.1ce-10.amzn2.x86_64
--> Running transaction check
---> Package libcgroup.x86_64 0:0.41-21.amzn2 will be installed
---> Package libtool-ltdl.x86_64 0:2.4.2-22.2.amzn2.0.2 will be installed
---> Package pigz.x86_64 0:2.3.4-1.amzn2.0.1 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
===============================================================================================================================================================================
Package Arch Version Repository Size
===============================================================================================================================================================================
Installing:
docker x86_64 18.06.1ce-10.amzn2 amzn2extra-docker 37 M
Installing for dependencies:
libcgroup x86_64 0.41-21.amzn2 amzn2-core 66 k
libtool-ltdl x86_64 2.4.2-22.2.amzn2.0.2 amzn2-core 49 k
pigz x86_64 2.3.4-1.amzn2.0.1 amzn2-core 81 k
Transaction Summary
===============================================================================================================================================================================
Install 1 Package (+3 Dependent packages)
Total download size: 37 M
Installed size: 151 M
Is this ok [y/d/N]:
So I think the issue could be solved by finding the spot in the template where yum clean metadata can be run. That would get Docker version 18 installed on new agents. I'd also like to figure out how to bring it up to the latest version I can use, which is 19.03.13ce-1 at this time (at least until the issue with Docker version 20 is found). I can't seem to find the correct version to pass to the enable command, though: amazon-linux-extras enable docker=docker-19.03.13ce-1 doesn't seem to work; the version can't be found.
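A minimal sketch of the sequence that worked in the session above (enable the pinned topic, clean the yum metadata, then install), assuming it runs as root. The names run, install_pinned_docker and DRY_RUN are hypothetical helpers for illustration and are not part of the template; the dry-run mode only prints the commands so the order can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: pin Docker through amazon-linux-extras, then refresh yum metadata
# before installing. Helper names and DRY_RUN are assumptions for illustration.

run() {
  # In dry-run mode print the command; otherwise execute it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

install_pinned_docker() {
  local version="$1"   # a version offered by amazon-linux-extras, e.g. 18.06.1
  run amazon-linux-extras enable "docker=${version}" &&
    run yum clean metadata &&   # the step that was missing in the failing run
    run yum install -y docker
}

# Preview the commands without touching the system.
DRY_RUN=1 install_pinned_docker 18.06.1
```

Without DRY_RUN=1 the same function would execute the three commands for real, stopping at the first one that fails.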
I believe I found the problem. The test for the a_enable_docker command was failing because the section '[amzn2extra-docker]' was already in amzn2-extras.repo and was set to latest (a dependency of some prior tool? enabled by default? changed in some newer version of Amazon Linux since the template was written?). Because the test fails, the command is never run. I removed the test line and added yum clean metadata to the command, and now version 18.06.1 is installed on the nodes. Here's a diff:
'a_enable_docker':
- command: 'amazon-linux-extras enable docker=18.06.1'
- test: "! grep -Fxq '[amzn2extra-docker]' /etc/yum.repos.d/amzn2-extras.repo"
+ command: 'amazon-linux-extras enable docker=18.06.1; yum clean metadata'
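To illustrate why the original test skipped the command: cfn-init runs a command only when its test exits with status 0, and the negated grep exits non-zero as soon as the section is already present. The sketch below reproduces the same expression against a temporary file; the function name test_enable_docker and the sample file contents are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: the template's test expression, with the repo file as a parameter.

test_enable_docker() {
  # Exit 0 (command would run) only if the exact section line is absent.
  ! grep -Fxq '[amzn2extra-docker]' "$1"
}

repo_file="$(mktemp)"

# Case 1: the docker section is not in the repo file yet.
printf '[amzn2extra-corretto8]\nenabled = 1\n' > "$repo_file"
if test_enable_docker "$repo_file"; then
  echo "section absent: command runs"
else
  echo "section present: command skipped"
fi

# Case 2: the docker section already exists, as on the failing agents.
printf '[amzn2extra-docker]\nenabled = 1\n' >> "$repo_file"
if test_enable_docker "$repo_file"; then
  echo "section absent: command runs"
else
  echo "section present: command skipped"
fi

rm -f "$repo_file"
```

The second case matches the "Exited with error code 1" entry that cfn-init logs for the test.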
It helped once I found that the logs are in /var/log/cfn-init.log, where I noticed this:
2021-05-02 16:21:07,651 P3079 [INFO] Test for Command a_enable_docker
2021-05-02 16:21:07,654 P3079 [ERROR] Exited with error code 1
2021-05-02 16:21:07,655 P3079 [INFO] ============================================================
2021-05-02 16:21:07,655 P3079 [INFO] Test for Command b_enable_corretto8
2021-05-02 16:21:07,658 P3079 [INFO] Completed successfully.
2021-05-02 16:21:07,658 P3079 [INFO] ============================================================
2021-05-02 16:21:07,658 P3079 [INFO] Command b_enable_corretto8
2021-05-02 16:21:07,891 P3079 [INFO] -----------------------Command Output-----------------------
2021-05-02 16:21:07,891 P3079 [INFO] 0 ansible2 available \
2021-05-02 16:21:07,891 P3079 [INFO] [ =2.4.2 =2.4.6 =2.8 =stable ]
...
2021-05-02 16:21:07,892 P3079 [INFO] 20 docker=latest enabled \
2021-05-02 16:21:07,892 P3079 [INFO] [ =17.12.1 =18.03.1 =18.06.1 =18.09.9 =stable ]
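A small sketch for spotting skipped commands like this one in the log. It assumes the layout seen in the excerpt above, where a "Test for Command <name>" line is immediately followed by an "[ERROR] Exited with error code" line when the test fails; other cfn-init versions may log differently. The helper name skipped_commands is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list cfn-init commands whose test failed (so the command was skipped).
# Relies on the log layout shown in the excerpt; treat that as an assumption.

skipped_commands() {
  awk '
    /\[INFO\] Test for Command /        { pending = $NF; next }
    /\[ERROR\] Exited with error code / { if (pending != "") print pending }
                                        { pending = "" }
  ' "$1"
}

# Demonstrate on a copy of the lines from the excerpt.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
2021-05-02 16:21:07,651 P3079 [INFO] Test for Command a_enable_docker
2021-05-02 16:21:07,654 P3079 [ERROR] Exited with error code 1
2021-05-02 16:21:07,655 P3079 [INFO] ============================================================
2021-05-02 16:21:07,655 P3079 [INFO] Test for Command b_enable_corretto8
2021-05-02 16:21:07,658 P3079 [INFO] Completed successfully.
EOF
skipped_commands "$sample"   # prints: a_enable_docker
rm -f "$sample"
```

On an agent, the same function could be pointed at /var/log/cfn-init.log.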
NOTE: 18.09.9 would not install for me because the source could not be found, so 18.06.1 is the most recent version of Docker available using this method of installation. I would have liked to use 19.03.13ce-1 as well, but that is not a valid option for amazon-linux-extras. An alternative may be to use amazon-linux-extras enable docker and then add the version to the line where yum does the install, but unfortunately I'm out of time for further exploration of this issue.
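For anyone who picks this up, the alternative might look roughly like the sketch below. It is untested on a real agent: it assumes that the package name docker-19.03.13ce-1.amzn2 (the one yum downgrade accepted earlier) is also accepted by yum install once the docker topic is enabled without a version. The function name docker_pin_command is hypothetical, and the sketch only composes the command line for review; it does not run anything.

```shell
#!/usr/bin/env bash
# Sketch (untested assumption): enable the docker topic unpinned, then pin the
# exact RPM at install time. This function only prints the command line.

docker_pin_command() {
  # $1 is the full version-release string, e.g. 19.03.13ce-1.amzn2
  printf 'amazon-linux-extras enable docker; yum clean metadata; yum install -y docker-%s\n' "$1"
}

docker_pin_command 19.03.13ce-1.amzn2
```

Whether the printed line belongs in the template's command entry or in its package install step is left open here, since that depends on how the template installs Docker.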
Thanks for the bug report. I'm working on a fix.
@timharsch could you confirm that the fix in #551 fixes your issue?
I can't confirm, because I don't run the latest Jenkins template, but a modified version from some number of releases back. I did review your PR, though, and from a static analysis viewpoint it looks fine to me.