widdix/aws-cf-templates

Jenkins agents always install the latest Docker instead of the pinned version

timharsch opened this issue · 4 comments

TemplateID: jenkins/jenkins2-ha-agents
Region: us-east-2

TL;DR: skip to the last paragraph.

I've been running the Jenkins template for daily builds without issue for the last 6 months. One day I was surprised to see our builds start failing. When a build used Docker, the containers would build, but running a command in them would hang. I traced the problem to Amazon releasing a new version of Docker in amazon-linux-extras. I manually logged into a node and downgraded Docker to the last version available before the incident: sudo yum downgrade docker-19.03.13ce-1.amzn2

This fixed the issue, and builds keep working as long as that node remains up. The next task is to get the agents to use the pinned version, so that it is also installed when agents restart. I found this line in the template:

command: 'amazon-linux-extras enable docker=18.06.1'

Changing that version, however, did not help. I also found that my agents get Docker version 20.10.4-1 installed even though the line declares version 18.06.1, as can be seen in this session I ran locally:

sh-4.2$ sudo amazon-linux-extras enable docker=18.06.1 | grep docker
 20  docker=18.06.1           enabled      \
 # yum install docker
sh-4.2$ sudo yum install docker
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2extra-corretto8                                                                                                                                    | 3.0 kB  00:00:00
amzn2extra-docker                                                                                                                                       | 1.3 kB  00:00:00
Not using downloaded amzn2extra-docker/repomd.xml because it is older than what we have:
  Current   : Tue Apr 13 20:43:14 2021
  Downloaded: Tue Jul 16 20:44:06 2019
Resolving Dependencies
--> Running transaction check
---> Package docker.x86_64 0:20.10.4-1.amzn2 will be installed
--> Processing Dependency: runc >= 1.0.0 for package: docker-20.10.4-1.amzn2.x86_64
--> Processing Dependency: libcgroup >= 0.40.rc1-5.15 for package: docker-20.10.4-1.amzn2.x86_64
--> Processing Dependency: containerd >= 1.3.2 for package: docker-20.10.4-1.amzn2.x86_64
--> Processing Dependency: pigz for package: docker-20.10.4-1.amzn2.x86_64
--> Running transaction check
---> Package containerd.x86_64 0:1.4.4-1.amzn2 will be installed
---> Package libcgroup.x86_64 0:0.41-21.amzn2 will be installed
---> Package pigz.x86_64 0:2.3.4-1.amzn2.0.1 will be installed
---> Package runc.x86_64 0:1.0.0-0.1.20210225.git12644e6.amzn2 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===============================================================================================================================================================================
 Package                            Arch                           Version                                                     Repository                                 Size
===============================================================================================================================================================================
Installing:
 docker                             x86_64                         20.10.4-1.amzn2                                             amzn2extra-docker                          32 M
Installing for dependencies:
 containerd                         x86_64                         1.4.4-1.amzn2                                               amzn2extra-docker                          24 M
 libcgroup                          x86_64                         0.41-21.amzn2                                               amzn2-core                                 66 k
 pigz                               x86_64                         2.3.4-1.amzn2.0.1                                           amzn2-core                                 81 k
 runc                               x86_64                         1.0.0-0.1.20210225.git12644e6.amzn2                         amzn2extra-docker                         3.2 M

Transaction Summary
===============================================================================================================================================================================
Install  1 Package (+4 Dependent packages)

Total download size: 59 M
Installed size: 243 M
Is this ok [y/d/N]:

Note the last three lines of the output of the amazon-linux-extras enable docker=18.06.1 command:

...
Now you can install:
 # yum clean metadata
 # yum install docker

If I add the missing step prescribed in the final three lines of the output, it works:

sh-4.2$ sudo yum clean metadata
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
Cleaning repos: amzn2-core amzn2extra-corretto8 amzn2extra-docker nodesource yarn
17 metadata files removed
10 sqlite files removed
0 metadata files removed
sh-4.2$ sudo yum install docker
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2-core                                                                                                                                              | 3.7 kB  00:00:00
amzn2extra-corretto8                                                                                                                                    | 3.0 kB  00:00:00
amzn2extra-docker                                                                                                                                       | 1.3 kB  00:00:00
nodesource                                                                                                                                              | 2.5 kB  00:00:00
yarn                                                                                                                                                    | 2.9 kB  00:00:00
(1/8): amzn2-core/2/x86_64/group_gz                                                                                                                     | 2.5 kB  00:00:00
(2/8): amzn2-core/2/x86_64/updateinfo                                                                                                                   | 367 kB  00:00:00
(3/8): amzn2extra-corretto8/2/x86_64/primary_db                                                                                                         |  79 kB  00:00:00
(4/8): amzn2extra-corretto8/2/x86_64/updateinfo                                                                                                         |   76 B  00:00:00
(5/8): nodesource/x86_64/primary_db                                                                                                                     |  39 kB  00:00:00
(6/8): yarn/primary_db                                                                                                                                  |  22 kB  00:00:00
(7/8): amzn2extra-docker/2/x86_64/primary_db                                                                                                            |  56 kB  00:00:00
(8/8): amzn2-core/2/x86_64/primary_db                                                                                                                   |  52 MB  00:00:00
Resolving Dependencies
--> Running transaction check
---> Package docker.x86_64 0:18.06.1ce-10.amzn2 will be installed
--> Processing Dependency: pigz for package: docker-18.06.1ce-10.amzn2.x86_64
--> Processing Dependency: libcgroup for package: docker-18.06.1ce-10.amzn2.x86_64
--> Processing Dependency: libltdl.so.7()(64bit) for package: docker-18.06.1ce-10.amzn2.x86_64
--> Running transaction check
---> Package libcgroup.x86_64 0:0.41-21.amzn2 will be installed
---> Package libtool-ltdl.x86_64 0:2.4.2-22.2.amzn2.0.2 will be installed
---> Package pigz.x86_64 0:2.3.4-1.amzn2.0.1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===============================================================================================================================================================================
 Package                                 Arch                              Version                                          Repository                                    Size
===============================================================================================================================================================================
Installing:
 docker                                  x86_64                            18.06.1ce-10.amzn2                               amzn2extra-docker                             37 M
Installing for dependencies:
 libcgroup                               x86_64                            0.41-21.amzn2                                    amzn2-core                                    66 k
 libtool-ltdl                            x86_64                            2.4.2-22.2.amzn2.0.2                             amzn2-core                                    49 k
 pigz                                    x86_64                            2.3.4-1.amzn2.0.1                                amzn2-core                                    81 k

Transaction Summary
===============================================================================================================================================================================
Install  1 Package (+3 Dependent packages)

Total download size: 37 M
Installed size: 151 M
Is this ok [y/d/N]:
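
The sequence that worked above (enable the topic version, clean the yum metadata, then install) can be sketched as a small shell function. This is only a sketch: the names run, install_pinned_docker and DRY_RUN are made up for this example and do not come from the template, and the -y flag is added so that yum does not stop at the confirmation prompt.

```shell
#!/bin/sh
# Sketch only: run, install_pinned_docker and DRY_RUN are hypothetical names.

run() {
  # In dry-run mode print the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

install_pinned_docker() {
  # $1: an amazon-linux-extras topic version, e.g. 18.06.1
  topic_version="$1"
  # Stop at the first failing step so a failed enable is not silently ignored.
  run amazon-linux-extras enable "docker=${topic_version}" &&
  run yum clean metadata &&
  run yum install -y docker
}
```

With DRY_RUN=1 set, calling install_pinned_docker 18.06.1 only prints the three commands, which makes it easy to check the order before running it as root on an agent.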

So I think the issue could be solved by finding the spot in the template where yum clean metadata can be run. That would get Docker 18.06.1 installed on new agents. I'd also like to move to the latest version I can use, which is 19.03.13ce-1 at this time (at least until the issue with Docker 20 can be found). I can't find the correct version to list in the enable command, though: amazon-linux-extras enable docker=docker-19.03.13ce-1 doesn't seem to work; the version can't be found.

I believe I found the problem. The test for the docker command was failing because the section '[amzn2extra-docker]' was already present in amzn2-extras.repo and set to latest (a dependency of some prior tool? enabled by default? changed in some newer version of Amazon Linux since the template was written?). Because the test fails, the command is never run. I removed the test line and added yum clean metadata to the command, and now version 18.06.1 is installed on the nodes. Here's a diff:

             'a_enable_docker':
-              command: 'amazon-linux-extras enable docker=18.06.1'
-              test: "! grep -Fxq '[amzn2extra-docker]' /etc/yum.repos.d/amzn2-extras.repo"
+              command: 'amazon-linux-extras enable docker=18.06.1; yum clean metadata'
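
To see why the removed test line skipped the command, the guard can be reproduced outside of cfn-init. cfn-init only runs a command when its test exits with status 0, and the guard exits 0 only when no line of the repo file is exactly '[amzn2extra-docker]'. The helper name would_run_enable_docker below is hypothetical and exists only for this sketch.

```shell
# Sketch only: would_run_enable_docker is a hypothetical helper.
would_run_enable_docker() {
  # $1: path to a yum repo file, e.g. /etc/yum.repos.d/amzn2-extras.repo
  repo_file="$1"
  # Same guard as the template: -F fixed string, -x whole-line match, -q quiet.
  if ! grep -Fxq '[amzn2extra-docker]' "$repo_file"; then
    echo "run"    # section missing: cfn-init would execute the command
  else
    echo "skip"   # section present: cfn-init would skip the command
  fi
}
```

Running would_run_enable_docker /etc/yum.repos.d/amzn2-extras.repo on an affected agent should print skip if the section is already present, which matches the behaviour described above.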

It helped once I found that the logs are in /var/log/cfn-init.log, where I noticed this:

2021-05-02 16:21:07,651 P3079 [INFO] Test for Command a_enable_docker
2021-05-02 16:21:07,654 P3079 [ERROR] Exited with error code 1
2021-05-02 16:21:07,655 P3079 [INFO] ============================================================
2021-05-02 16:21:07,655 P3079 [INFO] Test for Command b_enable_corretto8
2021-05-02 16:21:07,658 P3079 [INFO] Completed successfully.
2021-05-02 16:21:07,658 P3079 [INFO] ============================================================
2021-05-02 16:21:07,658 P3079 [INFO] Command b_enable_corretto8
2021-05-02 16:21:07,891 P3079 [INFO] -----------------------Command Output-----------------------
2021-05-02 16:21:07,891 P3079 [INFO]      0  ansible2                 available    \
2021-05-02 16:21:07,891 P3079 [INFO]            [ =2.4.2  =2.4.6  =2.8  =stable ]
...
2021-05-02 16:21:07,892 P3079 [INFO]     20  docker=latest            enabled      \
2021-05-02 16:21:07,892 P3079 [INFO]            [ =17.12.1  =18.03.1  =18.06.1  =18.09.9  =stable ]
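
Spotting a skipped command in a long cfn-init.log by eye is easy to miss, so here is a rough sketch that lists every command whose test was immediately followed by an error line. It assumes the log layout shown in the excerpt above (a "Test for Command <name>" line directly followed by "[ERROR] Exited with error code"); the function name failed_tests is made up for this example.

```shell
# Sketch only: failed_tests is a hypothetical helper.
failed_tests() {
  # $1: path to the log, e.g. /var/log/cfn-init.log
  awk '
    # Remember the name from the most recent "Test for Command <name>" line.
    /Test for Command / { name = $NF; pending = 1; next }
    # If the very next line is an error exit, report that command.
    pending && /\[ERROR\] Exited with error code/ { print name; pending = 0; next }
    # Any other line ends the window for the remembered test.
    { pending = 0 }
  ' "$1"
}
```

Called as failed_tests /var/log/cfn-init.log, it prints one command name per line for each test that exited with an error.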

NOTE: 18.09.9 would not install for me because the source could not be found, so 18.06.1 is the most recent version of Docker available using this method of installation. I would have liked to use 19.03.13ce-1, but that is not a valid option for amazon-linux-extras. An alternative may be to use amazon-linux-extras enable docker and then add the version to the line where yum does the install, but unfortunately I'm out of time for further exploration of this issue.
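
For anyone who wants to pick up that alternative, a minimal and untested sketch follows. It assumes that yum accepts the same name-version-release spec for install that worked for yum downgrade earlier in this issue, and that the dependencies yum resolves for that release are compatible. The function name docker_pin_command is hypothetical; it only builds the command string that could go into the template's command field.

```shell
# Sketch only, untested against the template: docker_pin_command is a hypothetical helper.
docker_pin_command() {
  # $1: version-release of the docker RPM, e.g. 19.03.13ce-1.amzn2
  # Enable the topic without a topic version, refresh metadata, then let yum pin the release.
  printf '%s\n' "amazon-linux-extras enable docker && yum clean metadata && yum install -y docker-$1"
}
```

docker_pin_command 19.03.13ce-1.amzn2 prints the full command line, which can be tried by hand on a single agent before changing the template.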

Thanks for the bug report. I'm working on a fix.

@timharsch could you confirm that the fix in #551 fixes your issue?

I can't confirm, because I don't run the latest Jenkins template but a modified version from some number of releases back. I did review your PR, though, and from a static-analysis viewpoint it looks fine to me.