Instability when updating repo and using repo_gpgcheck
sdherr opened this issue · 5 comments
Description
There currently exists a timing window that can lead to client errors if you are using signed repo metadata with repo_gpgcheck=1
. When you are using that option of course the server provides both the normal repomd.xml
file and also a detached signature of it in repomd.xml.asc
. Even if the server swaps out both files atomically on a repo update, an unlucky client can have received the old version of repomd.xml
and the new version of repomd.xml.asc
, if its requests happened to come on either side the update instant. The signature will then fail to verify and the client will return an error. Furthermore if the server is using any type of caching system or CDN, this mis-match can get persisted in the cache up to the TTL of the files if you are unlucky, causing many client errors and making the repo essentially unavailable for a period of time.
Problem
Unlike the other metadata files, there is no hash information in the name of repomd.xml.asc
, or any other indication of what version of the file you want. This is likely because there is a chicken-and-egg problem of writing this info into repomd.xml
: the file must exist first before you can sign it or calculate its hash, and the act of adding that information back in to repomd.xml
invalidates the signature or changes the hash.
I furthermore assume that this is a detached signature in the first place instead of simply clearsigning repomd.xml
to keep it one atomic file because there were worries about backwards-incompatibility with old clients.
Solution
Any change which would indicate which version of repomd.xml.asc
that you need to the server is a potential fix, but I think there are three main possibilities:
- Changing the name of the file like you do with other metadata files, with the name calculated client-side somehow.
- Adding a header to the request.
- Adding a query parameter to the request.
In particular for backwards compatibility and ease-of-update, I suggest we do number 3, simply take the Last-Modified
header value on the repomd.xml
response, and add it as a query parameter on the repomd.xml.asc
request: .../repomd.xml.asc?repomd_last_modified=<header value>
.
- Almost every webserver simply ignores extra unexpected query params, so this should not break anything. On the off chance that it does and a server responds with a
4xx
HTTP code, DNF could simply revert back to today's behavior and request the file without the query param. - By contrast, caches / CDNs typically do include the query params as part of the cache key by default.
By including the information on which repomd.xml file you have in the cache key for repomd.xml.asc
we are creating different cached versions of the file, and the client will automatically receive the correct one and proceed successfully if these cache keys are already populated. If there is no cache involved and a server is responding to every request directly, at least they are in no worse position than they were before, and they can teach their server to respond appropriately to the query param if they wish.
There is still a possible timing window here where bad pairs can get cached if both cache TTLs time out and a client re-fills the cache with an old version of repomd.xml
and a new version of repomd.xml.asc?repomd_last_modified=123
from a server that is ignoring the query string. However this gives servers the ability to close this gap permanently and respond correctly to every request, by doing either:
- Teach the webserver to respond correctly to the query param, or
- Increase the cache TTL on
*/repomd.xml.asc?repomd_last_modified=*
to more than double the TTL ofrepomd.xml
, eliminating the overlapping timing window where it's possible to cache incompatible versions of the files.
Related
I have filed similar bugs against tdnf
and zypper
:
vmware/tdnf#475
openSUSE/zypper#549
There also appears to be discussion about just in-lining the signature in repomd.xml
and dropping repomd.xml.asc
, which would be an alternate way to solve this problem:
rpm-software-management/rpm-metadata#1
I have looked at the problem and propose 2 solutions. They both try for forward and backward compatibility.
1. Detached signature solution:
The repomd.xml
file contains a revision
element.
By default, this element contains UNIX time. Example: <revision>1721803795</revision>
We name the signature file <revision>-repomd.xml.asc
. So for the above example, 1721803795-repomd.xml.asc
.
Compatibility:
If the new file does not exist (old repository), the client will request the currently used repomd.xml.asc
file as a fallback -> The new client will work with the old repository.
The new repositories will continue to distribute the repomd.xml.asc
file. So the old client will work with the new repository.
Note:
The user can define a custom value for the <revision>
element (createrepo_c
argument --revision=<something>
), but as long as the revision is unique to a particular version/revision of the repomd.xml file it will work. In the worst case, we get the current situation.
2. Solution with inline signature
The signature will be added to the end of the repomd.xml
file as a comment containing the signature element.
Example:
<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo" xmlns:rpm="http://linux.duke.edu/metadata/rpm">
<revision>1721764685</revision>
...
</repomd>
<!--
<pgp_signature>
iQIzBAABCAAdFiEEpfSDzXM6TrrqN4sq6Il5+5swrPIFAmagC04ACgkQ6Il5+5sw
rPJe0Q//dm23fki6M6GZkV+oeVf+u9ZrCrLNfVXn1qN28bJUnV31azui0DBMoYYs
3RYGzUjBU6dgruWI5RjCxV1Icfs/7DWRvFSLhj9BO490X3nEXB9OaMjvaEL/NPnf
yQzQZ8PITYyxWSdimY/Kmdj+0hLk6NYqPwNZVbfa3+0KystWmVNcKkpdTMuBr4Gh
xXUDMasE9gTQQbiz3XFoG8MLrgAjyTL6FdGuuioN/Saq0c8zQ9G2do9jQNBhHy3B
7D2bK5s+JWIaKW95w8ASC11mx+1KHqA523+7ElI4ihY4mENCaVlYbpBlEPWP+4eq
gK/5Njnf1IVix0DtX69C9JFzC4bD/kEHqVNoUJuAtt66YuHd8lzjEHNRr95iqvwx
317+UJxqT0tFDbIaMrlWozufl8stY1+aNsPY2196k13vI7CqAhF3G1X5qH5rnFqA
mTI7YnxbOnCpVLNc0Y/TdSRuXX9r9H1Rj4edMG6Mcc9fnShGFwAcvKI5hma1+aj6
cTRHfq8PSHcX93+jVk3fpnJIJJ7WFkzcLEDmI5+Pin2V5lcqdQa6wrirvcnND75+
LPqKyJc398KrC8iGBW1AzRMbdj+nm4krPr/9ymZeayeWL4OMmEbCzXo/lBQfPpC2
aqldYWQIxpDyfgnXSK5IKg+51p1KA+dMUiI3DqSiQ8o8in/wPXY=
=KUBi
</pgp_signature>
-->
Compatibility:
If the comment with the signature does not exist (old repository), the client will request the currently used repomd.xml.asc
file as a fallback -> The new client will work with the old repository.
The new repositories will continue to distribute the repomd.xml.asc
file. And the actual signature element is in the comment. So the old client finds the original signature file and ignores the comments. This ensures that the old client is functional with the new repository.
Note:
The signature in the repomd.xml.asc
file will be different from the inline signature. Because it will sign the whole repomd.xml
file including the inline signature.
I have no problem with either of those solutions. I have a slight preference for 1
, renaming the repomd.xml.asc
file to include the revision, as that has a nice similarity to the existing metadata file names and seems simpler.
I think in the 2nd solution with in-line signatures you are reinventing XML-Sig https://en.wikipedia.org/wiki/XML_Signature and believe me it's not a simple task (basically because of how physical XML model handles whitespaces, quotes etc.). I recommend detached signatures. If you felt compelled with in-line signatures, please use an existing library for XML-Sig, e.g. xmlsec.
To be honest. I also prefer the detached signature solution.
But there is an old issue rpm-software-management/rpm-metadata#1 suggesting inline signature support. So I wrote a proposal to achieve this relatively easily and compatible with older clients. Maybe I shouldn't have even mentioned this proposal.
I also expect that my proposal with detached signature may provoke criticism from some. The point is that first we need to parse repomd.xml
to get the value of the revision
element. This is needed to get the name of the signature file. So the repomd.xml
file will be parsed before verifying its signature. Which is not a problem, if we then find that the signature does not match, we will not use the file further. However, someone may say. What if the xml parsing library contains a bug and someone replaces repomd.xml
with another one exploiting this bug and we find out too late - we parse before verifying the signature.
But that's about understanding what the signature is supposed to do. Whether just to verify the content of the repodata or also to protect against exploiting bugs in the xml parsing library.
Maybe one day we will reach a point where the package manager (e.g. dnf) will be split into multiple processes and most of the work, including downloading data and processing/parsing it, will be done in a restricted unprivileged process.
If you fear parsing untrusted XML, you can save the revision value into a new, plaintext file, let's called it "latest" and download that "latest" file first. Then repomd.xml and then $REVISION-repomd.xml.asc.
However, that won't safe you from the problem described in the original report: The client could download old "latest" file, then all files would get swapped, and then the client would download new, unrelated repomd.xml and get an error when retrieving nonexistent $REVISION-repomd.xml.asc.
At any rate, the client will result into an error and will need to decide whether to try a new mirror, to retry the same mirror, or raise an error to the user.
In my opinion, the problem does not have solution because there is no reliable way of downloading all files as existed at a time. The root cause is that a repository mirroring protocol (HTTP, file) we use is not a transactional database system.
In practical way, we should rather aim at better resilience of the client. I.e. retry the downloads if the signature does not match and raise an error, or try a different mirror, after a second unsuccessful verification.
The originally proposed solution with HTTP headers and URL query parameters is too specific to HTTP transport. We need to support other protocols.