fastmail/authentication_milter

End-user data loss due to excessive "Authentication-Results" header length

jgrisham opened this issue · 0 comments

TL;DR: Very long header fields are mangled by the Outlook client; when Outlook is used with an IMAP server, a quiet failure occurs, resulting in significant and unpredictable data loss.


Severity: end-user data loss
Category: unintended consequences

Summary: The total 'unwrapped' length of the Authentication-Results header field can exceed 1,000 characters, causing data loss when certain IMAP clients are used to access/file those messages.

The output of fastmail/authentication_milter, without the use of the 'x-' experimental / unregistered method identifiers or extension result codes, appears to only use about 75% of that amount. As currently deployed, however, field lengths in excess of 1,300 characters have been observed.

Details:


Some (poorly-compliant?) e-mail clients (e.g. the current version of Microsoft Outlook for Windows and anything else using the MSFT MAPI engine) seem to be 'unwrapping'/'unfolding' the header fields of internet messages for internal storage.

While RFC 5322 [1][2] limits line length in finished messages to 998 characters[3], section 2.2.3 of that RFC states that once a header field is 'unfolded' by a lexer, the combination of the header name and header body may be of unlimited length.[4]
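To make the distinction between per-line length and unfolded length concrete, here is a minimal Python sketch. It is not taken from authentication_milter or any mail client; the host name `mx.example.com` and the `x-example-NN` method names are made up for illustration. Unfolding follows the RFC 5322 rule of removing any CRLF that is immediately followed by a space or tab.

```python
import re

MAX_LINE = 998  # RFC 5322 section 2.1.1: per-line limit, excluding the CRLF


def unfold(raw_field: str) -> str:
    """Remove every CRLF that is immediately followed by a space or tab."""
    return re.sub(r"\r\n(?=[ \t])", "", raw_field)


def longest_line(raw_field: str) -> int:
    """Length of the longest physical line in a folded header field."""
    return max(len(line) for line in raw_field.split("\r\n"))


# Hypothetical field: 40 made-up results, folded one per line.
folded = "Authentication-Results: mx.example.com;" + "".join(
    f"\r\n\tx-example-{i:02d}=pass reason=illustration-only;" for i in range(40)
)
unfolded = unfold(folded)

print("longest folded line:", longest_line(folded))
print("unfolded length:   ", len(unfolded))
print("fits in one line?  ", len(unfolded) <= MAX_LINE)
```

Every physical line of the folded field is well within the limit, yet the unfolded field is not, which is exactly the case a client that stores unfolded fields has to handle.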

How does this cause data loss? An example:

  1. A user uses Microsoft Outlook for Windows (since she is already familiar with it from work) to connect to an IMAP server at her e-mail provider (e.g. Cyrus-imapd at Fastmail).
  2. Everything initially appears to work fine; she can check her mail, read messages, send messages, and file messages into folders (manually or using offline filters).
    a. As time goes by, the syncing process, on large mailboxes, seems to slow down, but everything appears to work in the end.
    b. If she goes to the webmail interface of her e-mail provider, in general, messages she moved appear to have moved; deleted messages have been correctly deleted on the server.
  3. Unbeknownst to our hapless protagonist, some of the incoming messages have very long header fields (approx. 1,200 characters)
    a. Outlook[10.5] doesn't process these correctly, and breaks the unfolded header lines that would otherwise be greater than 998 characters when storing in its local database.
    b. These lines don't get wrapped correctly (the programmers didn't expect a single header field to be that long?), and end up breaking that single long field incorrectly [11].
    c. Since the IMAP implementation in Outlook was built upon the pre-existing MAPI database structure, and since IMAP isn't the primary use of Outlook, it does not support certain IMAP features (such as the MOVE extension)
    d. The Outlook offline message storage cache for this message now contains at least one unmatched 'newline' 0x0a character
  4. When our user files one of these messages into a different folder using Outlook, the following happens:
    a. Outlook moves the message in its internal database.
    b. Outlook connects to the IMAP server (if it is not already connected in an IDLE state)
    c. Outlook asks the IMAP server for a list of items in the original folder, then asks the server to delete the message from the original folder (since its local database says it no longer should exist there)
    d. Outlook attempts to upload the message to the new folder. The server (cyrus-imapd in this case) doesn't accept the body of the message, since it now contains a newline (0x0a) character that is not paired with a matching adjacent carriage return (0x0d) character [12].
    e. Outlook quietly logs a synchronization error,[13] but as far as our user is concerned all is good. Only if she were to search the webmail interface for this specific message would she realize it was not there.
  5. The final situation is this:
    a. Messages with now-mangled Authentication-Results header fields exist in the local Outlook database, mixed with all her other normal cached messages.
    b. These messages NO LONGER EXIST on the server.
    c. If the cache is ever cleared for any reason (manually or automatically)[14], or the user creates a new Outlook profile, the messages in question are likely gone forever.
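The failure in step 4 hinges on unpaired line-ending bytes. As a rough illustration (this is an assumption about what a check could look like, not how Outlook or cyrus-imapd actually behave), the following Python sketch reports the offsets of any bare LF or bare CR in a message. A client could run something like it before deleting the original copy. The sample messages are made up.

```python
import re

BARE_LF = re.compile(rb"(?<!\r)\n")  # LF not preceded by CR
BARE_CR = re.compile(rb"\r(?!\n)")   # CR not followed by LF


def line_ending_problems(message: bytes) -> dict:
    """Return byte offsets of unpaired CR and LF characters."""
    return {
        "bare_lf": [m.start() for m in BARE_LF.finditer(message)],
        "bare_cr": [m.start() for m in BARE_CR.finditer(message)],
    }


# Hypothetical messages: one clean, one with a fold wedged inside a CRLF.
clean = b"Subject: hi\r\n\r\nbody\r\n"
mangled = b"Subject: hi\r\r\n\t\n\r\nbody\r\n"

print(line_ending_problems(clean))
print(line_ending_problems(mangled))
```

If either list is non-empty, the safer order of operations would be to upload first and delete the original only after the server confirms the upload.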

What can be done?

  1. Once I am finished defining the bounds of this issue, I will be filing a priority issue with the MAPI/Outlook team at Microsoft. [5] [7]
  2. IMAP servers (such as @cyrusimap/cyrus-imapd ) can be programmed to strip out Authentication-Results headers that contain experimental / unregistered subfields [8] [9], as recommended by RFC 8601.
  3. x-method fields [10] produced by @fastmail/authentication_milter could be either
    a. disabled by default (since they are deemed not suitable for production use by RFC 8601), or
    b. the x-method fields could be moved to another message header (X-Authentication-Results?)
  4. The various subfields (method=results + reason + property=value fields) could be split into multiple Authentication-Results header fields. [6]
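Option 4 could be sketched roughly as follows. This is a naive Python illustration, not the milter's code: it assumes no semicolons occur inside quoted strings or comments, it repeats the authserv-id in every generated field, and the function name, host name, and method names are all hypothetical.

```python
FIELD_NAME = "Authentication-Results: "


def split_authentication_results(value: str, max_len: int = 998) -> list[str]:
    """Greedily pack results into several field bodies.

    Each body starts with the authserv-id, and the unfolded field
    (name plus body) stays within max_len unless a single result
    is already too long on its own.
    """
    authserv_id, *results = [part.strip() for part in value.split(";")]
    results = [r for r in results if r]
    fields, current = [], authserv_id
    for result in results:
        candidate = f"{current}; {result}"
        too_long = len(FIELD_NAME) + len(candidate) > max_len
        if current != authserv_id and too_long:
            fields.append(current)  # close the full field, start a new one
            current = f"{authserv_id}; {result}"
        else:
            current = candidate
    fields.append(current)
    return fields


# Hypothetical oversized field body with 40 made-up results.
value = "mx.example.com; " + "; ".join(
    f"x-example-{i:02d}=pass reason=illustration-only" for i in range(40)
)
for body in split_authentication_results(value):
    print(len(FIELD_NAME + body), body[:60] + "...")
```

A real implementation would need a proper parser for the field grammar; the point here is only that the results can be spread over several fields without losing any of them.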

Conclusion

Outlook appears to be broken. It has likely always been broken in this way. (That said, it was broken in a way that didn't cause this type of inadvertent data loss until extremely long Authentication-Results fields became commonplace.) Millions of people (at least) use Outlook; some percentage of them use IMAP. How many of them are now permanently losing messages without even knowing it?

Who should fix this? Probably everyone. (Internet Standards and RFCs are many and complicated, though, so I understand the struggle.)

From Appendix C of RFC 8601:

It is typically easier to change a single MUA than an MTA because the modification affects fewer users and can be pursued with less care. However, changing many MUAs is more effort than changing a smaller number of MTAs.

Thanks for reading this far. Suggestions and comments?

-Jim


Disclaimer: I don't work for @microsoft or @fastmail. Everything here is based on behavior I have personally observed during my research of this issue. I'm not a programmer; troubleshooting closed-source programs and encrypted mail transfer protocols isn't the most straightforward task; here may be dragons.


References and notes:


[1] As does its predecessor, Internet Message Format, RFC 2822

[2] The original ARPA INTERNET TEXT MESSAGES specification, RFC 822, mentions the 'folding' process but does not explicitly acknowledge an unlimited field length.

[3] Section 2.1.1 of RFC 5322. The limit does not include the CRLF, so each line in a finished message may be 1000 characters in total:

Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters, excluding the CRLF.

[4] In practice, of course, many software systems are likely to have some finite limit on maximum string length. (The programmer should be capable of checking to make sure input doesn't overflow the available space, of course.)

[5] Even if Microsoft quickly releases a patch, however, those using legacy clients (e.g. Outlook 2013, like perhaps some of our parents or grandparents are) might not be able to easily apply the patch or to upgrade to a newer version (for cost or other reasons).

[6] Section 2.1 of RFC 7601:

The header field MAY appear more than once in a single message, or more than one result MAY be represented in a single header field, or a combination of these MAY be applied.

[7] Section 7.8 of RFC 8601 (substantially unchanged from RFC 7601 / 7001 / 5451)

Intentionally Malformed Header Fields

As with any other header field found in the message, it is possible for an attacker to add an Authentication-Results header field that is extraordinarily large or otherwise malformed in an attempt to discover or exploit weaknesses in header field parsing code. Implementers must thoroughly verify all such header fields received from MTAs and be robust against intentionally as well as unintentionally malformed header fields.

[8] Section 2.7.6, Extension Methods, of RFC 7601 (emphasis added):

Experimental method identifiers MUST only be used within ADMDs that
have explicitly consented to use them. These method identifiers and
the parameters associated with them are not documented in RFCs.
Therefore, they are subject to change at any time and not suitable
for production use. Any MTA, MUA, or downstream filter intended for
production use SHOULD ignore or delete any Authentication-Results
header field that includes an experimental (unknown) method
identifier.

[9] Section 2.7.7, Extension Result Codes, of RFC 7601 (emphasis added):

Experimental results MUST only be used within ADMDs that have
explicitly consented to use them. These results and the parameters
associated with them are not formally documented. Therefore, they
are subject to change at any time and not suitable for production
use. Any MTA, MUA, or downstream filter intended for production use
SHOULD ignore or delete any Authentication-Results header field that
includes an extension result.
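The "ignore or delete" recommendation quoted in [8] and [9] could be approximated by a downstream filter along these lines. This is a hedged Python sketch using only the standard library `email` package: it treats any method name starting with `x-` as experimental, splits naively on semicolons (so it would misread semicolons inside comments or quoted strings), and only flags fields rather than deleting them. The function names and the sample message are hypothetical.

```python
import re
from email import message_from_string

# Method name at the start of a result: "method=result" or "method/1=result".
METHOD = re.compile(r"^\s*([A-Za-z0-9-]+)\s*(?:/\s*\d+\s*)?=")


def methods_in(value: str) -> list[str]:
    """Return the method names found in one Authentication-Results body."""
    found = []
    for result in value.split(";")[1:]:  # first part is the authserv-id
        match = METHOD.match(result)
        if match:
            found.append(match.group(1).lower())
    return found


def has_experimental_method(value: str) -> bool:
    """True if any method in the field uses the 'x-' prefix."""
    return any(name.startswith("x-") for name in methods_in(value))


# Hypothetical message with one plain field and one carrying an x- method.
raw = (
    "Authentication-Results: mx.example.com; spf=pass smtp.mailfrom=example.org\n"
    "Authentication-Results: mx.example.com; dkim=pass header.d=example.org;\n"
    "\tx-example=pass\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)
for value in msg.get_all("Authentication-Results", []):
    print(has_experimental_method(value), methods_in(value))
```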

[10] Example method identifiers that are not registered with the IANA "Email
Authentication Property Types" registry and their associated handler files:

[10.5] Technically, I don't believe this is actually Outlook. There is a Windows DLL that manages the MAPI message store for all compatible applications. Messages are not internally stored in "RFC 822/2822/5322 Internet Message Format" but in the MAPI format used by pre-internet mail Exchange. This conversion process is choking on mismatched CR/LFs generated by another part of the process (when the mail was fetched over IMAP?), as explained in the next footnote.

[11] It appears to attempt to wrap the line by inserting a new line and a tab: 0x0d0a09, but it actually places that inside of another CRLF, resulting in something like 0x0d0d0a090a.

  • Even worse, each time the message is opened, the header field is unfolded again, and since it still exceeds 1000 characters (by even more now), yet another CR/LF/Tab group is added, resulting in 0x0d0d0a0d0a090a, then 0x0d0d0a0d0a0d0a090a, then 0x0d0d0a0d0a0d0a0d0a090a, then ...
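The byte sequences listed in [11] can be broken down mechanically. The short Python sketch below pairs CR and LF greedily from left to right (one reasonable reading, assumed here for illustration) and labels whatever is left over as bare.

```python
def classify(data: bytes) -> list[str]:
    """Label each CRLF pair, unpaired CR, unpaired LF, and TAB in order."""
    tokens, i = [], 0
    while i < len(data):
        if data[i:i + 2] == b"\r\n":
            tokens.append("CRLF")
            i += 2
            continue
        byte = data[i]
        if byte == 0x0D:
            tokens.append("bare-CR")
        elif byte == 0x0A:
            tokens.append("bare-LF")
        elif byte == 0x09:
            tokens.append("TAB")
        else:
            tokens.append(f"0x{byte:02x}")
        i += 1
    return tokens


# The first three sequences observed in footnote [11].
for hex_string in ("0d0d0a090a", "0d0d0a0d0a090a", "0d0d0a0d0a0d0a090a"):
    print(hex_string, classify(bytes.fromhex(hex_string)))
```

Under this reading, every observed sequence starts with a bare CR and ends with a bare LF, with one more intact CRLF in the middle each time the message is opened.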

[12] I'll have to check with the @cyrusimap folks to see if this complies with section 4 of RFC 5322. (SMTP servers aren't supposed to accept 'naked'/unpaired newline characters, but my spouse won't let me read any more RFCs tonight to verify what the current recommendations are for IMAP servers)

[13] Messages with the subject "Synchronization Log" are created in a folder called "Sync Issues", but that folder is NOT visible by default. The user needs to open a pop-up menu, switch away from the standard 'Mail' view to this hidden 'Folder' view, and then scroll (perhaps quite far) to find this folder. Other than these messages, there is no indication of error, unless you specifically notice a message is missing from the server.

[14] The database (mailbox.ost) is considered a cache with messages assumed to always be on the server, so both the program and public documentation include deleting it as a common troubleshooting technique. (Nearly universally, the only downside listed is the time/bandwidth needed to re-download the messages. The possibility of data loss is not on the public radar here, at all.)