bbottema/simple-java-mail

Multiple Bodyparts of same Content-Type not supported for text/html & text/plain within Multipart/mixed or Multipart/alternative

pmaedel opened this issue · 9 comments

Current implementation (conditional null check) leads to only the first occurrence of a body part with the respective Content-Type to be parsed. All further occurrences are omitted. Correct behavior should be to either provide a List per Content-Type or concatenate the Strings.

if (isMimeType(currentPart, "text/plain") && parsedComponents.plainContent == null && !Part.ATTACHMENT.equalsIgnoreCase(disposition)) {
parsedComponents.plainContent = parseContent(currentPart);
} else if (isMimeType(currentPart, "text/html") && parsedComponents.htmlContent == null && !Part.ATTACHMENT.equalsIgnoreCase(disposition)) {
parsedComponents.htmlContent = parseContent(currentPart);

Multiple occurrences of body parts with the same Content-Type is valid as per RFC1341, section 7.2.2:

[...] the body parts are independent and intended to be displayed serially [...]

At least the Apple Mail client makes extensive use of this feature when replying or forwarding mails.

What I'm worried about is we are losing information in the conversion process (the grouping of content) as we are simply merging the body parts of the same mimetype.

For example, when parsing a MimeMessage to Email this way and sending that email will not result in the same MimeMessage. I'm not sure what the consequence is here or whether this is an issue that should be addressed.

I have not had a look at how an email is sent, but if the htmlContent & plainContent fields are used instead of the underlying MimeMessage body parts, the previous obmission of body parts of the same content-type within these fields would have breaking the equality of representations too, for the mentioned MimeMessage -> Email -> MimeMessage scenario.

If these fields are just used for convenient access to the body parts a simple concatenation should suffice. Otherwise Lists of Strings might be a better representation for these values, as these would be better suited as a base for sending out emails that more closely resemble the original MimeMessage.

Did you run into a problem because of this issue?

Reason I'm asking is because I'm not sure in which use case multiple body parts would have the same content type for plain text or HTML text. Forwarded emails have a separate content type (message/rfc822, which is discarded when parsing a MimeMessage that contains one, since it is a proprietary format). That leaves replied-to emails. When replying, Simple Java Mail quotes the replied-to email inline and not as a separate bodypart. In this case it would make sense to simply concatenate incoming bodyparts of the same content type (text or HTML) when parsing a MimeMessage.

As it stands, I think concatenation is sufficient.

We ran into this problem when parsing forwarded mails. I can send you an example (by mail) if you want to.

I've tried the EML you sent in a couple of viewers and they all seem to have an issue with this type of sparse multipart part structure. Most of the content is completely missing. Haven't seen this before (but then again, I don't use Apple products).

I sent the EML you gave to myself using Simple Java Mail, which shows a lot more (with your fix). Seems good enough to me. Closing this issue.

Hi, how does it come that the PR has been merged but the changes are neither part of the master branch nor of the 5.0.4 release?

I switched to a proper GIT workflow and what was previously master is now develop. Master now represents what is actually released to Maven Central. So the changes for this issue are in the develop branch until 6.0.0 is released.

Released in 6.0.0-rc1!

6.0.0 has released as well, finally.