Handle embedded (INLINE) resources as attachments if the cid identifier doesn't occur in the HTML body
jkxyx205 opened this issue · 14 comments
Hi, Benny Bottema
I have a problem; the details are as follows.
1. Scene description
Here is my hello.eml file, which includes an attachment and an embedded image and was exported by the client application Foxmail. A screenshot of it opened in Outlook is below:
2. Parsing the EML returns no information
Email oldEmail = EmailConverter.emlToEmail("/Users/rick/jkxyx205/tmp/hello.eml");
Can you perform the same check for EmailConverter.emlToMimeMessage() please?
I can't parse anything with that either.
return new MimeMessage(session, new ByteArrayInputStream(eml.getBytes(UTF_8)));
Maybe java.mail does not have the ability to parse this file, but Outlook can.
Hmm, maybe this EML requires a specific javax.mail version? I'll give it a try as well, but I'm afraid this is a limitation of the underlying Java Mail framework.
I'm getting a completely empty result without errors.
For what it's worth, the EML does not validate with the following validation tools:
But mimevalidator.net thinks it's fine.
Sorry, I got confused. It's working fine for me. Here's my result:
InputStream resourceAsStream = EmailHelper.class.getClassLoader().getResourceAsStream("test-messages/hello.eml");
Email e = EmailConverter.emlToEmail(resourceAsStream);
Perhaps it wasn't clear from the API, but the method you used expects EML data, not a file path string.
Email oldEmail = EmailConverter.emlToEmail("/Users/rick/jkxyx205/tmp/hello.eml"); // <-- this should be file content
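For anyone hitting the same thing, here is a minimal sketch of reading the file content first, using only the JDK. The class name EmlFileReader and the helper readEml are hypothetical names made up for this illustration, and the sample message written in main is made-up data. The commented-out call assumes the String-accepting emlToEmail method discussed above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class EmlFileReader {

    /** Reads the whole EML file into a String (hypothetical helper, assumes UTF-8 content). */
    public static String readEml(Path emlFile) {
        try {
            return new String(Files.readAllBytes(emlFile), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a tiny made-up message so the sketch runs on its own.
        Path emlFile = Files.createTempFile("hello", ".eml");
        Files.write(emlFile, "Subject: hello\r\n\r\nbody".getBytes(StandardCharsets.UTF_8));

        String eml = readEml(emlFile);
        // The content, not the path, is what would be handed to the converter:
        // Email email = EmailConverter.emlToEmail(eml);
        System.out.println(eml);

        Files.delete(emlFile);
    }
}
```

Passing an InputStream, as in the snippet above, avoids the charset assumption altogether.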
Yes, you are right, what a ridiculous mistake I made.
But I found another problem: I lost an embedded image, as I mentioned in #173, and got 2 attachments instead.
If an embedded attachment isn't actually embedded it is treated as an attachment. Is it being used in the body?
Outlook treats the mail as one attachment and one embedded image.
OK, I will continue to track it later. Thank you for your help, simple-java-mail is an excellent framework!!!
Hmm, finally I got the root cause.
When the EML is exported by the Foxmail client, it doesn't specify the header Content-Disposition: inline; filename=image.png for an embedded image. If the disposition is not provided, the part is treated as an attachment by simple-java-mail.
See org.simplejavamail.converter.internal.mimemessage.MimeMessageParser.java
private static void parseMimePartTree(@Nonnull final MimePart currentPart, @Nonnull final ParsedMimeMessageComponents parsedComponents) {
    for (final Header header : retrieveAllHeaders(currentPart)) {
        parseHeader(header, parsedComponents);
    }
    final String disposition = parseDisposition(currentPart);
    if (isMimeType(currentPart, "text/plain") && parsedComponents.plainContent == null && !Part.ATTACHMENT.equalsIgnoreCase(disposition)) {
        parsedComponents.plainContent = parseContent(currentPart);
    } else if (isMimeType(currentPart, "text/html") && parsedComponents.htmlContent == null && !Part.ATTACHMENT.equalsIgnoreCase(disposition)) {
        parsedComponents.htmlContent = parseContent(currentPart);
    } else if (isMimeType(currentPart, "multipart/*")) {
        final Multipart mp = parseContent(currentPart);
        for (int i = 0, count = countBodyParts(mp); i < count; i++) {
            parseMimePartTree(getBodyPartAtIndex(mp, i), parsedComponents);
        }
    } else {
        final DataSource ds = createDataSource(currentPart);
        // If the diposition is not provided, the part should be treated as attachment
        if (disposition == null || Part.ATTACHMENT.equalsIgnoreCase(disposition)) {
            parsedComponents.attachmentList.put(parseResourceName(parseContentID(currentPart), parseFileName(currentPart)), ds);
        } else if (Part.INLINE.equalsIgnoreCase(disposition)) {
            if (parseContentID(currentPart) != null) {
                parsedComponents.cidMap.put(parseContentID(currentPart), ds);
            } else {
                // contentID missing -> treat as standard attachment
                parsedComponents.attachmentList.put(parseResourceName(null, parseFileName(currentPart)), ds);
            }
        } else {
            throw new IllegalStateException("invalid attachment type");
        }
    }
}
If I add Content-Disposition: inline; filename=image.png manually, it works fine. ˆ-ˆ
So here is my question: why can't the part be treated as an embedded image if the disposition is not provided, the way Outlook does it?
BTW, Outlook, Mac Mail, Foxmail can parse hello.eml correctly, one attachment + one embedded image.
So here is my question: why can't the part be treated as an embedded image if the disposition is not provided, the way Outlook does it?
The spec says the following about default Content-Disposition in case of absence:
The Content-Disposition Header Field
Content-Disposition is an optional header field. In its absence, the
MUA may use whatever presentation method it deems suitable.
So spec-wise, we're free to do as we see fit. However, I'm unsure of the best handling here. The clients you mention don't so much parse it 'correctly' as handle it how they see fit. It's all correct.
Here's what I'm going to do: treat all resources with a missing Content-Disposition header as INLINE. Then, if an INLINE resource (like an embedded image) does not occur in the HTML body (i.e. as cid:myImage), treat it as an attachment instead.
This also treats embedded images with proper INLINE disposition as attachment if the image is not actually used in the HTML body.
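The rule described above could be sketched roughly as follows. This is not the code that went into the library, only a self-contained illustration under stated assumptions: the class and method names are hypothetical, resources are represented by plain file name strings instead of DataSource objects, and "used in the body" is approximated by a simple substring search for cid:<id>, which could also match a longer id that starts with the same characters.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class InlineFallbackSketch {

    /** Step 1: a missing Content-Disposition header (null) is treated as "inline". */
    public static String effectiveDisposition(String disposition) {
        return disposition == null ? "inline" : disposition.toLowerCase(Locale.ROOT);
    }

    /** Strips the angle brackets that usually wrap a Content-ID header value. */
    public static String normalizeCid(String contentId) {
        return contentId.replaceAll("^<|>$", "");
    }

    /**
     * Step 2: moves every inline resource whose cid is not referenced in the
     * HTML body from cidMap to attachments. Both maps are modified in place.
     */
    public static void demoteUnusedInlineResources(String htmlBody,
            Map<String, String> cidMap, Map<String, String> attachments) {
        Iterator<Map.Entry<String, String>> it = cidMap.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            String reference = "cid:" + normalizeCid(entry.getKey());
            // Simplification: plain substring search instead of real HTML parsing.
            boolean usedInBody = htmlBody != null && htmlBody.contains(reference);
            if (!usedInBody) {
                attachments.put(entry.getKey(), entry.getValue());
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        // Made-up sample data: one image referenced in the body, one file that is not.
        String html = "<p>Hello</p><img src=\"cid:logo@foxmail\">";
        Map<String, String> cidMap = new LinkedHashMap<>();
        cidMap.put("<logo@foxmail>", "logo.png");
        cidMap.put("<report@foxmail>", "report.pdf");
        Map<String, String> attachments = new LinkedHashMap<>();

        demoteUnusedInlineResources(html, cidMap, attachments);

        System.out.println("embedded: " + cidMap.values());
        System.out.println("attachments: " + attachments.values());
    }
}
```

With this ordering, a Foxmail-style part without any disposition ends up embedded when the body references it, and a part with an explicit INLINE disposition that is never referenced ends up as an attachment.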
@jkxyx205, I've released a new SNAPSHOT with the fix. Can you please verify (you'll need to add the snapshot repo)?
I tested it, and it behaved just as I expected. It was really great.
Released in 5.1.0