bbottema/simple-java-mail

EML Attachments are modified/have the wrong size

Faelean opened this issue · 2 comments

When attachments are read from an .eml file, their size is determined incorrectly, so the extracted files differ from the originals.

I've attached a small program which compares the original files to the attachments from the email object. There's one .eml file and one .msg file with four attachments each and the original four files that are attached to the mails. The program compares the bytes from both files and prints the ones that are not present (or are different) in one of the files to the console.
That's the (shortened) output I get for one of the attachments:
EML:

------------ Documents.7z -------------
attachement unread byte: 0 at index 1462
[...]
attachement unread byte: 0 at index 1499
Unread bytes in attachment file: 38
Unread bytes in original file: 0

MSG:

------------ Documents.7z -------------
Unread bytes in attachment file: 0
Unread bytes in original file: 0

As you can see, with the .msg file the contents are identical, while with the .eml file there are differences.

From what I've figured so far, the error is in the MiscUtil class:

public static byte[] readInputStreamToBytes(@NotNull final InputStream inputStream)
        throws IOException {
    try (InputStream is = inputStream) {
        byte[] targetArray = new byte[is.available()];
        //noinspection ResultOfMethodCallIgnored
        is.read(targetArray);
        return targetArray;
    }
}

The javadoc for InputStream.available states:

Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. [...]
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.

(https://docs.oracle.com/javase/9/docs/api/java/io/InputStream.html#available--)

In our case, the available method returns 1500 for Documents.7z, while the actual size is 1461. The additional bytes are filled with 0 values, which modifies the file. This produces a warning when unpacking the file with 7zip. For the attached txt file, it produces unreadable characters when the file is opened in Notepad++ and whitespace when it is opened in the Windows Editor.
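To make the failure mode concrete, here is a minimal sketch that reproduces the zero padding without any mail library. `OverReportingStream` is a hypothetical stand-in for whatever stream the EML parser returns; it only assumes that `available()` reports more bytes than the stream actually holds, as observed above. `readUsingAvailable` copies the pattern quoted from MiscUtil (without the `@NotNull` annotation).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class AvailableEstimateDemo {

    /** Hypothetical stream whose available() over-reports by a fixed number of bytes. */
    static final class OverReportingStream extends ByteArrayInputStream {
        private final int extra;

        OverReportingStream(byte[] data, int extra) {
            super(data);
            this.extra = extra;
        }

        @Override
        public synchronized int available() {
            // An estimate that is larger than the real remaining byte count.
            return super.available() + extra;
        }
    }

    /** Same pattern as the quoted method: size the buffer from available(), read once. */
    static byte[] readUsingAvailable(InputStream inputStream) throws IOException {
        try (InputStream is = inputStream) {
            byte[] targetArray = new byte[is.available()];
            is.read(targetArray); // return value ignored, as in the quoted code
            return targetArray;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = {1, 2, 3, 4, 5};
        byte[] result = readUsingAvailable(new OverReportingStream(original, 3));
        System.out.println("original length: " + original.length);
        System.out.println("result length:   " + result.length);
        System.out.println("result:          " + Arrays.toString(result));
    }
}
```

With five real bytes and an over-report of three, the returned array has eight elements, and the last three are zeros that were never part of the data. The opposite case is equally possible: if `available()` under-reports, the same pattern silently truncates the attachment.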

I'd suggest replacing everything inside the try block with this (or something similar). It's a slightly modified version of the one Baeldung suggests for when Java 9 is not available.

ByteArrayOutputStream buffer = new ByteArrayOutputStream();
byte[] data = new byte[1024];
int read;
while ((read = is.read(data, 0, data.length)) != -1) {
    buffer.write(data, 0, read);
}

buffer.flush();

byte[] targetArray = buffer.toByteArray();
return targetArray;
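
For reference, this is how the whole method might look with the suggested loop in place. It is a sketch, not necessarily the code that shipped in the library: the class name `StreamReadSketch` is made up for illustration, the `@NotNull` annotation is dropped to keep the example self-contained, and the `flush()` call is left out because `ByteArrayOutputStream` does not buffer beyond its internal array. The `main` method uses a test stream that claims 1500 available bytes while holding 1461, mirroring the numbers reported above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class StreamReadSketch {

    /** Reads the stream until end-of-stream instead of trusting available(). */
    static byte[] readInputStreamToBytes(final InputStream inputStream) throws IOException {
        try (InputStream is = inputStream) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] data = new byte[1024];
            int read;
            // read() returns the number of bytes actually read, or -1 at end of stream.
            while ((read = is.read(data, 0, data.length)) != -1) {
                buffer.write(data, 0, read);
            }
            return buffer.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[1461];
        Arrays.fill(original, (byte) 7);

        // Test stream that over-reports its size, like the one in the issue.
        InputStream in = new ByteArrayInputStream(original) {
            @Override
            public synchronized int available() {
                return 1500;
            }
        };

        byte[] result = readInputStreamToBytes(in);
        System.out.println("identical to original: " + Arrays.equals(original, result));
    }
}
```

On Java 9 or later, `InputStream.readAllBytes()` should give the same result in a single call; the loop above is only needed when older Java versions must be supported.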

example.zip

Thanks so much for your research. I will incorporate your improvement asap.

Released in 6.4.5