brendan-duncan/archive

Reusing `AchiveFile` can result in corrupted archives

Opened this issue · 8 comments

Here are some patterns that previously worked with package:archive that no longer with 3.5.1

final destinationArchive = Archive();
final sourceArchive = ZipDecoder().decodeBytes(bytes);
for (final file in sourceArchive.files) {
  destinationArchive.addFile(file);  // content not set, destinationArchive is corrupted
}

I think that this is the problematic code

final archive = ...
ZipEncoder.encode(archive)
final destinationArchive = Archive();
destinationArchive.addFile(archive.findFile('foo')). // encode closed the file, destinationArchive is corrupted

It would be nice if, in these cases:

  1. it would be allowed to use an ArchiveFile in multiple contexts OR
  2. using it in a disallowed context resulted in a StateError rather than a corrupted archive

I discovered these issues in the context of the Google build system, where package:archive is extensively used (thanks!). I can extract minimum reproductions and/or tests, if that would be helpful.

Also, I am currently working on a 4.0 branch, https://github.com/brendan-duncan/archive/tree/4.0. I'll make sure the fix for this gets into that branch. If there are any breaking changes for the build system in the 4.0 branch, if that's even possible to check, I can make sure they are looked into before I ship it.

The issue in 3.5.1 isn't adding an ArchiveFile to another Archive, they can exist in multiple. There is a bug with encoding a new zip before the the data of the file was loaded. I'll get that fixed up asap.

This issue also revealed a significant design flaw in my 4.0 branch, so I got some more work to do there.

I found the issue. autoClose named arg was added to encode, and set to true. It was intended to free up memory when encoding an archive. Maybe not such a good idea after all, since it breaks this behavior that I didn't anticipate.

I can switch it to false to maintain the previous behavior. ZipEncoder().encode(archive, autoClose: false); is manually using the previous behavior.

Could changing the behavior back not break people relying on the new behavior?

Changing it to false wouldn't break anything, unless they are relying on the fact that the memory was freed on encode, but it wouldn't break any logic.

The whole logic of ZipFile content is a mess. One of the reasons why I wanted to do a major version update, try and wrangle that beast.

I can't believe how many people want to unzip a 4GB+ zip file on an old Android. People want to do weird stuff.

I think that this PR at least partially fixes this issue: #337