maennchen/ZipStream-PHP

ZipStream Saves Multiple Files With Same Path

oleibman opened this issue · 6 comments

Description of the problem

See issue 1516 in PhpSpreadsheet:
PHPOffice/PhpSpreadsheet#1516
The code attempts to save two files in a zip archive with identical paths.
With PHP's native zip support (ZipArchive), only one copy is saved (presumably the second,
but the file contents are almost certainly identical).
With ZipStream, both copies are saved under the same path, and Excel cannot open the
file correctly.
It is not clear to me whether this is a bug. If it is, can it be fixed?
If it is not, is there an option which could be used to avoid saving the duplicate?
If not, can you add one?


Example code

See the report in PhpSpreadsheet.

Information

  • ZipStream-PHP version: 2.0.0
  • PHP version: 7.2, 7.3, 7.4


Hello,

Please provide sample code to reproduce the issue.

Here is code where ZipStream creates two entries with identical paths in the zip archive:

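// Assumes ZipStream-PHP 2.x installed via Composer.
require 'vendor/autoload.php';

use ZipStream\Option\Archive;
use ZipStream\ZipStream;

// Write the archive to a local file instead of streaming to the browser.
$fileHandle = fopen('xzip.zip', 'wb');
$options = new Archive();
$options->setEnableZip64(false);
$options->setOutputStream($fileHandle);
$zip = new ZipStream(null, $options);

// Same path added twice: both entries end up in the archive.
$zip->addFile('file1.txt', 'file1 data');
$zip->addFile('file1.txt', 'file1 data');
$zip->finish();
fclose($fileHandle);

echo "Saved xzip.zip\n";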

Here is code where ZipArchive creates only one entry in the zip archive, despite the attempt to add it twice:

$zip = new ZipArchive();
if ($zip->open('xzip2.zip', ZipArchive::CREATE) === true) {
    $zip->addFromString('file1.txt', 'file1 data');
    $zip->addFromString('file1.txt', 'file1 data');
    $zip->close();
}

echo "Saved xzip2.zip\n";

xzip.zip
xzip2.zip

xzip.zip created by ZipStream; xzip2.zip created by ZipArchive
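
To check the two attachments for repeated entry names, something like the following could be used. This is only a sketch: `zipEntryNames()` and `duplicateNames()` are hypothetical helper names, not part of either library, and it assumes ZipArchive lists every entry of the archive even when two share a name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: collect all entry names from an existing archive.
 * Requires ext-zip.
 */
function zipEntryNames(string $zipPath): array
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException("Cannot open $zipPath");
    }
    $names = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($name !== false) {
            $names[] = $name;
        }
    }
    $zip->close();
    return $names;
}

/**
 * Hypothetical helper: return each name that occurs more than once.
 */
function duplicateNames(array $names): array
{
    $counts = array_count_values($names);
    $repeated = array_filter($counts, function (int $n): bool {
        return $n > 1;
    });
    // Numeric-looking names become int keys in PHP arrays; cast them back.
    return array_map('strval', array_keys($repeated));
}
```

Calling `duplicateNames(zipEntryNames('xzip.zip'))` should then report `file1.txt` if both copies are present, and return an empty array for `xzip2.zip`.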

Yes, we can definitely label that behaviour as a bug. I'm not sure what would be the best way to squash it. @maennchen any idea?

@NicolasCARPi I think the answer here is: it depends

We could potentially keep a list of existing paths and reject duplicates. However, that list grows with every entry added, so memory usage could become a problem for zip files that contain a lot of files.
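
For illustration, the "keep a list of existing paths" idea could also live on the caller's side, without touching ZipStream. A minimal sketch, assuming exact string comparison of paths is good enough (no normalisation of slashes or case); `SeenPaths` and `claim()` are hypothetical names, not part of the ZipStream API:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical caller-side guard: remembers which entry paths were
 * already added. Memory use grows by one array key per distinct path.
 */
final class SeenPaths
{
    /** @var array<string, true> */
    private $seen = [];

    /** Returns true the first time a path is claimed, false afterwards. */
    public function claim(string $path): bool
    {
        if (isset($this->seen[$path])) {
            return false;
        }
        $this->seen[$path] = true;
        return true;
    }
}

$entries = [
    ['file1.txt', 'file1 data'],
    ['file1.txt', 'file1 data'], // duplicate path, will be skipped
    ['file2.txt', 'file2 data'],
];

$seen = new SeenPaths();
$added = [];
foreach ($entries as [$path, $data]) {
    if (!$seen->claim($path)) {
        continue; // skip (or throw) instead of writing a second entry
    }
    // A real caller would invoke $zip->addFile($path, $data) here.
    $added[] = $path;
}
```

Because the first copy has already been streamed by the time the second arrives, a guard like this can only keep the first entry and drop later ones, which is the opposite of the ZipArchive result described above.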

We certainly can't overwrite the earlier entry, since that would mean modifying data that has already been written, which is incompatible with streaming the zip file.

In my opinion we should leave the code unchanged, document the behavior, and warn people not to do that.