Can't unzip .zip files created by OS X 10.12's native zipper
coolaj86 opened this issue · 10 comments
Error
On Mac OS X 10.12.6 the native zip utility (right-click "compress folder") creates a zip file that gives this error:
extra field length exceeds extra field buffer size
Not that I don't believe it's possible for Apple to make a mistake, but my guess is that the underlying zip utility is probably the same Unix / Darwin zip that has been in use for a very long time and that it's statistically more likely to be a problem with this library.
For reference, here is the failing zip file (just a couple of test files for a website):
Example Zip File
I've attached fails-to-parse.zip for the test case.
The contents are as follows:
.
├── css
│ ├── custom.css
│ ├── plugins
│ │ └── bootstrap.css.min
│ └── styles.css
├── index-three.html
├── index-two.html
├── index.html
└── js
├── app.js
├── config.js
└── custom.js
I have also zipped files that do unzip just fine. My guess is that there's some sort of off-by-one byte alignment issue.
Test Case
And a test case (based on the example):
mkdir -p /tmp/yauzl-test
pushd /tmp/yauzl-test
npm install yauzl@latest
touch test.js
wget -c https://github.com/thejoshwolfe/yauzl/files/1420073/fails-to-parse.zip
# => copy code below into test.js <= #
node test.js
test.js
:
'use strict';
var yauzl = require('yauzl');
yauzl.open('./fails-to-parse.zip', function (err, zipfile) {
if (err) throw err;
zipfile.readEntry();
zipfile.on('entry', function (entry) {
if (/\/$/.test(entry.fileName)) {
zipfile.readEntry();
return;
}
zipfile.openReadStream(entry, function(err, readStream) {
if (err) throw err;
readStream.on("end", function() {
zipfile.readEntry();
});
readStream.on("data", function (data) {
console.log("data of length", data.length);
});
});
});
zipfile.on('error', function (err) {
throw err;
});
zipfile.on('end', function () {
console.log("all entries read and processed");
});
});
Thanks for the report. i'll take a look and get back to you.
I'm getting this same error on a file created with command line zip
on macOS.
Tried zipping using BetterZip. Different error:
Error: unexpected EOF
So nothing works for me yet. The weird thing is, this error is being thrown and it's not coming through the callback. AFAIK, zipfile.on('entry', ...)
callback function only has entry
as an argument, no error object.
Side note: is there no easy way to just unzip and write the files to a folder? Without having to mess with streams, readEntry
, etc?
I tried with a few of the other node.js libs namely jszip
, unzipper
(and improved fork of unzip
), and node-stream-zip
and they work.
unzipper
and node-stream-zip
seem to be the best for me.
If you don't want streaming (i.e. an entire 1gb file would have to load into memory) then adm-zip is probably the most loved and respected. If you need browser support too then jszip
may be the right fit.
I went with one called decompress
I believe. Simple, works well. I was on a deadline, so couldn't wait to hear back on this issue.
I am unable to reproduce your issue. The zipfile you linked seems to parse just fine for me. I can explain what that particular error means, but it doesn't seem to be relevant to your zipfile.
Also, I believe decompress
is implemented by using yauzl
behind the scenes, so I'm not sure how switching to that library resolved this issue for you. Perhaps there are different versions of yauzl
at play here.
I tried with a few of the other node.js libs ...
jszip
- i didn't find much API documentation, but it looks like it operates on ram buffers, which means it's unsuitable for large zipfiles.
unzipper
and node-stream-zip
attempt to parse zipfiles from start to finish, which means they don't support some well-formed zipfiles, specifically those using general purpose bit 3. see No Streaming Unzip API in yauzl
's readme.
Here's a fairly harmless looking way to create a zipfile using Info-ZIP (which is the zip
command line program in most unix-like environments), which cannot be parsed with a start-to-finish streaming parser:
$ echo -ne '\x50\x4b\x07\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > file.txt
$ zip -q0 - file.txt | cat > out.zip
My experimentation confirms my theory that unzipper
cannot parse that zipfile.
Here's me showing it fail in about 72 seconds: https://youtu.be/0rKj9TZkPD4
Here are my versions:
yauzl-test@1.0.0 /private/tmp/yauzl-test
└─┬ yauzl@2.8.0
├── buffer-crc32@0.2.13
└─┬ fd-slicer@1.0.1
└── pend@1.2.0
node --version
7.8.0
npm --version
4.2.0
Just updated to latest node, npm, and retried after rm -rf node_modules
.
aj@bowie /p/t/yauzl-test> npm install --save yauzl
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN yauzl-test@1.0.0 No description
npm WARN yauzl-test@1.0.0 No repository field.
+ yauzl@2.8.0
added 4 packages in 4.303s
aj@bowie /p/t/yauzl-test> node test.js
/private/tmp/yauzl-test/test.js:28
throw err;
^
Error: extra field length exceeds extra field buffer size
at /private/tmp/yauzl-test/node_modules/yauzl/index.js:305:83
at /private/tmp/yauzl-test/node_modules/yauzl/index.js:618:5
at /private/tmp/yauzl-test/node_modules/fd-slicer/index.js:32:7
at FSReqWrap.wrapper [as oncomplete] (fs.js:660:17)
aj@bowie /p/t/yauzl-test> node --version
v8.8.1
aj@bowie /p/t/yauzl-test> npm --version
5.4.2
I also tried rm -rf node_modules package-lock.json; npm install --save yauzl
to make sure I got the latest versions.
Ah. Sorry about that. I was so focused on the zip file that I completely neglected to look at your sample code. My bad.
I am able to reproduce the issue now. The video helped!
The problem is that you're using both lazyEntries:false
and readEntry()
. You probably forgot to put {lazyEntries:true}
in your yauzl.open()
call.
This is such a easy mistake to make that I'm adding a check for it to throw an exception rather than this "undefined behavior". This change should make this pitfall much more friendly to navigate. Here's an example of what you get with this change:
$ node asdf.js
/home/josh/dev/yauzl/index.js:231
if (!this.lazyEntries) throw new Error("readEntry() called without lazyEntries:true");
^
Error: readEntry() called without lazyEntries:true
at ZipFile.readEntry (/home/josh/dev/yauzl/index.js:231:32)
at /home/josh/dev/yauzl/asdf.js:8:11
at /home/josh/dev/yauzl/index.js:36:7
at /home/josh/dev/yauzl/index.js:137:16
at /home/josh/dev/yauzl/index.js:623:5
at /home/josh/dev/yauzl/node_modules/fd-slicer/index.js:32:7
at FSReqWrap.wrapper [as oncomplete] (fs.js:629:17)
And just for reference, the {lazyEntries:true}
way is the recommended way, and the only reason it's not the default is to maintain compatibility with earlier versions of yauzl.
Thank you very much. I can confirm that several files that threw errors before now parse as expected...
I'm not sure how I didn't add the lazyEntries: true
before, but I'm just grateful for your help.