zip-rs/zip-old

Unicode path not supported?

qiuzhanghua opened this issue · 3 comments

I'm using this crate to unzip an archive whose entry paths contain Unicode names, and the extracted names don't look correct.

https://github.com/qiuzhanghua/unzip

cargo run

should be

File 0 extracted to "中文/"
File 1 extracted to "中文/.DS_Store" (6148 bytes)
File 2 extracted to "__MACOSX/中文/._.DS_Store" (120 bytes)
File 3 extracted to "中文/folder/"
File 4 extracted to "中文/目录/"

instead of

File 0 extracted to "中文/"
File 1 extracted to "中文/.DS_Store" (6148 bytes)
File 2 extracted to "__MACOSX/中文/._.DS_Store" (120 bytes)
File 3 extracted to "中文/folder/"
File 4 extracted to "中文/目录/"

Thank you for the example archive! I'll have a look at how it's representing the filenames. Encoding in zip files is a little troublesome; I imagine this is using the Unicode Extra Data field, which we don't support.

See also #188

This archive is not using the Unicode Extra Data field. It encodes the file names in UTF-8 but does not set the UTF-8 bit (bit 11) in the general purpose flags, so a reader has to assume the file names are CP437. This isn't a bug in zip; it's a bug in the program that created the archive.
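For anyone hitting the same thing, here is a minimal sketch of the decision described above. It is not code from the crate: `NameEncoding`, `encoding_per_spec` and `encoding_lenient` are hypothetical names, and the lenient variant is only one possible workaround (treat the name as UTF-8 when the flag is unset but the raw bytes are valid, non-ASCII UTF-8).

```rust
/// Bit 11 of the ZIP general purpose bit flag: file name and comment are UTF-8.
const UTF8_FLAG: u16 = 1 << 11;

/// Hypothetical type for this sketch, not part of the zip crate.
#[derive(Debug, PartialEq)]
enum NameEncoding {
    Utf8,
    Cp437,
}

/// Strict reading: only the flag decides how the name bytes are interpreted.
fn encoding_per_spec(flags: u16) -> NameEncoding {
    if flags & UTF8_FLAG != 0 {
        NameEncoding::Utf8
    } else {
        NameEncoding::Cp437
    }
}

/// Lenient heuristic (an assumption, not the crate's behaviour): if the flag
/// is unset but the bytes contain non-ASCII data that is valid UTF-8, guess
/// UTF-8. Pure ASCII names are the same in both encodings, so they stay CP437.
fn encoding_lenient(flags: u16, raw_name: &[u8]) -> NameEncoding {
    if flags & UTF8_FLAG != 0 {
        return NameEncoding::Utf8;
    }
    let has_non_ascii = raw_name.iter().any(|b| !b.is_ascii());
    if has_non_ascii && std::str::from_utf8(raw_name).is_ok() {
        NameEncoding::Utf8
    } else {
        NameEncoding::Cp437
    }
}

fn main() {
    // Same situation as the archive in this issue: UTF-8 bytes, flag not set.
    let raw_name = "中文/目录/".as_bytes();
    let flags: u16 = 0;

    println!("per spec: {:?}", encoding_per_spec(flags));
    println!("lenient:  {:?}", encoding_lenient(flags, raw_name));
}
```

The heuristic can guess wrong, since some genuine CP437 names also happen to be valid UTF-8 byte sequences, so it is best kept opt-in. With this crate, the undecoded bytes should be available through `ZipFile::name_raw()`, which a caller could pass to `std::str::from_utf8` as a workaround for archives like this one.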