Binary file seems to be corrupted with method Deflated
mass10 opened this issue · 14 comments
Summary
Binary file seems to be corrupted with method zip::CompressionMethod::Deflated
.
Code to reproduce
use std::io::{Read, Write};
/// Example of deflate.
fn main() -> Result<(), Box<dyn std::error::Error>> {
// ARCHIVE
{
// START
let w = std::fs::File::create("notepad.exe.zip")?;
let mut archiver = zip::ZipWriter::new(w);
let options = zip::write::FileOptions::default();
// let options = options.compression_method(zip::CompressionMethod::Stored); // SAFE
let options = options.compression_method(zip::CompressionMethod::Deflated); // CORRUPT🔥
archiver.start_file("notepad.exe", options)?;
// READ
let mut stream = std::fs::File::open("C:\\Windows\\system32\\notepad.exe")?;
loop {
let mut buffer = [0; 1000];
let bytes_read = stream.read(&mut buffer)?;
if bytes_read == 0 {
break;
}
// WRITE
let write_buffer = &buffer[..bytes_read];
archiver.write(&write_buffer)?;
}
// COMPLETE
archiver.finish().unwrap();
}
// UNZIP
{
let command = std::process::Command::new("wsl.exe").args(&["unzip", "notepad.exe.zip"]).spawn()?.wait()?;
println!("Command exited with: {}", command);
}
// ORIGINAL FILE SIZE
{
let meta = std::fs::metadata("C:\\Windows\\system32\\notepad.exe")?;
println!("Original NOTEPAD is: {} bytes.", meta.len());
}
// EXTRACTED FILE SIZE
{
let meta = std::fs::metadata("notepad.exe")?;
println!("Extracted NOTEPAD is: {} bytes.", meta.len());
}
return Ok(());
}
Result is
Original NOTEPAD is: 201216 bytes.
Extracted NOTEPAD is: 200903 bytes.
Thank you!
zip::CompressionMethod::Zstd
sounds good.
But I found neither of the following supports ZStandard:
- 7-zip
- explorer.exe
- unzip
Just ran into this also. Have to use Stored
for now...
You are using archiver.write
(an implementation of std::io::Write::write
) and ignoring the return value. You should be calling write
in a loop checking the count of bytes written, or use archiver.write_all
.
https://doc.rust-lang.org/stable/std/io/trait.Write.html#tymethod.write
This function will attempt to write the entire contents of buf, but the entire write might not succeed, or the write may also generate an error. A call to write represents at most one attempt to write to any wrapped object.
Oh, It was fixed!
Thank you!
I'm using write_all
and it still fails, here's my impl: https://github.com/moonrepo/starbase/blob/master/crates/archive/src/zip.rs
@milesj I thought I want to use fs::read_file_bytes()
instead of fs::read_file()
in your code.
When I try that I get this error "stream did not contain valid UTF-8", which is the same as using fs::read_file(file)?.as_bytes()
.
Both are interesting, since these test cases are very simple.
I also tried this, which failed with the same error.
let mut buffer = Vec::new();
let mut stream = fs::open_file(file)?;
stream.read_to_end(&mut buffer).unwrap();
I feel like something else may be going on.
I tried this.
/// Succeeds🌔
fn diagnose_1(path: &str) {
let data = std::fs::read(path).unwrap();
println!("(1) {} is {} bytes.", path, data.len());
}
/// Succeeds🌔
fn diagnose_2(path: &str) {
use std::io::Read;
let mut file = std::fs::File::open(path).unwrap();
let mut buffer = Vec::new();
file.read_to_end(&mut buffer).unwrap();
println!("(2) {} is {} bytes.", path, buffer.len());
}
/// Fails🌒
fn diagnose_3(path: &str) {
let text = std::fs::read_to_string(path).unwrap(); // 🔥stream did not contain valid UTF-8
let data = text.as_bytes();
println!("(3) {} is {} bytes.", path, data.len());
}
pub fn main() {
diagnose_1(r"C:\Windows\system32\notepad.exe"); //🌔
diagnose_2(r"C:\Windows\system32\notepad.exe"); //🌔
diagnose_3(r"C:\Windows\system32\notepad.exe"); //🌒
}
Result was
(1) C:\Windows\system32\notepad.exe is 201216 bytes.
(2) C:\Windows\system32\notepad.exe is 201216 bytes.
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: InvalidData, message: "stream did not contain valid UTF-8" }', src\test2.rs:18:46
It's weird, because even if I do this it fails:
self.archive
.start_file(name, options)
.map_err(|error| ZipError::AddFailure {
source: file.to_path_buf(),
error,
})?;
self.archive
.write_all(&std::fs::read(file).unwrap())
.map_err(|error| FsError::Write {
path: file.to_path_buf(),
error,
})?;
I wonder if this abstraction is causing issues once compiled.
Example PR: moonrepo/starbase#33
Not sure which file is the problem, but this is the fixture: https://github.com/moonrepo/starbase/tree/master/crates/archive/tests/__fixtures__/archives
They are very simple files.
I figured out the problem, was my mistake. The issue was I was using by_index_raw
instead of by_index
when unzipping.
I'm using Copilot and I think it generated that 😞
That's good! Everything has become clear.