rust-lang/flate2-rs

Extract gzip header into a crate? Maybe other metadata?

Opened this issue · 7 comments

In my zopfli implementation, it would be useful to have your gzip header (into_header) and footer (do_finish) code, but without the actual compression logic. The C implementation of zopfli was actually kind of lazy and doesn't set the mtime or OS codes in the gzip header, so that adds weight in my mind that a crate would be useful for people who don't want to implement it themselves!
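For context, here is a minimal sketch of the header part being discussed, assuming the fixed 10-byte layout from RFC 1952 with no optional fields. The function name gzip_header and its parameters are hypothetical, chosen for illustration only; this is not flate2's into_header.

```rust
/// Build a minimal 10-byte gzip member header (RFC 1952) with no optional
/// fields. Hypothetical helper for illustration, not part of flate2.
///
/// `mtime` is the modification time in seconds since the Unix epoch
/// (0 means "not available"); `os` is the OS code (3 = Unix, 255 = unknown).
fn gzip_header(mtime: u32, os: u8) -> [u8; 10] {
    // MTIME is stored little-endian.
    let m = mtime.to_le_bytes();
    [
        0x1f, 0x8b, // ID1, ID2: gzip magic bytes
        8,          // CM: compression method 8 = deflate
        0,          // FLG: no extra field, file name, comment, or header CRC
        m[0], m[1], m[2], m[3], // MTIME
        0,          // XFL: no extra flags
        os,         // OS: originating file system
    ]
}
```

Calling gzip_header(0, 255) would reproduce a header with mtime and OS left unset, while passing a real timestamp and an OS code of 3 fills in the fields mentioned above.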

Mostly opening this to get your opinion-- do you think it'd be worthwhile to extract this into a separate crate, and would you accept a PR that used that crate in this library?

Maybe some of the other formats' metadata generation too, but I'm not as familiar with those.

Yeah I'd be fine to extract this out to a separate type in the module at the very least, and perhaps yeah another crate. This may also end up interacting with #43 as well!

I've started a crate here based on the code in flate2 as I needed it for the deflate crate. I don't know if it's usable for flate2 due to the extra dependency on crc-32 (as depending on miniz/zlib for crc functionality wouldn't make sense), though it may be of interest for other crates.

Also let me know if there is anything lacking when it comes to attribution, not sure if what's there right now is sufficient.

Thanks for the update @oyvindln! I may not update flate2 just yet due to the extra deps, but it may be of interest to @carols10cents

A small update if it's of interest to anyone:
The dependencies for gzip-header are now cleaned up. I avoided enum-primitive as it wasn't really needed, so crc is the only dependency now. The latest version of the crc crate has also removed the dependency on lazy_static and only depends on build_const and the internal crc-core crate.

Keats commented

I can confirm it would be very nice to export a method to render the gzip header and the CRC writing when you have to use the Compress struct. I have seen the crate mentioned above, but it really is a matter of a couple of lines of change to make it nice, and whether it comes from flate2 directly or is re-exported is not really important.

Namely, GzBuilder::into_header is currently private, and as far as I can tell there is no other way to generate a header. Any reason this has to be private? You can hardcode the header if you need to, but it would be nice to have this public.

Another thing is the code writing the CRC bytes from https://github.com/alexcrichton/flate2-rs/blob/master/src/gz/write.rs#L99-L114 . Could the byte slice creation be a public method of the Crc struct instead?
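As a rough sketch of what such a byte-slice helper could produce, assuming the gzip trailer layout from RFC 1952 (CRC-32 of the uncompressed data, then its length modulo 2^32, both little-endian): the names crc32 and gzip_trailer are hypothetical, and the bitwise CRC is a dependency-free stand-in, not the implementation flate2 uses.

```rust
/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320), as used by gzip.
/// Slow but dependency-free; a stand-in for a real CRC implementation.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones if the low bit is set, all zeros otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Build the 8-byte gzip trailer for the given uncompressed input.
/// Hypothetical helper for illustration, not part of flate2.
fn gzip_trailer(uncompressed: &[u8]) -> [u8; 8] {
    let mut out = [0u8; 8];
    // CRC32 field: checksum of the uncompressed data, little-endian.
    out[..4].copy_from_slice(&crc32(uncompressed).to_le_bytes());
    // ISIZE field: input length modulo 2^32, little-endian.
    out[4..].copy_from_slice(&(uncompressed.len() as u32).to_le_bytes());
    out
}
```

A caller driving the Compress struct by hand would append these 8 bytes after the raw deflate stream to finish the gzip member.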

I can do a PR for both if that's acceptable.

The current interface was never designed to be exposed; it's just grown organically over time. After a once-over to make sure it's feasible, exposing it publicly seems plausible.