ref: https://gitlab.com/alex.stanovoy/mipt-rust/-/tree/master/problems/modules/ripgzip
In this project, you'll implement a simple decompressor for `gzip` files.
Note: the project name is just a reference to `ripgrep`, a blazingly fast `grep` implementation in pure Rust.
The specifications of the `gzip` and `deflate` formats can be found in the following RFCs:
- RFC 1951 (`deflate`)
- RFC 1952 (`gzip`)
Some abstractions were already designed for your convenience. It's suggested to implement them in order:
- `BitReader` - reads the stream byte by byte. To run unit tests, use `cargo test bit_reader`.
- `TrackingWriter` - a writer with a 32 KiB buffer that tracks the count of written bytes and the CRC32 checksum. To run unit tests, use `cargo test tracking_writer`.
- `HuffmanCoding` - Huffman algorithm token decoder. To run unit tests, use `cargo test huffman_coding`. Generic over token type:
  - `TreeCodeToken` - encodes lengths of Huffman codes.
  - `LitLenToken` - encodes the literal or the end of the block.
  - `DistanceToken` - encodes distance.
- `GzipReader` - reads the header and footer of the `gzip` format.
- `DeflateReader` - reads the header of the `deflate` format.
- The actual `decompress` function.
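As an illustration of the first abstraction, here is a minimal sketch of a bit reader. Deflate packs bits starting from the least significant bit of each byte, so the sketch returns them in that order. The struct layout and the method names `read_bits` and `align_to_byte` are assumptions made for this example and may differ from the interface in the provided template.

```rust
use std::io::{self, Read};

/// Hypothetical bit reader: pulls bytes from the underlying stream and hands
/// out bits starting from the least significant bit of each byte.
pub struct BitReader<R> {
    stream: R,
    buffer: u64, // unread bits; bit 0 is the next bit to return
    count: u8,   // number of valid bits currently held in `buffer`
}

impl<R: Read> BitReader<R> {
    pub fn new(stream: R) -> Self {
        Self { stream, buffer: 0, count: 0 }
    }

    /// Reads `len` bits (at most 16 in this sketch). Bit 0 of the result is
    /// the first bit taken from the stream.
    pub fn read_bits(&mut self, len: u8) -> io::Result<u16> {
        assert!(len <= 16, "this sketch supports at most 16 bits per call");
        // Refill one byte at a time until enough bits are buffered.
        while self.count < len {
            let mut byte = [0u8; 1];
            self.stream.read_exact(&mut byte)?;
            self.buffer |= u64::from(byte[0]) << self.count;
            self.count += 8;
        }
        let mask = (1u64 << len) - 1;
        let bits = (self.buffer & mask) as u16;
        self.buffer >>= len;
        self.count -= len;
        Ok(bits)
    }

    /// Drops the unread bits of the current byte, as needed before the
    /// LEN/NLEN fields of a stored block.
    pub fn align_to_byte(&mut self) {
        let extra = self.count % 8;
        self.buffer >>= extra;
        self.count -= extra;
    }
}
```

For example, reading 3 bits and then 6 bits from the bytes `0b1011_0100, 0b0000_0001` yields `0b100` followed by `0b110110`: the second read spans the byte boundary.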
After implementing, also run `./test.py` or `rover test`, since this problem has additional tests.
The only things you cannot change are:
- The `decompress` function in `lib.rs`: it must accept the input and write to the output, since it's tested.
- The `main.rs` file is already implemented for you, but if you want to change it, just make sure the binary accepts the file via `stdin` and outputs the decompressed result to `stdout`.
You can change other details however you like: create new `.rs` files, delete old ones, create directories inside, and so on.
The `anyhow` crate is used for error handling. Don't forget to use `?`, `.context()`, `.with_context()` and `bail!`.
The tests verify that error messages contain a specific substring in the following cases:
- The number of bytes in the `gzip` footer is not as expected: "length check failed".
- The CRC32 is not equal to the one in the `gzip` footer: "crc32 check failed".
- Wrong values of the first two bytes in the `gzip` header: "wrong id values".
- The CRC16 is not equal to the one in the `gzip` header: "header crc16 check failed".
- Unknown compression method in the `gzip` header: "unsupported compression method".
- Unknown block type in the `deflate` header: "unsupported block type".
- In a block with `BTYPE = 00`, the condition `LEN == !NLEN` is violated: "nlen check failed".
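A few of these checks can be sketched as follows. To keep the example dependency-free, it returns plain `String` errors; in the project itself you would presumably raise the same messages through `anyhow` (for instance with `bail!`). The function names are hypothetical. The magic bytes `0x1f 0x8b` and the method value `8` (deflate) come from RFC 1952.

```rust
/// Hypothetical check of the first three gzip header bytes: ID1, ID2, CM.
pub fn check_gzip_header_start(header: [u8; 3]) -> Result<(), String> {
    if header[0] != 0x1f || header[1] != 0x8b {
        return Err(format!(
            "wrong id values: {:#04x} {:#04x}",
            header[0], header[1]
        ));
    }
    // CM = 8 means deflate; this sketch treats every other value as unsupported.
    if header[2] != 8 {
        return Err(format!("unsupported compression method: {}", header[2]));
    }
    Ok(())
}

/// Hypothetical check for a stored block (BTYPE = 00): NLEN must be the
/// one's complement of LEN.
pub fn check_stored_len(len: u16, nlen: u16) -> Result<(), String> {
    if len != !nlen {
        return Err(format!("nlen check failed: len = {}, nlen = {}", len, nlen));
    }
    Ok(())
}
```

Putting the required substring at the start of the message and appending the offending values keeps the tests satisfied while still making the error useful when debugging.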
- For logging, use the `log` crate. The most important macros from it are `error!`, `warn!`, `info!`, `debug!` and `trace!`. Only errors and warnings are logged by default. Use the flags `-v`, `-vv`, and `-vvv` to log more levels.
- For convenient byte reading, use the `ReadBytesExt` trait from the `byteorder` crate. You'll definitely need `.read_u8()` and `.read_u32::<LittleEndian>()`, and maybe some more functions.
- To calculate the CRC, use the `crc` crate. You'll need `Crc<u32>` and the `CRC_32_ISO_HDLC` algorithm.
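To make the last two tips concrete, here is a dependency-free sketch of what they compute. It is not a replacement for the `crc` and `byteorder` crates, which the project expects you to use. The first function is a plain bitwise CRC-32/ISO-HDLC, and the second parses the 8-byte `gzip` footer with `u32::from_le_bytes`, the standard-library counterpart of a little-endian `u32` read. Both function names are hypothetical.

```rust
/// Bitwise CRC-32/ISO-HDLC: reflected polynomial 0xEDB88320, with an initial
/// value and a final XOR of 0xFFFFFFFF.
pub fn crc32_iso_hdlc(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // `mask` is all ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Parses the gzip footer: CRC32 of the uncompressed data, followed by ISIZE
/// (the uncompressed size modulo 2^32), both little-endian.
pub fn parse_footer(footer: [u8; 8]) -> (u32, u32) {
    let crc = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]);
    let size = u32::from_le_bytes([footer[4], footer[5], footer[6], footer[7]]);
    (crc, size)
}
```

The standard check value for this CRC variant is `0xCBF43926` for the ASCII input `123456789`, which is a convenient sanity test for whichever implementation you end up using.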