Primary language: Go. License: MIT.

spdc (Sparse DeCompresser)

spdc is a high-performance tool written in Go for decompressing gzip files while preserving their sparseness. When large disk images or other large sparse files are decompressed with standard gzip tools, the sparseness is lost and the output consumes its full apparent size on disk. spdc keeps the output sparse, which saves disk space.

Features

  • Decompress gzip files while preserving their sparse properties to save disk space.
  • High-speed decompression leveraging Go's built-in concurrency model.
  • Supports .gz (gzip) files only.

Why output a sparse file?

A large disk image often has a large apparent size but contains little actual data. If you decompress such an image with gzip, the space it occupies on disk equals its full apparent size.

$ ls -lah ubuntu-2204.img.gz
-rw-r--r--  1 masahiro331  staff   760M  6 19 23:53 ubuntu-2204.img.gz

$ file ubuntu-2204.img.gz
ubuntu-2204.img.gz: gzip compressed data, was "ubuntu-2204.img", last modified: Sun Jun 18 12:17:19 2023, from Unix, original size modulo 2^32 0

Use spdc to decompress the image file. The result occupies only 1.8G on disk.

$ time ./spdc ubuntu-2204.img.gz
real    1m8.568s
user    0m56.180s
sys     0m50.441s

$ ls -lah ubuntu-2204.img
-rw-r--r--  1 masahiro331  staff    40G  6 19 23:56 ubuntu-2204.img

$ du -h ubuntu-2204.img
1.8G    ubuntu-2204.img

Use gzip to decompress the image file. The result occupies the full 40G on disk.

$ time gzip -d ubuntu-2204.img.gz
real    1m39.572s
user    1m5.913s
sys     0m7.954s

$ ls -lah ubuntu-2204.img
-rw-r--r--  1 masahiro331  staff    40G  6 19 23:53 ubuntu-2204.img

$ du -h ubuntu-2204.img
 40G    ubuntu-2204.img

Usage

spdc [filename]

Installation

You can install spdc directly from the GitHub repository with go install, or download a release from GitHub.

go install github.com/masahiro331/spdc/cmd/spdc@latest

Prerequisites

Go 1.19 or higher

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.