This repo contains software implementation of the Huffman Coding file compression algorithm meant to compress text files.
C++ files implementing the algorithm may be found in the src
subdirectory, while examples of text files compressed by the software
may be found under examples
.
To build the project, run the makefile that can be found in src
to ensure
that source code is compiled with the correct version of C++ specified for g++.
This will generate an executable called huffman_encoder
To compress a file, run the following command:
./[exe name] [path to .huf or .txt file]
Doing so with a file path to a text file will generate a compressed file
(same name but with the extension .huf
) in the same direcotory, while
running the command with a file path to a compressed .huf
file will generate a
decompressed text file, again of the same name but with a .txt
extension.
Using the executable generated by the provided makefile, to compress a text file containing the entire contents of Lewis Carroll's novel "Alice's Adventures in Wonderland", a user might run a command like the following:
./huffman_encoder /Users/johndoe/Documents/alice29.txt
Doing so will generate a file called alice29.huf
in the same directory.
To decompress the resulting .huf
file generated by the above command, a user may run an additional
command like the following to get back the original text file:
./huffman_encoder /Users/janedoe/Downloads/alice29.huf
Doing so will generate a file called alice29.txt
in the same directory.
For very small files on the order of a few bytes, this software will tend to create a .huf
file
of greater size than the original .txt
file. However, for text files much bigger, the .huf
file generated
will be much smaller than the original file. As text files increase in size, the resulting .huf
files will
gradually approach a size of ~57% that of the original text files.
Below is a table summarizing the results of running the software on the files provided in the examples
folder:
File Name | Original Size | Compressed Size | Compressed File to Original File Size Ratio |
---|---|---|---|
Sam.txt | 13B | 23B | 176.92% |
asyoulik.txt | 122KB | 74KB | 60.66% |
lcet10.txt | 417KB | 245KB | 58.75% |
alice29.txt | 149KB | 86KB | 57.72% |
plrabn12.txt | 471KB | 269KB | 57.11% |