/Huffman_Encoder

A C++ implementation of the Huffman Encoding algorithm for text files.

Primary LanguageC++

Huffman Coding Compression Project

Noah Himed

12 September 2020

This repo contains software implementation of the Huffman Coding file compression algorithm meant to compress text files.

C++ files implementing the algorithm may be found in the src subdirectory, while examples of text files compressed by the software may be found under examples.

Running the Software

To build the project, run the makefile that can be found in src to ensure that source code is compiled with the correct version of C++ specified for g++. This will generate an executable called huffman_encoder

To compress a file, run the following command:

./[exe name] [path to .huf or .txt file]

Doing so with a file path to a text file will generate a compressed file (same name but with the extension .huf) in the same direcotory, while running the command with a file path to a compressed .huf file will generate a decompressed text file, again of the same name but with a .txt extension.

Example

Using the executable generated by the provided makefile, to compress a text file containing the entire contents of Lewis Carroll's novel "Alice's Adventures in Wonderland", a user might run a command like the following:

./huffman_encoder /Users/johndoe/Documents/alice29.txt

Doing so will generate a file called alice29.huf in the same directory.

To decompress the resulting .huf file generated by the above command, a user may run an additional command like the following to get back the original text file:

./huffman_encoder /Users/janedoe/Downloads/alice29.huf

Doing so will generate a file called alice29.txt in the same directory.

Reduction in File Size

For very small files on the order of a few bytes, this software will tend to create a .huf file of greater size than the original .txt file. However, for text files much bigger, the .huf file generated will be much smaller than the original file. As text files increase in size, the resulting .huf files will gradually approach a size of ~57% that of the original text files.

Below is a table summarizing the results of running the software on the files provided in the examples folder:

File Name Original Size Compressed Size Compressed File to Original File Size Ratio
Sam.txt 13B 23B 176.92%
asyoulik.txt 122KB 74KB 60.66%
lcet10.txt 417KB 245KB 58.75%
alice29.txt 149KB 86KB 57.72%
plrabn12.txt 471KB 269KB 57.11%