The Task: Losslessly compress the 1GB file enwik9 to less than 114MB. (Current Benchmark)
This compression contest (Hutter Prize) is motivated by the fact that being able to compress well is closely related to acting intelligently, thus reducing the slippery concept of intelligence to hard file size numbers.
Landauer's principle is a physical principle pertaining to the lower theoretical limit of energy consumption of computation. It holds that an irreversible change in information stored in a computer, such as merging two computational paths, dissipates a minimum amount of heat to its surroundings.
Landauer's principle states that the minimum energy needed to erase one bit of information is proportional to the temperature at which the system is operating. More specifically, the energy needed for this computational task is given by
where
How close is the brain? And what is the opposite of the Landauer limit? And therefore, the limit of intelligence? How far off are we as the human species?
# configs
#
# -a <algorithm> Options: 'huffman'
# -d <input file>
python boiler.py -d data/enwik4 -a huffman
Within and after development, there is a test script for evaluating the correctness of the different algorithms.
python -m unittest
- Hutter Prize: Frequently Asked Questions & Answers
- Huffman Coding: Decent starting point for lossless compression.
- Arithmetic Coding (AC) is a form of entropy encoding used in lossless data compression.
- enwik: About the Test Data.