/word-compressor

Trims word lists for easier typing practice

Primary Language: Python

Word Compression

One of the most popular tests used to practice typing is a random sequence of common words. However, these word lists generally contain a lot of repetition. As a consequence, you'll inevitably practice certain words more often than necessary and other words not often enough.

This tool determines which words in a given list are redundant for practice and discards them while keeping all others. It does this by calculating the set of trigrams (three-character sequences) that would make up a test and then shrinking the list as far as it can while that set stays the same size.
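The trigram-based reduction described above could be sketched roughly as follows. This is a minimal illustration and not necessarily how the tool itself is implemented: the function names `trigrams`, `trigram_set`, and `compress` are hypothetical, and it assumes each word is padded with one space on either side so that word boundaries and short words still produce trigrams. Trigrams that span two neighbouring words in a test are not modelled here. It uses a simple greedy pass that drops a word whenever every one of its trigrams is still covered by another remaining word.

```python
from collections import Counter


def trigrams(word: str) -> set[str]:
    """Return the trigrams of a word, padded with one space on each side.

    The padding is an assumption of this sketch, not something the
    README specifies.
    """
    padded = f" {word} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_set(words: list[str]) -> set[str]:
    """Return every trigram produced by any word in the list."""
    return {t for w in words for t in trigrams(w)}


def compress(words: list[str]) -> list[str]:
    """Greedily drop words whose trigrams are all covered by other words."""
    unique = list(dict.fromkeys(words))  # remove duplicates, keep order
    # How many remaining words contain each trigram.
    counts = Counter(t for w in unique for t in trigrams(w))
    removed = set()
    # Walk from the end of the list so earlier words are preferred.
    for word in reversed(unique):
        grams = trigrams(word)
        if all(counts[g] > 1 for g in grams):
            # Every trigram also occurs in another kept word: safe to drop.
            for g in grams:
                counts[g] -= 1
            removed.add(word)
    return [w for w in unique if w not in removed]


if __name__ == "__main__":
    sample = ["the", "then", "hen", "he"]
    reduced = compress(sample)
    print(reduced)
    print(trigram_set(reduced) == trigram_set(sample))
```

A greedy pass like this keeps the trigram set intact but is not guaranteed to find the smallest possible list, and the result depends on the order in which words are visited.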

Results

Here are some wordlists from the popular typing website Monkeytype and their word counts before and after compression: