Primary Language: C#

Wordlist Tool

A chainable, high-performance tool for manipulating wordlists. Wordlists can be used for many purposes: word games, allow or deny lists, and even password cracking.

Usage

Usage:
  wl [command] [options]

Options:
  --encoding <encoding>        Default encoding of the wordlist file. [default: ASCII]
  --line-ending <line-ending>  Default line ending sequence. [default: 0A]
  --buffer-size <buffer-size>  Default buffer size for reading and writing. [default: 16384]
  --version                    Show version information
  -?, -h, --help               Show help and usage information

Commands:
  transform       Transform entries.
  filter          Filter entries.
  sort            Sort entries.
  list            Transform list.
  merge           Merge entries from multiple lists.
  split           Split a list into multiple lists.
  extract         Extract entries from other formats.
  generate        Generate entries.

For example:

wl extract regex --inputs books/*.txt --output OUT --regex \w+ |
wl filter distinct IN OUT |
wl sort entries IN OUT |
wl list take IN output.txt --count 1000
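To make the stages of this pipeline concrete, here is a rough Python sketch of what each step is described as doing. It is an illustration only, not the tool's implementation: the function name `build_wordlist` and the sample texts are hypothetical, and Python's `re` and default string ordering stand in for whatever the tool uses.

```python
import re

def build_wordlist(texts, limit=1000, pattern=r"\w+"):
    """Mimic extract -> distinct -> sort -> take on in-memory strings."""
    # extract: pull every regex match out of every input text
    words = (m.group(0) for text in texts for m in re.finditer(pattern, text))
    # filter distinct: drop repeated entries
    unique = dict.fromkeys(words)
    # sort: ascending order
    ordered = sorted(unique)
    # take: keep only the first `limit` entries
    return ordered[:limit]

sample = ["the cat sat", "the dog sat down"]
print(build_wordlist(sample, limit=3))  # ['cat', 'dog', 'down']
```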

Sort

Sort entries in ascending order:

wl sort entries list.txt

Output can be saved to another file:

wl sort entries list.txt output.txt

Sort entries in descending order:

wl sort entries list.txt --descending

Sort entries by length:

wl sort length list.txt

Sort entries by length in descending order:

wl sort length list.txt --descending

Reverse sort order:

wl sort reverse list.txt
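As a rough illustration of how these orderings differ, the following Python sketch applies each one to a small list. The sample entries are made up, and the sketch assumes plain code-point comparison and a stable tie-break for equal lengths; the tool's actual comparison rules are not stated here.

```python
entries = ["pear", "fig", "banana", "kiwi"]

ascending = sorted(entries)                              # like: sort entries
descending = sorted(entries, reverse=True)               # like: sort entries --descending
by_length = sorted(entries, key=len)                     # like: sort length
by_length_desc = sorted(entries, key=len, reverse=True)  # like: sort length --descending
reversed_order = entries[::-1]                           # like: sort reverse

print(ascending)       # ['banana', 'fig', 'kiwi', 'pear']
print(by_length)       # ['fig', 'pear', 'kiwi', 'banana']
print(reversed_order)  # ['kiwi', 'banana', 'fig', 'pear']
```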

Filter

Remove whitespace entries:

wl filter whitespace list.txt output.txt

Remove duplicates:

wl filter distinct list.txt output.txt

Filter entries that contain a match for a regular expression:

wl filter regex list.txt output.txt \d+

Filter entries that match a regular expression in full:

wl filter regex list.txt output.txt \A\d+\Z
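The difference between a "contains" match and a full match can be sketched in Python as follows. This only shows which entries each pattern selects; Python's `re` module stands in for the tool's regex engine, so edge cases may differ, and the sample entries are made up.

```python
import re

entries = ["abc", "123", "abc123", "42x"]

# \d+ matches anywhere inside the entry
contains_digits = [e for e in entries if re.search(r"\d+", e)]

# \A and \Z anchor the pattern to the start and end of the entry
only_digits = [e for e in entries if re.search(r"\A\d+\Z", e)]

print(contains_digits)  # ['123', 'abc123', '42x']
print(only_digits)      # ['123']
```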

Filter entries by minimum length:

wl filter min-length list.txt output.txt --length 3

Filter entries by maximum length:

wl filter max-length list.txt output.txt --length 3

Transform

Trim leading and trailing whitespace:

wl transform trim list.txt output.txt

Convert entries to uppercase:

wl transform uppercase list.txt output.txt

Convert entries to lowercase:

wl transform lowercase list.txt output.txt

Prepend to entries:

wl transform prepend list.txt output.txt --value "prefix"

Append to entries:

wl transform append list.txt output.txt --value "suffix"

Reverse entries:

wl transform reverse list.txt output.txt

Do not change entries. This transform is useful either to display a list:

wl transform identity list.txt OUT

Or to change the encoding or line endings of a list:

wl transform identity in.txt out.txt --input-encoding UTF8 --output-encoding ASCII

Merge

Concatenate multiple lists together:

wl merge concat --inputs list1.txt list-*.txt --output output.txt

Concatenate multiple lists together as raw bytes without reprocessing, separated by --output-line-ending. This is the fastest way to concatenate lists:

wl merge binary-concat --inputs list1.txt list-*.txt --output output.txt

Union of multiple lists:

wl merge union --inputs list1.txt list-*.txt --output output.txt
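A minimal Python sketch of how concatenation and union differ, assuming that union means "every distinct entry across all inputs" and that the first occurrence decides the position. Both are assumptions for illustration; the function names and sample lists are hypothetical.

```python
def concat(*lists):
    """Append the lists one after another, keeping repeats."""
    return [entry for lst in lists for entry in lst]

def union(*lists):
    """Keep each distinct entry once, in first-seen order (assumed)."""
    return list(dict.fromkeys(concat(*lists)))

a = ["red", "green"]
b = ["green", "blue"]

print(concat(a, b))  # ['red', 'green', 'green', 'blue']
print(union(a, b))   # ['red', 'green', 'blue']
```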

Combine/zip together multiple lists line by line with no separator:

wl merge zip --inputs list1.txt list2.txt --output output.txt

Combine/zip together multiple lists line by line with a separator:

wl merge zip --inputs list1.txt list2.txt --output output.txt --separator ":"

Combine every line of one list with every line of the other, with no separator:

wl merge cross --inputs list1.txt list2.txt --output output.txt

Combine every line of one list with every line of the other, with a separator:

wl merge cross --inputs list1.txt list2.txt --output output.txt --separator ":"

Combine every line of a single list with every line of the same list, with a separator:

wl merge cross --inputs single.txt --output output.txt --separator ":"
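The zip and cross merges can be pictured with this Python sketch. The function names and sample data are hypothetical, the output order of the cross merge is an assumption, and the sketch uses equal-length lists because the handling of lists with different lengths in a zip is not described here.

```python
from itertools import product

def zip_lists(lists, separator=""):
    """Pair up line 1 with line 1, line 2 with line 2, and so on."""
    return [separator.join(row) for row in zip(*lists)]

def cross_lists(lists, separator=""):
    """Pair every line with every line of the other list."""
    if len(lists) == 1:
        # a single input is crossed with itself
        lists = [lists[0], lists[0]]
    return [separator.join(row) for row in product(*lists)]

users = ["alice", "bob"]
pins = ["1", "2"]

print(zip_lists([users, pins], ":"))    # ['alice:1', 'bob:2']
print(cross_lists([users, pins], ":"))  # ['alice:1', 'alice:2', 'bob:1', 'bob:2']
print(cross_lists([["a", "b"]], ":"))   # ['a:a', 'a:b', 'b:a', 'b:b']
```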

Remove entries which can be found in other lists:

wl merge except --inputs list.txt except-these.txt and-these-*.txt --output output.txt
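In set terms this is a difference: entries of the first list survive only if none of the other lists contains them. A small Python sketch, with a hypothetical function name and made-up data, and assuming the order of the first list is preserved:

```python
def except_lists(first, *others):
    """Keep entries of `first` that appear in none of `others`."""
    excluded = set().union(*others)
    return [entry for entry in first if entry not in excluded]

print(except_lists(["a", "b", "c", "d"], ["b"], ["d", "x"]))  # ['a', 'c']
```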

List operations

Take first N entries:

wl list take list.txt output.txt --count 500

Take last N entries:

wl list take-last list.txt output.txt --count 500

Skip first N entries:

wl list skip list.txt output.txt --count 500

Skip last N entries:

wl list skip-last list.txt output.txt --count 500
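The four operations can be expressed over a stream of entries without loading the whole list, as the following Python sketch shows. This is one possible approach, not necessarily the tool's; the function names are hypothetical. Note that the "last N" variants need a buffer of N entries, because the end of the stream is not known in advance.

```python
from collections import deque
from itertools import islice

def take(entries, count):
    return islice(entries, count)

def skip(entries, count):
    return islice(entries, count, None)

def take_last(entries, count):
    # a bounded deque keeps only the newest `count` entries
    return iter(deque(entries, maxlen=count))

def skip_last(entries, count):
    # hold back `count` entries; release one only when a newer one arrives
    held = deque()
    for entry in entries:
        held.append(entry)
        if len(held) > count:
            yield held.popleft()

words = ["w1", "w2", "w3", "w4", "w5", "w6"]
print(list(take(words, 2)))       # ['w1', 'w2']
print(list(skip(words, 2)))       # ['w3', 'w4', 'w5', 'w6']
print(list(take_last(words, 2)))  # ['w5', 'w6']
print(list(skip_last(words, 2)))  # ['w1', 'w2', 'w3', 'w4']
```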

Split

Split a single list into multiple chunks by number of entries:

wl split entries list.txt output-{0}.txt --count 500

Split into chunks by number of bytes (but do not break any entries):

wl split bytes list.txt output-{0}.txt --bytes 1048576
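Splitting by bytes without breaking entries amounts to closing a chunk as soon as the next entry would push it over the limit. The Python sketch below shows one way to do that; the function name is hypothetical, and giving an oversized entry a chunk of its own is an assumption, since that case is not described here.

```python
def split_by_bytes(entries, max_bytes, line_ending=b"\n"):
    """Group byte-string entries into chunks of at most `max_bytes`."""
    chunk, size = [], 0
    for entry in entries:
        cost = len(entry) + len(line_ending)
        # start a new chunk if this entry would not fit in the current one
        if chunk and size + cost > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(entry)
        size += cost
    if chunk:
        yield chunk

entries = [b"aa", b"bbbb", b"c", b"dddddd"]
print(list(split_by_bytes(entries, max_bytes=8)))
# [[b'aa', b'bbbb'], [b'c'], [b'dddddd']]
```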

Split by length:

wl split length list.txt output-{0}.txt

Split by regular expression into matching and non-matching lists:

wl split regex list.txt output-{0}.txt --regex \d+

Extract

Extract words from files using a regular expression (\w+ by default):

wl extract regex --inputs books/*.txt --output output.txt

Using a custom regular expression:

wl extract regex --inputs books/*.txt --output output.txt --regex [a-z]+

Generate

Generate entries using a specific charset:

wl generate new output.txt --charset 0123456789 --min-length 1 --max-length 5
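Generating from a charset means enumerating every combination of its characters for each length in the range. A Python sketch of that idea follows; the function name is hypothetical and the output order (shortest first, then in charset order) is an assumption. It also shows how quickly the number of entries grows.

```python
from itertools import product

def generate(charset, min_length, max_length):
    """Yield every string over `charset` with a length in the given range."""
    for length in range(min_length, max_length + 1):
        for combo in product(charset, repeat=length):
            yield "".join(combo)

print(list(generate("ab", 1, 2)))  # ['a', 'b', 'aa', 'ab', 'ba', 'bb']

# 10 + 100 + 1000 + 10000 + 100000 entries for digits of length 1 to 5
print(sum(1 for _ in generate("0123456789", 1, 5)))  # 111110
```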

Configuration

Input bindings cardinality

Whenever possible, Wordlist Tool processes lists by streaming, which avoids locking resources and buffering whole lists in memory. This is why, in general, both an input and an output path must be provided:

wl transform trim input.txt output.txt

In some special cases where buffering is unavoidable (e.g. sorting, distinct), the same path can be used for both input and output:

wl sort entries file.txt
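Why some operations can stream while others must buffer can be seen in a small Python sketch. The function names are hypothetical and this is not the tool's code: a trim can emit each entry as soon as it is read, whereas a sort cannot emit anything until it has seen the last entry.

```python
def trim(entries):
    # streaming: each entry is emitted as soon as it is read
    for entry in entries:
        yield entry.strip()

def sort_entries(entries):
    # buffering: the smallest entry may be the last one read, so the
    # whole input has to be consumed before any output is produced
    return sorted(entries)

source = iter(["  pear ", " apple", "fig  "])
first = next(trim(source))  # only one entry has been read so far
remaining = list(source)    # the rest is still unread

print(first)      # pear
print(remaining)  # [' apple', 'fig  ']
print(sort_entries(["pear", "apple", "fig"]))  # ['apple', 'fig', 'pear']
```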

Some operations support working with multiple inputs (merge, extract). You can either specify multiple files by their path:

wl merge union --inputs file1.txt file2.txt --output out.txt

Or use glob patterns:

wl merge union --inputs first.txt file-*.txt last.txt --output out.txt

Note: patterns are evaluated individually, so a file matched by multiple patterns is included once per matching pattern.
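The effect of evaluating patterns one by one can be reproduced with Python's `glob` module. In this sketch the function name and file names are hypothetical, and sorting the matches of each pattern is an assumption made only to keep the output predictable.

```python
import glob
import os
import tempfile

def expand_inputs(patterns):
    """Expand each pattern on its own and append the matches in order."""
    paths = []
    for pattern in patterns:
        paths.extend(sorted(glob.glob(pattern)))
    return paths

with tempfile.TemporaryDirectory() as tmp:
    for name in ("file-1.txt", "file-2.txt"):
        open(os.path.join(tmp, name), "w").close()
    # file-1.txt is matched by both patterns
    patterns = [os.path.join(tmp, "file-*.txt"), os.path.join(tmp, "*-1.txt")]
    names = [os.path.basename(p) for p in expand_inputs(patterns)]

print(names)  # ['file-1.txt', 'file-2.txt', 'file-1.txt']
```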

Standard Input/Output bindings and chaining

You can use the reserved words IN and OUT to bind the input to standard input or the output to standard output:

wl transform lowercase file.txt OUT

This makes it possible to chain commands together:

wl transform lowercase file.txt OUT |
wl filter distinct IN OUT |
wl sort entries IN final.txt

Encoding

The encoding can be specified; the default is ASCII:

wl transform lowercase in.txt out.txt --input-encoding UTF-8 --output-encoding ASCII

You can use the identity transform to change encoding of a list:

wl transform identity in.txt out.txt --input-encoding UTF8 --output-encoding ASCII
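Converting between encodings is a decode followed by an encode, which the Python sketch below illustrates. The function name is hypothetical. How the tool treats characters that the output encoding cannot represent is not stated here; replacing them with `?` is an assumption made for the sketch.

```python
def convert_encoding(data, input_encoding, output_encoding):
    """Decode bytes with one encoding and re-encode them with another."""
    text = data.decode(input_encoding)
    # assumption: unrepresentable characters become '?'
    return text.encode(output_encoding, errors="replace")

utf8_bytes = "café".encode("utf-8")
print(convert_encoding(utf8_bytes, "utf-8", "ascii"))  # b'caf?'
```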

Line endings

Line endings can be specified in hexadecimal notation; the default is 0A (LF):

wl transform lowercase in.txt out.txt --input-line-ending 0D0A --output-line-ending 0A

You can use the identity transform to change line endings of a list:

wl transform identity in.txt out.txt --input-line-ending 0D0A --output-line-ending 0A
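The hexadecimal notation maps directly to bytes: 0D0A is carriage return plus line feed (CRLF) and 0A is a single line feed (LF). The Python sketch below decodes the notation and converts a small buffer; the function names are hypothetical and the whole input is held in memory for simplicity.

```python
def parse_line_ending(hex_text):
    """Turn a hex string such as '0D0A' into the bytes it denotes."""
    return bytes.fromhex(hex_text)

def convert_line_endings(data, input_hex, output_hex):
    old = parse_line_ending(input_hex)
    new = parse_line_ending(output_hex)
    # split on the old sequence and re-join with the new one
    return new.join(data.split(old))

print(parse_line_ending("0D0A"))                                 # b'\r\n'
print(convert_line_endings(b"one\r\ntwo\r\n", "0D0A", "0A"))     # b'one\ntwo\n'
```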

Buffering

Read and write buffer sizes can be specified as follows:

wl transform lowercase in.txt out.txt --input-buffer-size 4096 --output-buffer-size 16384