Generate lists of homophones

Homophones

A Rust command-line tool that builds a list of homophones from Wiktionary.

What's in this project

  • Some lists of homophones
  • A Rust command-line tool to generate your own list of homophones, based on a word list you supply.

Lists of homophones

If you're looking for lists of homophones, look in the homophone-lists directory.

Lists that are labeled "as pairs" contain pairs of homophones separated by commas. Example:

acts,ax
adds,ads
adds,adze

Lists labeled "as singles" are these same words, but with each on its own line, no commas:

acts
ax
adds
ads
adds
adze
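
As an illustration of how the two formats relate, here is a minimal Rust sketch that flattens "as pairs" text into "as singles" order. The function name pairs_to_singles is hypothetical and is not taken from this project's code.

```rust
/// Flatten "as pairs" text (one `word,word` pair per line) into
/// "as singles" order: every word on its own entry, duplicates kept.
fn pairs_to_singles(pairs: &str) -> Vec<String> {
    pairs
        .lines()
        .flat_map(|line| line.split(','))
        .map(str::trim)
        .filter(|word| !word.is_empty())
        .map(String::from)
        .collect()
}

fn main() {
    let pairs = "acts,ax\nadds,ads\nadds,adze";
    for word in pairs_to_singles(pairs) {
        println!("{word}");
    }
}
```

Note that "adds" appears twice in the result, just as it does in the example above, because no de-duplication happens at this stage.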

Any lists with "cleaned" in the file name have been cleaned in some way, likely trimmed of whitespace, de-duplicated, and sorted alphabetically. I'd recommend using cleaned files.
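
For a rough idea of what that kind of cleaning can look like, here is a small Rust sketch that trims, de-duplicates, and sorts a singles list. It is an assumption about what "cleaned" means, based only on the description above, and clean_list is a hypothetical name rather than a function from this project.

```rust
use std::collections::BTreeSet;

/// Trim whitespace, drop blank lines, de-duplicate, and sort.
/// A BTreeSet keeps its entries unique and in byte order, which matches
/// alphabetical order for lowercase ASCII words.
fn clean_list(raw: &str) -> Vec<String> {
    let unique: BTreeSet<&str> = raw
        .lines()
        .map(str::trim)
        .filter(|word| !word.is_empty())
        .collect();
    unique.into_iter().map(String::from).collect()
}

fn main() {
    let raw = "acts\nax \nadds\nads\nadds\n adze\n";
    for word in clean_list(raw) {
        println!("{word}");
    }
}
```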

Making your own list of homophones

To make your own list of homophones, you'll want to use the included command-line tool.

How it works

The command-line tool looks up each word from the input word list file(s) on Wiktionary and scrapes the page for any listed homophones.
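
The overall loop can be pictured with the sketch below. This is not the tool's actual code: the real lookup involves network requests and page scraping, which are replaced here by a hypothetical lookup closure with hard-coded sample data, and collect_pairs is an illustrative name.

```rust
/// For each input word, ask `lookup` for its homophones and emit one
/// `word,homophone` line per match. `lookup` stands in for the real
/// scraping step, which is not shown here.
fn collect_pairs<F>(words: &[&str], lookup: F) -> Vec<String>
where
    F: Fn(&str) -> Vec<String>,
{
    let mut pairs = Vec::new();
    for &word in words {
        for homophone in lookup(word) {
            pairs.push(format!("{word},{homophone}"));
        }
    }
    pairs
}

fn main() {
    // Hypothetical stand-in for the network lookup (sample data only).
    let fake_lookup = |word: &str| -> Vec<String> {
        match word {
            "acts" => vec!["ax".to_string()],
            "adds" => vec!["ads".to_string(), "adze".to_string()],
            _ => Vec::new(),
        }
    };
    for line in collect_pairs(&["acts", "adds", "zebra"], fake_lookup) {
        println!("{line}");
    }
}
```

Words with no homophones found (like "zebra" in this sample) simply contribute no lines to the output.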

Installing the command-line tool

  1. Install Rust if you haven't already
  2. Run: cargo install --git https://github.com/sts10/homophones --branch main

Usage

Usage: homophones [OPTIONS] <Inputted word lists>...

Arguments:
  <Inputted word lists>...  Word list input files. Can provide more than one (they'll be combined)

Options:
  -p, --pairs <PAIRS_OUTPUT>      Path for outputted file for list of PAIRS of homophones
  -s, --singles <SINGLES_OUTPUT>  Path for outputted file for list of SINGLE homophones
  -f, --force                     Force overwrite of output file(s), if it exists
  -h, --help                      Print help
  -V, --version                   Print version

Examples

Take words from a file called input_list.txt and write a list of homophones, one per line, to a new file called some_homophones_as_a_single_list.txt:

homophones -s some_homophones_as_a_single_list.txt input_list.txt

Take words from a file called input_list.txt and write a list of homophones, one pair per line, to a new file called some_homophones_as_pairs_list.txt:

homophones -p some_homophones_as_pairs_list.txt input_list.txt

Do both of those things at once:

homophones -s some_homophones_as_a_single_list.txt -p some_homophones_as_pairs_list.txt input_list.txt

To do

  • Make it async!

Licensing

This project scrapes homophones from Wiktionary. In an effort to comply with Wiktionary's Terms of Use, the lists and results generated by this project, including the files in the ./homophone-lists directory, are, like the text of Wiktionary itself, available under the Creative Commons Attribution-ShareAlike License.

The code of this project is available under the Blue Oak Model License.