Finds anagram groups in a list of words.

Primary language: JavaScript. License: MIT.

anagram-finder

This package implements an anagram finder which locates groups of anagrams in a word list.

Installation

anagram-finder can be installed as a package dependency and can be run on the command line.

npm install @nazrhyn/anagram-finder

Prerequisites

  • Operating System - This code should work on any operating system supported by Node.js.
  • Node.js 14+ - ECMAScript 11 (2020) features are used.
  • GitHub Packages Setup - This package is published to GitHub Packages under the @nazrhyn scope. See Installing a Package for instructions.

Because the package is published to GitHub Packages, a global installation needs the @nazrhyn scope mapped to the GitHub registry, either with a line in ~/.npmrc (@nazrhyn:registry = https://npm.pkg.github.com) or with a parameter to npm install (--@nazrhyn:registry=https://npm.pkg.github.com). All other packages will still be resolved by the normal NPM registry.

Usage

Require the package to use it as a library.

const finder = require('@nazrhyn/anagram-finder');

let words;

// Do something to load the word list.

const results = finder(words);
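
One way to fill in the loading step is sketched below. It assumes the word list is a UTF-8 text file with one word per line, as in the stdin example further down; the loadWords helper is hypothetical and not part of the package.

```javascript
const fs = require('fs');

// Hypothetical helper: reads a word list file into an array of words.
// Assumes one word per line; blank lines and surrounding whitespace are dropped.
function loadWords(filePath) {
  return fs
    .readFileSync(filePath, 'utf8')
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}
```

The returned array can then be passed to finder(words) as in the snippet above.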

Use the included executable to run it on the command line.

# Print usage information.
$ anagram-finder --help

# Use a word list from a file.
$ anagram-finder ./wordlist

# Read a word list from stdin.
$ curl --silent https://example.com/wordlist | anagram-finder

# Provide words manually from stdin.
$ anagram-finder
koas
adorer
oaks
roared
soak
<Ctrl-D> # Or the equivalent for your operating system.

# Run from package dependencies with NPX.
$ npx anagram-finder ./wordlist

API

The package exports a function, and the executable provides a CLI.

Package

function finder(words: String[]): String[][]

Words that are not strings or are empty strings will be silently ignored.

Parameter   Type         Description
words       String[]     A list of words in which to find anagrams.
Returns     String[][]   A list of anagram group lists containing all found anagrams, sorted by number of group items descending.
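
To make the contract above concrete, here is a minimal sketch of one common way to group anagrams: using each word's sorted letters as a lookup key. This is not the package's source. The findAnagramGroups name is hypothetical, and treating words case-sensitively and dropping words that have no anagram partner are assumptions.

```javascript
// Hypothetical stand-in for the exported function; not the package's actual code.
function findAnagramGroups(words) {
  const groups = new Map();

  for (const word of words) {
    // Mirror the documented contract: skip non-strings and empty strings.
    if (typeof word !== 'string' || word.length === 0) continue;

    // Anagrams contain the same letters, so the sorted letters act as a shared key.
    const key = [...word].sort().join('');
    const group = groups.get(key);
    if (group) group.push(word);
    else groups.set(key, [word]);
  }

  // Assumption: a "group" needs at least two words.
  return [...groups.values()]
    .filter((group) => group.length > 1)
    .sort((a, b) => b.length - a.length); // largest groups first
}

console.log(findAnagramGroups(['koas', 'adorer', 'oaks', 'roared', 'soak', 42, '']));
// → [ [ 'koas', 'oaks', 'soak' ], [ 'adorer', 'roared' ] ]
```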

CLI

[npx] anagram-finder [OPTIONS] [<file>]

The CLI can read the word list from <file> or from stdin.

Option        Description
--help, -h    Print usage information.
--stats, -s   Print statistics instead of the normal output.

The CLI provides some feedback via the exit code.

Exit Code   Description
0           Success.
1           Invalid argument configuration.
2           An unexpected error occurred.
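
A script that wraps the CLI can branch on these exit codes. The sketch below assumes the anagram-finder executable is on the PATH; the describeExit and runFinder helpers are hypothetical names, not part of the package.

```javascript
const { spawnSync } = require('child_process');

// Exit codes as documented in the table above.
const EXIT_MEANINGS = {
  0: 'Success.',
  1: 'Invalid argument configuration.',
  2: 'An unexpected error occurred.',
};

// Hypothetical helper: turns an exit status into a readable message.
function describeExit(status) {
  return EXIT_MEANINGS[status] ?? `Unknown exit code: ${status}`;
}

// Hypothetical helper: runs the CLI and reports how it exited.
// On Windows, npm-installed executables may need the { shell: true } option.
function runFinder(args) {
  const result = spawnSync('anagram-finder', args, { encoding: 'utf8' });
  if (result.error) throw result.error; // for example, executable not found
  return {
    status: result.status,
    meaning: describeExit(result.status),
    output: result.stdout,
  };
}
```

For example, runFinder(['--stats', './wordlist']) would return the statistics output alongside the exit status and its meaning.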

Modes

The CLI operates in one of two modes.

  • Normal - Prints all anagram groups, ordered descending by group word count.
    $ anagram-finder wordlist
    koas, oaks, okas, soak
    adorer, roared
    
  • Statistics - Prints statistics about the run and the anagram groups.
    $ anagram-finder --stats wordlist
    Anagram Group Count: 2 groups
    Largest Group:       4 (koas, ...)
    Longest Words:       6 (adorer, ...)
    Timings:
      Read Words:        0.001s
      Process Words:     0s
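
The group figures in the statistics output can be derived from the String[][] result alone. The following is a rough sketch of that derivation, not the CLI's actual code; the summarizeGroups name and the shape of the returned object are assumptions, and timings are left out.

```javascript
// Hypothetical helper: derives the group statistics from a list of anagram groups.
function summarizeGroups(groups) {
  let largestGroup = [];
  let longestWord = '';

  for (const group of groups) {
    // Strict comparisons keep the first group or word found on a tie.
    if (group.length > largestGroup.length) largestGroup = group;
    for (const word of group) {
      if (word.length > longestWord.length) longestWord = word;
    }
  }

  return {
    groupCount: groups.length,
    largestGroupSize: largestGroup.length,
    largestGroupExample: largestGroup[0],
    longestWordLength: longestWord.length,
    longestWordExample: longestWord,
  };
}

console.log(summarizeGroups([
  ['koas', 'oaks', 'okas', 'soak'],
  ['adorer', 'roared'],
]));
```

With the two groups from the Normal mode example, this yields a group count of 2, a largest group of 4 starting with koas, and a longest word length of 6 for adorer, in line with the sample output above.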
    

Tests

A test suite is provided and can be run using the NPM test command.

npm test

Two projects are configured:

  • test - Runs the unit tests.
  • lint - Runs the lint rules against code and test files.

Coverage collection is configured: a summary prints along with the test run output, and the results are written to the coverage/ folder.

Notes

  • Transpilation or just not using newer language features could make this work on more Node.js versions or on older browsers, but that is not one of the goals of this project.
  • Since there is no build step here, I haven't hooked this up to any Continuous Integration. During development, Jest's watch mode is sufficient.
  • A cool future change would be to switch the finder(...) function to accept an Iterable<String> (or, for streams, an AsyncIterable<String>) and wrap the stream reads in an async generator so that the whole list doesn't have to be read into memory before processing it. It wouldn't completely avoid loading the words into memory, though, as we still have to collect them for the desired output.
  • The CLI is not tested at this time as it would require restructuring to allow for that. Generally, I don't like changing code just to make it more testable when there's no mechanical or functional motivation for doing so. At just over 200 lines, it is not long enough for me to want to start splitting it up for organizational or structural reasons.
  • The use of snapshots in Jest is sometimes contentious. My philosophy is generally that it is okay if the desire is to verify a complex structure where the structure and values matter the most. Earlier tests could have also used snapshots, but in those cases, the assertions were more specific and easy enough to state without doing so.
  • Hiding the message from unexpected errors might be desirable in the CLI, but I decided to leave it exposed for transparency. Maybe the user wants to know what the problem was so that they can contribute a pull request!
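
The Iterable<String> idea from the notes above can be sketched as follows. This is a synchronous illustration only, not planned package code: linesOf and groupAnagrams are hypothetical names, and the grouping uses the sorted-letters key as an assumed technique. A stream-based variant would use an async function* producer and a for await...of loop in the consumer instead.

```javascript
// Hypothetical producer: lazily yields non-empty, trimmed lines from a text block.
function* linesOf(text) {
  let start = 0;
  while (start < text.length) {
    let end = text.indexOf('\n', start);
    if (end === -1) end = text.length;
    const line = text.slice(start, end).trim();
    if (line) yield line;
    start = end + 1;
  }
}

// Hypothetical consumer: accepts any iterable of words (array, Set, generator).
function groupAnagrams(words) {
  const groups = new Map();
  for (const word of words) {
    if (typeof word !== 'string' || word.length === 0) continue;
    const key = [...word].sort().join('');
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(word); // every word is still collected in memory
  }
  return [...groups.values()]
    .filter((group) => group.length > 1)
    .sort((a, b) => b.length - a.length);
}

console.log(groupAnagrams(linesOf('oaks\nsoak\n\nadorer\nroared\nkoas')));
// → [ [ 'oaks', 'soak', 'koas' ], [ 'adorer', 'roared' ] ]
```

As the note says, the Map still ends up holding every word, so this only avoids materializing the input list up front, not the memory needed for the output.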