Suffix trees are useful for efficient string searching of suffixes and substrings. They're often used in bioinformatics on genomes. One big advantage is that you can search for the same suffix across many strings in linear time. This is an optimized implementation using Ukkonen's algorithm and requires O(n) time and O(n) space to construct a tree for a string of length n.
You can can check whether a string is a substring of another, whether a string is a suffix of another (starting from any point), the number of occurrences of a substring, or you can find the longest repeated substring. You can also do these operations on a set of multiple strings in linear time.
var STree = require('@jayrbolton/suffix-tree')
var tree = STree.create('banana')
var suffixIndex = STree.findSuffix('ana') // -> 3
var substringIndex = STree.findSubstring('ana') // -> 1
var occurrences = STree.occurrences('ana') // -> 2
var longestRepeating = STree.longestRepeating() // -> 'ana'
You can view all tested strings in test/index.js
-- if you have a string that you want to include in the test suite, please open an issue or pr.
Initiialize a suffix tree. Returns a suffix tree object.
Optionally pass in an initial string to add to the tree
STree.create('banana')
Add another, separate string to the tree. This will result in a single, larger tree that combines all strings into one tree.
Mutates the given tree and returns it.
const t = STree.create('banana')
STree.add('plantain', t)
Return a formatted string for a tree to see how it looks. This creates an left indentation-based tree.
console.log(STree.foramt(tree))
Find a given suffix for any string in the tree. Will return an array of indexes of strings that have the suffix. Returns an empty array if the suffix is not found.
You can get the string from its index by using getStringByIndex
const ids = STree.findSuffix('ana', tree)
Return the original string based on its index
const t = STree.create('banana')
STree.add('plantain', t)
STree.getStringByIndex(0) // -> 'banana'
STree.getStringByIndex(1) // -> 'plantain'
Return an array of arrays of ALL suffixes for the entire tree. Traverses every path of the tree
Suffix trees have a lot of other efficient functions that can be written for them. See the Wikipedia page for a comprehensive list of these functions. They could either be added to this module or created in independent modules.
With npm installed, run
$ npm install @jayrbolton/suffix-tree
- Visualization of Ukkonen's algorithm
- Explanation of Ukkonen's algorithm
- Wikipedia - Ukkonen's algorithm
- Wikipedia - Suffix Trees
MIT