/trie

Trie (a.k.a. prefix tree) C# implementation. Has constant-time string prefix lookup.

Primary LanguageC#

Trie

Trie (a.k.a. prefix tree) is an ordered tree data structure that is used to store an associative array where the keys are usually strings. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string.
Reference: Wikipedia – trie

Advantages

  • Looking up keys is faster. Looking up a key of length key takes O(|key|) time
  • Looking up prefixes is faster. Looking up a prefix takes O(|prefix|) time
  • Removing takes O(|key|) time

Tutorial

Trie<TValue> implements IDictionary<string, TValue> interface.

Trie initialization:

var trie = new Trie<TValue>();

or using constructor which accepts IComparer<char> interface:

var trie = new Trie<TValue>(comparer);

To add items to trie:

trie.Add("key", value);
trie.Add(new TrieEntry<TValue>("key2", value));
trie.AddRange(trieEntries);

The Add, AddRange methods throw ArgumentException if a value with the specified key already exists, however setting the Item property overwrites the old value. In other words, Trie<TValue> has the same behavior as Dictionary<TKey, TValue>.

The main advantage of trie is really fast prefix lookup. To find all items of Trie<TValue> which have keys with given prefix use GetByPrefix method which returns IEnumerable<TrieEntry<TValue>>:

var result = trie.GetByPrefix("ABC");

Benchmark tests

For performance tests I have used vocabulary with 58110 English words with length from 2 to 22 chars. On my PC loading all words to trie takes: 140 ms.
In following table are presented results of searching items by prefix:

Prefixes count: Time, ms
LINQ – StartsWith Trie – GetByPrefix
30 338 57
60 669 100
90 1001 193
120 1342 265
150 1678 251
180 2023 336
210 2380 383
240 2724 408
270 2995 453
300 3405 521
1500 16473 2366
3000 33263 4162

Tests using brute-force prefix generator.

Prefix length: Prefixes count: Time, ms
LINQ – StartsWith Trie – GetByPrefix
1 26 307 167
2 676 7675 163
3 17576 197057 168
4 456976 267
5 11881376 4632
------ © Kirill Polishchuk