sysprog21/shecc

Optimize memory usage in trie by using radix tree

visitorckw opened this issue · 2 comments

The current Trie implementation allocates 128 entries for child nodes at each node, but in practice, C function names only consist of 63 distinct characters ('_', '0'-'9', 'a'-'z', 'A'-'Z'). This leads to inefficient memory usage, especially in large Trie structures where the underutilized space becomes more pronounced. To optimize memory usage, we can consider replacing the current Trie structure with a Radix Tree. A Radix Tree is a space-optimized version of a Trie that compresses nodes with a single child, thereby reducing the number of nodes and significantly lowering memory consumption. Additionally, this optimization could not only save memory but also potentially improve the performance of operations such as insertions and lookups by decreasing the depth of the tree.

How about trie-hard?

This issue is on my backlog, and I’ll find time to implement the optimization.

I took a brief look at trie-hard, and it seems to be used when a large number of misses are expected. However, it makes me wonder if our workload falls into this category as well?