This is the coolest thing ever!!!
bweir opened this issue · 6 comments
Thank you for making this!
wow. thanks!
(sorry, I'm not gonna fix this issue)
- filed as wontfix
This is probably the best place to ask instead of opening a new issue. I must know what tool/algorithm you used to create that convoluted beast of a RegEx! Please don't say you did this by hand :D
@pachacamac yes, that was built by hand :-)
But it's not as complicated as you might assume. At first I also googled for an algorithm that would do this for me, but then I realized that the principle is simple. Roughly speaking, it makes sense to "factor out" a shared first letter when four or more words start with it:
lambda|let|lock|long
l(ambda|et|ock|ong) <-- one byte saved!
If we only have three or fewer words, there is no point in factoring out:
global|goto|guard
g(lobal|oto|uard) // same length
do|double
d(o|ouble) // even bigger
This was the most typical case when making a decision. For the others, when in doubt, I just compared the lengths.
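The length comparison described above can be sketched in a few lines of Python. This is only an illustration, not the way the actual regex was produced (that was done by hand); the function name `factor_if_shorter` is made up for this example, and it assumes every word is non-empty and starts with the same letter.

```python
def factor_if_shorter(words):
    """Return the shorter of the plain alternation and the factored form.

    Hypothetical helper: assumes all words are non-empty and share
    the same first letter.
    """
    plain = "|".join(words)
    first = words[0][0]
    # Pull the shared first letter out in front of a group of the tails.
    factored = first + "(" + "|".join(w[1:] for w in words) + ")"
    # On a tie, keep the plain form because it is easier to read.
    return factored if len(factored) < len(plain) else plain


print(factor_if_shorter(["lambda", "let", "lock", "long"]))
print(factor_if_shorter(["global", "goto", "guard"]))
print(factor_if_shorter(["do", "double"]))
```

The arithmetic behind the "four or more" rule: factoring n words replaces n leading letters with one (saving n - 1 characters) and adds two parentheses, so the net saving is n - 3.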
I really wonder, though, whether that could be done automatically. Huffman encoding comes to mind, but right now I can't think of how to implement it or whether it's worth it. It's kind of interesting, though :D
EDIT: Almost. Not Huffman tree but https://en.wikipedia.org/wiki/Radix_tree
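For the curious, here is a rough sketch of how the tree idea could be automated in Python. It uses a plain character trie rather than a true radix tree (single-child chains get merged only when the pattern is written out), and the names `build_trie` and `trie_to_regex` are invented for this example. Unlike the hand-made approach, it always factors shared prefixes and does not check whether the result is actually shorter.

```python
import re


def build_trie(words):
    """Build a nested-dict character trie; the key "" marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return trie


def trie_to_regex(node):
    """Turn a trie node into a regex fragment matching all of its suffixes."""
    is_end = "" in node
    # One alternative per child character, skipping the end-of-word marker.
    branches = [
        re.escape(ch) + trie_to_regex(node[ch])
        for ch in sorted(node)
        if ch != ""
    ]
    if not branches:
        return ""
    if len(branches) == 1 and not is_end:
        # A single continuation needs no group.
        return branches[0]
    group = "(" + "|".join(branches) + ")"
    # If a word also ends here, the continuation is optional.
    return group + "?" if is_end else group


words = ["lambda", "let", "lock", "long", "do", "double"]
pattern = trie_to_regex(build_trie(words))
print(pattern)
print(all(re.fullmatch(pattern, w) for w in words))
```

To get closer to the hand-tuned result, each node could compare the factored fragment against the plain alternation of its words and keep whichever is shorter.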
Keeps being cool!
Thanks for not fixing it :)