The shortest yet efficient PrefixSpan implementation in Python, in only 20 lines in core part.
Just replace the variable db
with your own sequences, and variable minsup
with your own minimum support threshold.
Based on state-of-the-art PrefixSpan algorithm. Mining top-k patterns is also supported.
I strongly encourage using PyPy instead of CPython to run the script for best performance. In my own experience, it is 9x times faster in average.