TreeMiner uses the vertical scope-list representation to mine frequent trees in a forest [2005-treeminer:tkde]. TreeMiner counts all embeddings, whereas TreeMiner-D counts only distinct occurrences, which can be more appropriate for some datasets. Also included is PatternMatcher, an approach based on the horizontal data format.
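
To make the idea of a scope-list concrete, here is a minimal Python sketch. It is not taken from this repository. It assumes, following the description in the cited paper, that a tree is given as its labels in depth-first preorder with -1 marking a step back up to the parent, and that a node's scope is the interval from its own preorder position to the position of the rightmost node in its subtree. The function names `node_scopes`, `build_scope_lists`, and `is_ancestor` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def node_scopes(encoded):
    """Return (label, lo, hi) for every node, in preorder.

    `encoded` lists node labels in depth-first preorder; -1 means
    "move back up to the parent" (assumed encoding, see lead-in).
    """
    nodes = []   # [label, lo, hi]; lo is the node's preorder position
    stack = []   # positions of the nodes on the current root-to-node path
    for token in encoded:
        if token == -1:
            closed = stack.pop()
            # The last node seen so far is the rightmost node of this subtree.
            nodes[closed][2] = len(nodes) - 1
        else:
            stack.append(len(nodes))
            nodes.append([token, len(nodes), None])
    for still_open in stack:  # tolerate omitted trailing -1 tokens
        nodes[still_open][2] = len(nodes) - 1
    return [tuple(n) for n in nodes]


def build_scope_lists(forest):
    """Map each label to its (tree_id, lo, hi) entries across the forest."""
    scope_lists = defaultdict(list)
    for tree_id, encoded in sorted(forest.items()):
        for label, lo, hi in node_scopes(encoded):
            scope_lists[label].append((tree_id, lo, hi))
    return dict(scope_lists)


def is_ancestor(a, b):
    """True if entry a is a proper ancestor of entry b in the same tree."""
    (tree_a, lo_a, hi_a), (tree_b, lo_b, _) = a, b
    return tree_a == tree_b and lo_a < lo_b <= hi_a


# Tree 0: A has children B and C; B has children D and C. Tree 1: a single C.
forest = {
    0: ["A", "B", "D", -1, "C", -1, -1, "C", -1],
    1: ["C"],
}
lists = build_scope_lists(forest)
print(lists["C"])  # [(0, 3, 3), (0, 4, 4), (1, 0, 0)]
```

The point of the vertical layout in this sketch is that ancestor and descendant relationships can be checked from the stored intervals alone, without revisiting the original trees. For example, `is_ancestor(lists["B"][0], lists["C"][0])` holds, while the same test against `lists["C"][1]` does not.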
Relevant Publications
- [2005-treeminer:tkde] Mohammed J. Zaki. Efficiently mining frequent trees in a forest: algorithms and applications. IEEE Transactions on Knowledge and Data Engineering, 17(8):1021–1035, August 2005. Special issue on Mining Biological Data. doi:10.1109/TKDE.2005.125.
- [2002-treeminer] Mohammed J. Zaki. Efficiently mining frequent trees in a forest. In 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2002.
See also https://github.com/zakimjz/SLEUTH, which extends the TreeMiner methodology to mine all frequent embedded or induced as well as ordered or unordered tree patterns.
For the tree data generator, see https://github.com/zakimjz/TreeGen