TreeMiner

Algorithms for Mining Frequent Trees (in Tree Structured Datasets)

Primary language: C++. License: Apache License 2.0 (Apache-2.0).

TreeMiner Algorithm

TreeMiner uses the vertical scope-list representation to mine frequent trees [2005-treeminer:tkde]. TreeMiner counts all embeddings of a pattern, whereas TreeMiner-D counts only distinct occurrences, which can be more appropriate for some datasets. Also included is PatternMatcher, an approach based on the horizontal format.
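To make the scope idea concrete, here is a minimal C++ sketch. It is not taken from this repository's sources. It assumes a tree is given as a preorder label sequence in which -1 means "backtrack to the parent", and it assigns each node the scope [lower, upper], where lower is the node's preorder position and upper is the position of its rightmost descendant. The names ScopedNode, computeScopes and isAncestor are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// One tree node, identified by its preorder position (hypothetical type).
struct ScopedNode {
    int label;  // node label as it appears in the encoding
    int lower;  // preorder position of the node itself
    int upper;  // preorder position of its rightmost descendant
};

// Assumption: `encoding` lists labels in depth-first preorder, with -1
// marking a backtrack to the parent; labels are non-negative integers.
std::vector<ScopedNode> computeScopes(const std::vector<int>& encoding) {
    std::vector<ScopedNode> nodes;
    std::vector<std::size_t> open;  // nodes whose subtrees are still being read

    for (int symbol : encoding) {
        if (symbol == -1) {
            if (open.empty()) {
                throw std::invalid_argument("backtrack without an open node");
            }
            // Subtree finished: the last node added is its rightmost descendant.
            nodes[open.back()].upper = static_cast<int>(nodes.size()) - 1;
            open.pop_back();
        } else {
            const int position = static_cast<int>(nodes.size());
            nodes.push_back({symbol, position, position});
            open.push_back(nodes.size() - 1);
        }
    }

    // Trailing backtracks may be omitted, so close whatever is still open.
    while (!open.empty()) {
        nodes[open.back()].upper = static_cast<int>(nodes.size()) - 1;
        open.pop_back();
    }
    return nodes;
}

// With preorder numbering, `a` is a proper ancestor of `d` exactly when
// d's position falls inside a's scope and the two nodes differ.
bool isAncestor(const ScopedNode& a, const ScopedNode& d) {
    return a.lower < d.lower && d.lower <= a.upper;
}
```

For the encoding 0 1 3 -1 2 -1 -1 2 -1, the root gets scope [0, 4] and its first child gets scope [1, 3]. Ancestor tests then reduce to integer comparisons on scopes, which is what makes a vertical, per-label list of scopes convenient for counting occurrences without traversing the original trees.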

Relevant Publications

  • [2005-treeminer:tkde] Mohammed J. Zaki. Efficiently mining frequent trees in a forest: algorithms and applications. IEEE Transactions on Knowledge and Data Engineering, 17(8):1021–1035, August 2005. Special issue on Mining Biological Data. doi:10.1109/TKDE.2005.125.

  • [2002-treeminer] Mohammed J. Zaki. Efficiently mining frequent trees in a forest. In 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2002.

See also https://github.com/zakimjz/SLEUTH, which extends the TreeMiner methodology to mine all frequent embedded or induced, as well as ordered or unordered, tree patterns.

For the tree data generator, see https://github.com/zakimjz/TreeGen