
A tree matching algorithm for matching two trees, specialized for web page DOM trees.

Primary language: HTML. License: GNU General Public License v3.0 (GPL-3.0).

Tree Matcher

A tree matching library specialized in matching web pages. The library is based on the SFTM (Similarity-based Flexible Tree Matching) algorithm; see References.

Installation

The library is available on Maven Central: https://mvnrepository.com/artifact/io.github.amaris/sftm-tree-matching
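
For a Maven project, the dependency declaration would look roughly like the sketch below. The group and artifact IDs are taken from the URL above; the version is a placeholder, not a real release number, and should be replaced with the latest version listed on Maven Central.

```xml
<!-- Sketch of a pom.xml dependency entry. The version below is a placeholder:
     look up the current release on Maven Central before using it. -->
<dependency>
  <groupId>io.github.amaris</groupId>
  <artifactId>sftm-tree-matching</artifactId>
  <version>REPLACE_WITH_LATEST_VERSION</version>
</dependency>
```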

References

If you use this work for academic purposes, please cite the following paper:

Brisset, Sacha, et al. "SFTM: Fast Comparison of Web Documents using Similarity-based Flexible Tree Matching." arXiv preprint arXiv:2004.12821 (2020).