A tree matching library specialized in matching web pages. The library is based on the SFTM (Similarity-based Flexible Tree Matching) algorithm, described in the paper cited below.
The library is available on Maven Central: https://mvnrepository.com/artifact/io.github.amaris/sftm-tree-matching
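For a Maven project, the dependency declaration would look roughly like the sketch below. The group and artifact IDs are taken from the URL above. The version is not stated here, so `VERSION` is a placeholder to replace with a release listed on Maven Central.

```xml
<dependency>
  <groupId>io.github.amaris</groupId>
  <artifactId>sftm-tree-matching</artifactId>
  <!-- Placeholder: replace VERSION with a release listed on Maven Central -->
  <version>VERSION</version>
</dependency>
```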
If you use this work for academic purposes, please cite the following paper:
Brisset, Sacha, et al. "SFTM: Fast Comparison of Web Documents using Similarity-based Flexible Tree Matching." arXiv preprint arXiv:2004.12821 (2020).