The fundamental algorithm used for URL normalization, web page classification, and web information integration in web search engines. The idea for this algorithm comes from the paper "A Pattern Tree-based Approach to Learning URL Normalization Rules" (published at WWW), and I made several modifications to suit the actual application.
yymwater/url-pattern-algorithm
Java
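
To give a rough feel for what a "URL pattern" means in the description above, here is a minimal sketch. It is not the code in this repository and it is much simpler than the pattern tree described in the paper: it only splits each URL into key-value components and replaces any component whose value differs across the sample with a `*` wildcard. The class name `UrlPatternSketch`, the methods `decompose`, `generalize`, and `matches`, the key names (`path0`, `query:ref`, ...), and the `example.com` URLs are all hypothetical and chosen for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class UrlPatternSketch {

    /** Splits a URL into ordered key-value components (key names are illustrative). */
    static Map<String, String> decompose(String url) {
        Map<String, String> parts = new LinkedHashMap<>();
        URI uri = URI.create(url);
        parts.put("scheme", uri.getScheme());
        parts.put("host", uri.getHost());
        String path = uri.getPath() == null ? "" : uri.getPath();
        int index = 0;
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                parts.put("path" + (index++), segment);
            }
        }
        String query = uri.getQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                parts.put("query:" + key, value);
            }
        }
        return parts;
    }

    /** Keeps components shared by every URL and turns the varying ones into "*". */
    static Map<String, String> generalize(List<String> urls) {
        List<Map<String, String>> decomposed = new ArrayList<>();
        Map<String, Set<String>> valuesByKey = new LinkedHashMap<>();
        for (String url : urls) {
            Map<String, String> parts = decompose(url);
            decomposed.add(parts);
            for (String key : parts.keySet()) {
                valuesByKey.computeIfAbsent(key, k -> new LinkedHashSet<>());
            }
        }
        // A key missing from some URL contributes null, so it also counts as varying.
        for (Map<String, String> parts : decomposed) {
            for (Map.Entry<String, Set<String>> entry : valuesByKey.entrySet()) {
                entry.getValue().add(parts.get(entry.getKey()));
            }
        }
        Map<String, String> pattern = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : valuesByKey.entrySet()) {
            Set<String> values = entry.getValue();
            pattern.put(entry.getKey(), values.size() == 1 ? values.iterator().next() : "*");
        }
        return pattern;
    }

    /** True if the URL has the same keys as the pattern and agrees on every non-wildcard value. */
    static boolean matches(Map<String, String> pattern, String url) {
        Map<String, String> parts = decompose(url);
        if (!pattern.keySet().equals(parts.keySet())) {
            return false;
        }
        for (Map.Entry<String, String> entry : pattern.entrySet()) {
            boolean wildcard = "*".equals(entry.getValue());
            if (!wildcard && !Objects.equals(entry.getValue(), parts.get(entry.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> pattern = generalize(Arrays.asList(
                "http://example.com/item/1?ref=home",
                "http://example.com/item/2?ref=home"));
        // prints {scheme=http, host=example.com, path0=item, path1=*, query:ref=home}
        System.out.println(pattern);
        System.out.println(matches(pattern, "http://example.com/item/99?ref=home")); // prints true
        System.out.println(matches(pattern, "http://example.com/user/99?ref=home")); // prints false
    }
}
```

A real implementation would presumably need to decide which components to split on first and how to treat rare values, which this sketch deliberately leaves out.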