This library parses an HTML page into a Java object using XML-based rules.
<!-- https://mvnrepository.com/artifact/com.github.borsch/base-crawler -->
<dependency>
<groupId>com.github.borsch</groupId>
<artifactId>base-crawler</artifactId>
<version>2.0.0</version>
</dependency>
An example with a description can be found here, and a complete project can be found here.