boilerpipe-rs

A Rust port of the boilerpipe Java library, which removes boilerplate and extracts text content from HTML documents.

Primary language: HTML. License: MIT.

Boilerpipe

This is a Rust port of the Go port of the excellent Java library boilerpipe, which strips boilerplate and extracts text content from HTML documents.

This library implements only the Article Extractor and extracts only text content (no images, links, etc.).
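To make "text content only" concrete, here is a minimal standard-library sketch of reducing HTML to plain text. It is not this crate's API and not the Article Extractor algorithm: `strip_tags` is a hypothetical helper invented for illustration, and it does no boilerplate detection at all. It only shows the kind of output to expect, where link targets and images are dropped and the readable text survives.

```rust
/// Hypothetical helper (NOT part of boilerpipe-rs): reduce an HTML fragment
/// to plain text by discarding every tag and collapsing whitespace.
/// Assumption: a naive scanner is enough for illustration; it does not
/// handle comments, <script>/<style> bodies, or character entities.
fn strip_tags(html: &str) -> String {
    let mut text = String::with_capacity(html.len());
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => {
                in_tag = false;
                // Treat a tag boundary as a word break, e.g. "</p><p>".
                text.push(' ');
            }
            // Outside a tag, keep the character (including a stray '>').
            _ if !in_tag => text.push(ch),
            // Inside a tag: element names and attributes are discarded.
            _ => {}
        }
    }
    // Collapse runs of whitespace into single spaces and trim the ends.
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}
```

Calling `strip_tags` on `<p>Hello <a href="/x">world</a></p><img src="a.png">` keeps the words and loses the `href` and the image. Because every tag is treated as a word break, inline markup in the middle of a word would be split, which is one reason a real extractor works on a parsed document instead.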