/trafilatura

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Primary language: Python
License: GNU General Public License v3.0 (GPL-3.0)
