chatnoir-warc-dl

This pipeline extracts data from WARC files on a CPU cluster and streams it to a GPU server, where it is processed.
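The extract-then-stream pattern described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's implementation: the names `extract_records`, `batched_consumer`, and `run_pipeline` are hypothetical, threads stand in for CPU cluster workers, an in-memory queue stands in for the network stream, and plain strings stand in for data extracted from WARC records.

```python
import queue
import threading

# Unique marker a producer puts on the queue when its shard is exhausted.
SENTINEL = object()


def extract_records(shard, out_queue):
    """Hypothetical CPU-side worker: push each extracted item onto the shared queue."""
    for item in shard:
        out_queue.put(item)  # blocks when the queue is full (backpressure)
    out_queue.put(SENTINEL)


def batched_consumer(in_queue, num_producers, batch_size):
    """Hypothetical GPU-side loop: group streamed items into batches."""
    batches, batch, finished = [], [], 0
    while finished < num_producers:
        item = in_queue.get()
        if item is SENTINEL:
            finished += 1
            continue
        batch.append(item)
        if len(batch) == batch_size:
            batches.append(batch)
            batch = []
    if batch:  # keep the final, possibly smaller batch
        batches.append(batch)
    return batches


def run_pipeline(shards, batch_size=4):
    """Start one producer per shard and collect batches on the calling thread."""
    q = queue.Queue(maxsize=16)  # bounded, so producers cannot run far ahead
    workers = [
        threading.Thread(target=extract_records, args=(shard, q))
        for shard in shards
    ]
    for worker in workers:
        worker.start()
    batches = batched_consumer(q, len(shards), batch_size)
    for worker in workers:
        worker.join()
    return batches


if __name__ == "__main__":
    demo_shards = [["a1", "a2", "a3"], ["b1", "b2"]]
    for batch in run_pipeline(demo_shards, batch_size=2):
        print(batch)
```

The bounded queue is the main design point of the sketch: if the consumer is slower than the producers, the producers block instead of accumulating extracted data in memory. The order of items across shards depends on thread scheduling, so it is not deterministic.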

Primary language: Python

License: MIT
