NeuScraper

This is the code repo for our paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".

Primary language: Python
License: MIT
