attardi/wikiextractor

Issues with newer (2023) and older (2019) dumps

JohnTailor opened this issue · 0 comments

WikiExtractor failed on the new dumps (bz2 issues) and on older dumps (it would only extract 4 GB of text). Is this a known issue? Also, I cannot find the exact 2020 dump that is asked for. Is there a link where I can find it?