/wikidump-xml-clean

A program that filters Wikipedia XML dumps down to "clean" text. Written by Matt Mahoney, June 10, 2006. See http://mattmahoney.net/dc/textdata.html

Primary Language: Perl