# Wikidump cleaning script

A script that cleans Wikipedia dump data, taken from the bottom of the page at http://mattmahoney.net/dc/textdata.html.
klintan/wikidump-xml-clean
Program to filter Wikipedia XML dumps to "clean" text. Written by Matt Mahoney, June 10, 2006. Source: http://mattmahoney.net/dc/textdata.html
Perl