Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.
Primary LanguageJava