wikiextractor
There are 4 repositories under the wikiextractor topic.
CrisJk/Agriculture-KnowledgeGraph-Data
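Crawler and data-processing scripts for the Wikidata knowledge base; scripts for aligning relation triples to a corpus; scripts for obtaining knowledge graph data.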
shyamupa/wikidump_preprocessing
Extracting useful metadata from Wikipedia dumps in any language.
studerw/wiki-dump-parser
Java tool to parse Wikimedia dumps into Java Article POJOs for test or fake data.
TomerAberbach/wikipedia-ngrams
📚 A Kotlin project which extracts ngram counts from Wikipedia data dumps.