/JavaWikipediaMiner

Uses the jwikipediaMiner API

Primary language: Java

Mining Wikipedia-id using WikipediaMiner

This project aims to build a module for mining Wikipedia-id (the Indonesian-language Wikipedia) with the WikipediaMiner module.

The main project file is quite messy; it contains a bunch of things, such as:

  • Get a Wikipedia-id page and convert it into a clean txt file
  • Get all page IDs and titles and store them in a TSV file
  • Create a corpus from all pages
  • Get the infobox from a page
  • Get the list of pages in a certain category
  • Annotate terms for entities (Location, Organization) across all pages
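
As an illustration of the "page IDs and titles to TSV" task, here is a minimal sketch. It is not the project's actual code and does not call WikipediaMiner: the class `PageIndexTsv`, its methods, and the sample IDs are hypothetical, and the `Map` stands in for whatever page iterator the library provides.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of WikipediaMiner or this repository.
public class PageIndexTsv {

    // Tabs and line breaks inside a title would break the TSV layout,
    // so each run of them is replaced with a single space.
    static String sanitize(String title) {
        return title.replaceAll("[\\t\\r\\n]+", " ").trim();
    }

    // Writes a header row followed by one "id<TAB>title" row per page.
    static void write(Map<Integer, String> pages, Writer out) throws IOException {
        out.write("id\ttitle\n");
        for (Map.Entry<Integer, String> page : pages.entrySet()) {
            out.write(page.getKey() + "\t" + sanitize(page.getValue()) + "\n");
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data; real IDs and titles would come from the wiki dump.
        Map<Integer, String> pages = new LinkedHashMap<>();
        pages.put(1, "Indonesia");
        pages.put(2, "Jakarta");

        StringWriter out = new StringWriter();
        write(pages, out);
        System.out.print(out);
    }
}
```

Writing to a `Writer` rather than directly to a file keeps the sketch easy to test; in practice a `BufferedWriter` from `Files.newBufferedWriter` with UTF-8 encoding could be passed in instead.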

The project still produces errors in its results: for example, when you create a corpus from all pages, some unwanted terms or characters still remain.
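
One possible way to reduce such leftovers is a regex-based post-processing pass over the extracted text. The sketch below is only a heuristic and assumes the residue is wiki markup (templates, links, HTML tags, quote runs); the README does not say which characters actually remain. The class `CorpusCleaner` and its `clean` method are hypothetical names, not part of WikipediaMiner.

```java
// Hypothetical post-processing helper; the patterns are assumptions about
// what "unwanted terms or characters" might look like.
public class CorpusCleaner {

    static String clean(String text) {
        String s = text;
        // Drop non-nested templates such as {{cn}}.
        s = s.replaceAll("\\{\\{[^{}]*\\}\\}", "");
        // Keep the visible part of links: [[target|label]] -> label, [[target]] -> target.
        s = s.replaceAll("\\[\\[(?:[^\\[\\]|]*\\|)?([^\\[\\]|]*)\\]\\]", "$1");
        // Drop HTML-like tags such as <br/> or <ref>.
        s = s.replaceAll("<[^>]+>", "");
        // Drop bold/italic quote runs ('' and ''').
        s = s.replaceAll("'{2,}", "");
        // Collapse whitespace left behind by the removals.
        return s.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String raw = "'''Jakarta''' adalah [[ibu kota]] [[Indonesia|RI]].{{cn}} <br/>";
        System.out.println(clean(raw));
    }
}
```

Nested templates and tables are not handled by these patterns, so a pass like this would reduce the residue rather than eliminate it.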