Information-Retrieval-System

Information Retrieval In Assamese Using WordNet & Assamese Wikipedia

Work is published in:

International Journal of Scientific & Technology Research, Volume 8, Issue 09, September 2019: http://www.ijstr.org/final-print/sep2019/Information-Retrieval-In-Assamese-Using-Wordnet-Assamese-Wikipedia.pdf

Methodology:

The proposed IR system consists of the following phases:

1. Structuring the query.

  • Tokenization.
  • Removal of punctuation and numbers.
  • Removal of stop words.
  • Stemming.

2. Use of Assamese WordNet in query expansion.

3. Accessing information from Wikipedia.

  • Set Language.
  • Extract the Wikipedia page titles.
  • Extract the entire page.

This program requires additional resources: an "Assamese WordNet" file, a "stop words in Assamese" file, a "dictionary" file, and an "Assamese Stemmer" file, developed by the IT department of Gauhati University Institute of Science and Technology.
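
To show how these resources plug into phases 1 and 2, here is a minimal sketch of query structuring and expansion. It does not use the actual resource files, whose formats are not described here. STOP_WORDS, SUFFIXES and SYNONYMS are tiny hypothetical stand-ins for the stop-word file, the stemmer and the WordNet file, and the naive suffix stripping is a placeholder for the real Assamese stemmer, not a description of it.

```python
import unicodedata

# Hypothetical placeholder data; the real program loads these from the
# resource files listed above.
STOP_WORDS = {"আৰু", "এই"}
SUFFIXES = ("সমূহ", "বোৰ")   # two plural markers, for illustration only
SYNONYMS = {"নদী": ["নৈ"]}    # stand-in for WordNet synonym lookups


def tokenize(query):
    """Split the query on whitespace."""
    return query.split()


def strip_punct_and_numbers(tokens):
    """Drop punctuation and digit characters (Unicode categories P* and N*)."""
    cleaned = []
    for token in tokens:
        kept = "".join(
            ch for ch in token if unicodedata.category(ch)[0] not in ("P", "N")
        )
        if kept:  # skip tokens that were only punctuation or digits
            cleaned.append(kept)
    return cleaned


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Filter out tokens found in the stop-word list."""
    return [t for t in tokens if t not in stop_words]


def stem(token, suffixes=SUFFIXES):
    """Naive suffix stripping; a stand-in for the real stemmer."""
    for suffix in suffixes:
        if token.endswith(suffix) and len(token) > len(suffix):
            return token[: -len(suffix)]
    return token


def structure_query(query):
    """Phase 1: tokenize, clean, remove stop words, stem."""
    tokens = strip_punct_and_numbers(tokenize(query))
    return [stem(t) for t in remove_stop_words(tokens)]


def expand(terms, synonyms=SYNONYMS):
    """Phase 2: append synonyms of each term, keeping order, no duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term, *synonyms.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

For example, structure_query("এই নদীবোৰ ১২৩ আৰু পাহাৰ।") drops the two stop words, the number and the trailing danda, and strips the plural suffix, giving ["নদী", "পাহাৰ"]. Passing that list to expand adds the placeholder synonym and gives ["নদী", "নৈ", "পাহাৰ"].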