Information-Retrieval-System
Information Retrieval In Assamese Using WordNet & Assamese Wikipedia.
This work is published in:
INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH, VOLUME 8, ISSUE 09, SEPTEMBER 2019. Paper: http://www.ijstr.org/final-print/sep2019/Information-Retrieval-In-Assamese-Using-Wordnet-Assamese-Wikipedia.pdf
Methodology:
The proposed IR system consists of the following phases:
1. Structuring the query.
- Tokenization.
- Removal of punctuation and numbers.
- Removal of stop words.
- Stemming.
2. Query expansion using the Assamese WordNet.
3. Accessing information from Wikipedia.
- Set Language.
- Extract the Wikipedia page titles.
- Extract the entire page.
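
As a rough illustration of phase 3, the sketch below builds requests against the public MediaWiki API of the Assamese Wikipedia (language code "as") using only the Python standard library. It is not the code used in this repository: the helper names (api_url, search_url, extract_url, fetch_json, titles_from_search, text_from_extract) and the User-Agent string are hypothetical, and it assumes the standard "query" endpoints (list=search for titles, prop=extracts for plain page text) are an acceptable stand-in for whatever client the program actually uses.

```python
import json
import urllib.parse
import urllib.request

LANG = "as"  # "Set Language": Assamese Wikipedia edition


def api_url(lang, **params):
    """Build a MediaWiki API query URL for one language edition."""
    params = {"action": "query", "format": "json", **params}
    return f"https://{lang}.wikipedia.org/w/api.php?" + urllib.parse.urlencode(params)


def search_url(query, lang=LANG, limit=5):
    """URL that searches for pages matching the (expanded) query."""
    return api_url(lang, list="search", srsearch=query, srlimit=limit)


def extract_url(title, lang=LANG):
    """URL that returns the plain-text content of one page."""
    return api_url(lang, prop="extracts", explaintext=1, titles=title)


def fetch_json(url):
    """Perform the HTTP request (needs network access)."""
    req = urllib.request.Request(url, headers={"User-Agent": "ir-sketch/0.1"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def titles_from_search(data):
    """Extract the page titles from a list=search response."""
    return [hit["title"] for hit in data["query"]["search"]]


def text_from_extract(data):
    """Extract the page text from a prop=extracts response."""
    pages = data["query"]["pages"]
    return "\n".join(page.get("extract", "") for page in pages.values())
```

A typical call chain would be titles_from_search(fetch_json(search_url(query))) followed by text_from_extract(fetch_json(extract_url(title))) for each title of interest.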
This program requires additional resources: an "Assamese WordNet" file, a "stop words in Assamese" file, a "dictionary" file and an "Assamese Stemmer" file, which were developed by the IT department of Gauhati University Institute of Science and Technology.
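
To show how these resources could feed phases 1 and 2, here is a minimal sketch of query structuring and expansion. It makes several assumptions: the file formats are not described here, so the resources are passed in as in-memory Python collections; the suffix-stripping stem function is only a stand-in for the actual Assamese Stemmer; the dictionary is assumed to list words that should not be stemmed; and the WordNet is reduced to a word-to-synonyms mapping. All function names are hypothetical.

```python
import unicodedata


def tokenize(query):
    """Split the query on whitespace."""
    return query.split()


def strip_punct_and_numbers(tokens):
    """Drop punctuation and digit characters (Unicode categories P* and N*)."""
    cleaned = []
    for tok in tokens:
        kept = "".join(ch for ch in tok
                       if unicodedata.category(ch)[0] not in ("P", "N"))
        if kept:
            cleaned.append(kept)
    return cleaned


def remove_stop_words(tokens, stop_words):
    return [t for t in tokens if t not in stop_words]


def stem(token, suffixes, dictionary):
    """Stand-in stemmer: strip the longest matching suffix unless the
    token is already a dictionary word (an assumption, see above)."""
    if token in dictionary:
        return token
    for suf in sorted(suffixes, key=len, reverse=True):
        if token.endswith(suf) and len(token) > len(suf):
            return token[:-len(suf)]
    return token


def expand(tokens, wordnet):
    """Append WordNet synonyms after each term, without duplicates."""
    expanded = []
    for t in tokens:
        for term in [t, *wordnet.get(t, [])]:
            if term not in expanded:
                expanded.append(term)
    return expanded


def structure_query(query, stop_words, suffixes, dictionary, wordnet):
    tokens = strip_punct_and_numbers(tokenize(query))
    tokens = remove_stop_words(tokens, stop_words)
    stems = [stem(t, suffixes, dictionary) for t in tokens]
    return expand(stems, wordnet)
```

For example, with illustrative toy resources (not taken from the actual files), structure_query("অসমৰ নদী আৰু 2019।", {"আৰু"}, ["ৰ"], {"নদী"}, {"নদী": ["নৈ"]}) yields the terms অসম, নদী and নৈ.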