#Simple Information Retrieval System (SIRS)
The Simple Information Retrieval System is a product of the Data Science Group at the University of Notre Dame. The focus of this project is to provide an educational search engine system that emphasizes explanation over speed and efficiency.
More information will be made available as the system is developed.
#Components
As in any production-quality search engine, several components are needed to build a simple information retrieval system.
##Web Crawler
Web crawling is a necessary part of any search engine, but it is outside the scope of what SIRS explores. Nevertheless, a simple website crawler is available in the edu.nd.sirs.websitesearch package. The CrawlerProcess uses Crawler4j to download Web pages to a local folder on disk.
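
To make the "download to a local folder" step concrete, here is a minimal sketch of how a fetched page might be written to disk. It is not the code of CrawlerProcess and does not use Crawler4j. The class name PageStore, its methods, and the file-naming scheme are all hypothetical and use only the JDK. Fetching the page is left out, so the HTML is passed in as a string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of SIRS or Crawler4j.
final class PageStore {

    // Turn a URL into a file name that is safe on common file systems:
    // every character outside [A-Za-z0-9.-] becomes an underscore.
    static String fileNameFor(String url) {
        return url.replaceAll("[^A-Za-z0-9.-]", "_") + ".html";
    }

    // Write the page content into the folder, creating the folder if needed.
    static Path save(Path folder, String url, String html) throws IOException {
        Files.createDirectories(folder);
        Path target = folder.resolve(fileNameFor(url));
        Files.write(target, html.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        // A temporary folder stands in for the crawler's output folder.
        Path folder = Files.createTempDirectory("sirs-crawl");
        Path saved = save(folder, "http://www.nd.edu/", "<html><body>Hello</body></html>");
        System.out.println("Saved to " + saved);
    }
}
```

This naming scheme is deliberately simple: two different URLs can map to the same file name (for example, URLs that differ only in punctuation), so a real crawler would more likely name files by a document ID or a hash of the URL.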
##Document
##Parser
###Tokenizer
##Indexer
###Inverted Index
###Direct Index
##Query
##Retrieval Models
###Boolean Model
##Search Engine Web Application