Search-engine-TF-IDF

Offline search engine for any corpus. Uses TF-IDF scores for ranking.

Primary language: Python

Search Engine

Uses TF-IDF scores and dot products to compute the similarity between the user's query and each indexed document in the corpus.

Ranks documents by similarity score and displays the top K most similar ones.
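The indexing and ranking steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`tokenize`, `build_index`, `tfidf_vector`, `dot`, `search`), the whitespace tokenizer, and the plain `log(N / df)` IDF variant are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vector(tokens, idf):
    # Sparse vector mapping term -> (relative term frequency) * idf.
    # Terms that never occurred in the indexed corpus are dropped.
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def build_index(docs):
    # Index the corpus once: compute IDF weights and one vector per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumed IDF variant: log(N / df), with no smoothing.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    doc_vectors = [tfidf_vector(tokens, idf) for tokens in tokenized]
    return idf, doc_vectors


def dot(u, v):
    # Dot product of two sparse vectors stored as dicts.
    return sum(weight * v.get(term, 0.0) for term, weight in u.items())


def search(query, docs, idf, doc_vectors, k=3):
    # Score every document against the query and keep the top k matches.
    q = tfidf_vector(tokenize(query), idf)
    scores = [(dot(q, vec), i) for i, vec in enumerate(doc_vectors)]
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
    return [(docs[i], score) for score, i in ranked[:k] if score > 0]


# Hypothetical three-document corpus for illustration.
corpus = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
idf, vectors = build_index(corpus)
results = search("cat on a mat", corpus, idf, vectors, k=2)
for doc, score in results:
    print(f"{score:.4f}  {doc}")
```

Because the scores here are raw dot products rather than cosine similarities, longer or shorter documents are not normalized against each other; dividing by the vector norms would be one way to change that, if desired.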

The query can be any text that is expected to appear in the corpus.

The corpus can be any collection of documents.