This repository contains code for various Natural Language Processing tasks and the TF-IDFx algorithm, which I developed.
TF-IDF stands for Term Frequency-Inverse Document Frequency.
TF-IDF in short:
For a given query, the Term Frequency (TF) of each query term changes from document to document,
but the Inverse Document Frequency (IDF) stays the same (each term has one fixed IDF value across documents),
because it depends on the full corpus of documents.
Thus, to get the similarity of a query term to a certain document: [TF (for that document) * IDF (common to all docs)]
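
As a rough illustration of the TF * IDF idea described above, here is a minimal sketch of plain TF-IDF scoring in Python. It is not the TF-IDFx algorithm from this repository. The function names (`tokenize`, `idf`, `score`), the whitespace tokenizer, the smoothed IDF formula, and summing over query terms are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real code may need more.
    return text.lower().split()


def idf(term, docs_tokens):
    # IDF depends only on the corpus, so it is the same for every document.
    # Assumption: a smoothed variant, so unseen terms do not divide by zero.
    n_docs = len(docs_tokens)
    doc_freq = sum(1 for tokens in docs_tokens if term in tokens)
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1


def score(query, docs):
    # Returns one score per document: the sum of TF * IDF over the query terms.
    docs_tokens = [tokenize(d) for d in docs]
    query_terms = tokenize(query)
    idf_values = {t: idf(t, docs_tokens) for t in set(query_terms)}

    scores = []
    for tokens in docs_tokens:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        # TF is computed per document; IDF is looked up once per term.
        scores.append(sum((counts[t] / total) * idf_values[t] for t in query_terms))
    return scores


if __name__ == "__main__":
    docs = ["the cat sat on the mat", "the dog barked", "cats and dogs"]
    print(score("cat", docs))
```

Only the first document contains the exact token "cat", so it is the only one with a non-zero score; the IDF for "cat" is computed once and reused for every document.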
The TF-IDFx algorithm is a modified version of the TF-IDF algorithm, used to check the similarity of words or a query to a set of documents.