PlagiarismScoreTFIDF

  1. There are 6 documents in the data directory
  2. The objective is to find the plagiarism score of each document (1 - 5) with respect to the Query document
  3. First, we generate the TF-IDF values across all 6 documents
  4. Then we measure the cosine similarity between each document and the Query document separately
  5. The cosine similarity indicates how similar a given document is to the Query document
  6. This cosine similarity value is then reported as the document's plagiarism score
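
The steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not necessarily how this repository implements it: the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `plagiarism_scores`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vec = {}
        for term, count in counts.items():
            tf = count / total
            # Smoothed IDF (an assumed variant; other definitions exist).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def plagiarism_scores(documents, query):
    """Score each document against the query, with TF-IDF built over all of them."""
    vectors = tfidf_vectors(documents + [query])
    query_vec = vectors[-1]
    return [cosine_similarity(vec, query_vec) for vec in vectors[:-1]]
```

With this sketch, `plagiarism_scores(list_of_five_texts, query_text)` would return five values between 0 and 1, where a higher value means the document shares more heavily weighted terms with the Query document.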