/vector-space-model

This is Information Retrieval HW1

Primary language: Python

Introduction

  • This is Information Retrieval HW1
  • Uses TF-IDF weighting to compute the relation between the given queries and documents

Approach

  • Term frequency uses log normalization
  • Inverse document frequency uses the smoothed inverse frequency

      • Numerator: 1 + total number of documents
      • Denominator: 1 + number of documents that contain the word
      • The added one avoids division by zero
  • TF-IDF weights are computed for both documents and queries
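
The approach above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's actual code: the function names (`log_tf`, `smooth_idf`, `tfidf_vector`, `cosine`, `rank`) are hypothetical, the formulas `log(1 + count)` and `log((1 + N) / (1 + df))` with the natural logarithm are assumed readings of "log normalization" and "inverse frequency smooth", and cosine similarity is assumed as the measure of relation between a query and a document.

```python
import math
from collections import Counter


def log_tf(tokens):
    """Log-normalized term frequency (assumed form): log(1 + raw count)."""
    return {term: math.log(1 + count) for term, count in Counter(tokens).items()}


def smooth_idf(documents):
    """Smoothed IDF (assumed form): log((1 + N) / (1 + df)) per term."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in documents for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + count)) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """TF-IDF weights as a sparse dict; terms unseen in the collection are skipped."""
    return {term: tf * idf[term] for term, tf in log_tf(tokens).items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors (an assumed scoring choice)."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, documents):
    """Return (document index, score) pairs sorted from most to least related."""
    idf = smooth_idf(documents)
    query_vec = tfidf_vector(query, idf)
    scores = [(i, cosine(query_vec, tfidf_vector(doc, idf)))
              for i, doc in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Each document and query is passed in as a list of tokens, so tokenization is left to the caller. Note that under this assumed IDF form, a term that appears in every document gets a weight of zero and does not affect the ranking.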