
An implementation of the boolean model and the vector space model on the CACM collection


information-retrieval

An implementation of an information retrieval system on the CACM collection. We implemented the boolean model and the vector space model. Four similarity functions were implemented for the latter (dot product, cosine, Dice, Jaccard). In addition, a GUI was developed to test the vector space model on the query set provided with the CACM collection. The evaluation metrics were recall, average precision, and 11-point average precision. We also plot the precision-recall curve and the interpolated precision-recall curve. Feel free to check the report (in French) for more details about the implementation.
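As a rough illustration of the four similarity functions and the ranking metrics named above, here is a minimal sketch. It is not the code from the notebooks: the function names, the dict-based sparse vectors, and the toy data are all assumptions made for this example, and Dice and Jaccard are written in their usual weighted (extended) forms.

```python
import math

# Assumption: query and document are sparse vectors, dict of term -> weight.
def dot(q, d):
    """Inner product over the terms shared by both vectors."""
    return sum(w * d[t] for t, w in q.items() if t in d)

def cosine(q, d):
    denom = math.sqrt(dot(q, q)) * math.sqrt(dot(d, d))
    return dot(q, d) / denom if denom else 0.0

def dice(q, d):
    denom = dot(q, q) + dot(d, d)
    return 2 * dot(q, d) / denom if denom else 0.0

def jaccard(q, d):
    denom = dot(q, q) + dot(d, d) - dot(q, d)
    return dot(q, d) / denom if denom else 0.0

def precision_recall_points(ranking, relevant):
    """(recall, precision) measured at each relevant document in the ranking."""
    points, hits = [], 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return points

def average_precision(ranking, relevant):
    """Mean of the precision values at each relevant document retrieved."""
    if not relevant:
        return 0.0
    points = precision_recall_points(ranking, relevant)
    return sum(p for _, p in points) / len(relevant)

def eleven_point_average(ranking, relevant):
    """Mean interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    points = precision_recall_points(ranking, relevant)
    levels = [i / 10 for i in range(11)]
    # Interpolated precision: best precision at any recall >= the level.
    interpolated = [
        max((p for r, p in points if r >= level), default=0.0)
        for level in levels
    ]
    return sum(interpolated) / len(levels)

# Toy data, purely hypothetical (not taken from CACM).
query = {"retrieval": 1.0, "model": 1.0}
document = {"retrieval": 1.0, "vector": 1.0}
ranking = ["d1", "d2", "d3", "d4"]   # system output, best match first
relevant = {"d1", "d3"}              # relevance judgments for the query

scores = {f.__name__: f(query, document) for f in (dot, cosine, dice, jaccard)}
ap = average_precision(ranking, relevant)
ap11 = eleven_point_average(ranking, relevant)
```

The interpolated values computed inside `eleven_point_average` are the same points one would plot for an interpolated precision-recall curve, while `precision_recall_points` gives the raw curve.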
