Idioms-Detector

The "Idioms Extractor" project tackles the problem of tagging idioms in text using a database of over 9,878 idioms. It is built with Google Colab, Python 3, and spaCy. The methodology has two phases: the first tags idioms by exact matching against the database, and the second extracts phrases from the text and scores their similarity to known idioms. The project detected idioms accurately in many cases, but it has limitations: a high false positive rate, difficulty detecting verb phrases, and trouble with complex sentences. Future work includes expanding phrase detection and refining the code for better accuracy. The project highlights both the complexity and the potential of natural language processing for idiom detection.
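The two phases described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses only the Python standard library (`re` and `difflib`) in place of spaCy so that it runs anywhere, and the idiom list, function names, n-gram phrase extraction, and the `0.8` threshold are all assumptions made for the example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stand-in for the project's idiom database.
IDIOMS = ["kick the bucket", "break the ice", "spill the beans"]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def exact_matches(text, idioms=IDIOMS):
    """Phase 1: return idioms whose token sequence appears verbatim in the text."""
    tokens = tokenize(text)
    found = []
    for idiom in idioms:
        parts = tokenize(idiom)
        n = len(parts)
        if any(tokens[i:i + n] == parts for i in range(len(tokens) - n + 1)):
            found.append(idiom)
    return found


def candidate_phrases(text, min_len=2, max_len=5):
    """Phase 2, step 1: enumerate word n-grams as candidate phrases.

    Assumption: plain n-grams stand in for spaCy-based phrase extraction.
    """
    tokens = tokenize(text)
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def similar_matches(text, idioms=IDIOMS, threshold=0.8):
    """Phase 2, step 2: score candidates against idioms by string similarity.

    Returns {idiom: (score, phrase)} with the best-scoring phrase per idiom,
    keeping only scores at or above the (assumed) threshold.
    """
    best = {}
    for phrase in candidate_phrases(text):
        for idiom in idioms:
            score = SequenceMatcher(None, phrase, idiom).ratio()
            if score >= threshold and score > best.get(idiom, (0.0, ""))[0]:
                best[idiom] = (score, phrase)
    return best


if __name__ == "__main__":
    print(exact_matches("He tried to break the ice at the party."))
    print(similar_matches("She kicked the bucket."))
```

In this sketch, an inflected form such as "kicked the bucket" is missed by the exact pass and picked up only by the similarity pass. Lowering the threshold catches more variants but also admits more unrelated phrases, which is one plausible source of the false positives mentioned above.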

Refer to "idiomExtractor Report.pdf" for a detailed description of the approach and results.

Intuition

[Image: screenshot illustrating the intuition, taken 2024-03-29 at 5:31:03 PM]

Solution

[Image: screenshot illustrating the solution, taken 2024-03-29 at 5:28:19 PM]