PharmaSearch

Introduction

Built with Python 3.8, gensim 3.8.3, NLTK, and Flask.

A natural language search engine for pharmaceutical research data. We aim to retrieve relevant search results easily and efficiently using a word2vec encoder.

Dataset

Research articles from various journals, for example the International Journal of Pharmaceutical Sciences, The National Medical Journal of India, La Revue de Médecine Interne, MDPI journals, RPS Journal, and so on.

Approach

  • Create a shared vector space for the word2vec representations of articles and search phrases
  • Build a seq2seq model to summarize and encode research text documents
  • Find a way to map research paper vectors to search phrase vectors
  • Build the search engine from the three components above
  • Build a UI to house the search engine
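
The retrieval idea behind these steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's implementation: the three-dimensional `TOY_VECTORS` are hand-written stand-ins for trained word2vec embeddings, the seq2seq encoder is replaced by a simple average of word vectors, and the names `embed`, `cosine`, `search`, and `DOCUMENTS` are hypothetical.

```python
import math

# Assumption: hand-written toy vectors standing in for trained word2vec embeddings.
TOY_VECTORS = {
    "aspirin": [0.9, 0.1, 0.0],
    "pain": [0.8, 0.2, 0.1],
    "relief": [0.7, 0.3, 0.0],
    "insulin": [0.0, 0.9, 0.2],
    "diabetes": [0.1, 0.8, 0.3],
    "glucose": [0.0, 0.7, 0.4],
}


def embed(text):
    """Average the vectors of known words; return None if no word is known.

    Averaging is a stand-in for the seq2seq encoder named in the approach.
    """
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, documents, top_k=3):
    """Rank document titles by similarity between query and document vectors."""
    query_vec = embed(query)
    if query_vec is None:
        return []
    scored = []
    for title, text in documents.items():
        doc_vec = embed(text)
        if doc_vec is not None:
            scored.append((cosine(query_vec, doc_vec), title))
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_k]]


# Hypothetical two-document corpus.
DOCUMENTS = {
    "Analgesic study": "aspirin for pain relief",
    "Endocrinology review": "insulin and glucose in diabetes",
}

print(search("diabetes glucose", DOCUMENTS))
# ['Endocrinology review', 'Analgesic study']
```

Because queries and documents are embedded by the same function, they land in the same vector space and can be compared directly, which is what makes the ranking step possible.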

Technology Used

  • Python
  • NLTK
  • Gensim: Gensim is a library for building scalable word2vec and doc2vec models, which we need to create a shared vector space for both the input search strings and the documents fed to it. It also gives access to several pretrained word2vec models, which we use to cover general vocabulary in searches.

About Us

  • Omkar Prabhune
  • Prabhav Pandya
  • Pritesh Pawar
  • Pranav Tambaku
  • Vaidehi Patil