PharmaSearch: A Jupyter Notebook repository from OverPoweredDev

Introduction

A natural-language search engine for pharmaceutical research data. We aim to retrieve relevant research articles quickly and efficiently using a word2vec encoder.

Dataset

Research articles from various journals, for example the International Journal of Pharmaceutical Sciences, The National Medical Journal of India, La Revue de Médecine Interne, MDPI journals, and the RPS Journal.

Approach

1. Create a shared vector space for the word2vec representations of articles and search phrases
2. Build a seq2seq model to summarize and encode research documents
3. Find a way to map research paper vectors to search phrase vectors
4. Create the search engine using steps 1, 2 and 3
5. Build a UI to house the search engine
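
The idea behind the shared vector space and the ranking step can be sketched as follows. This is a minimal illustration, not the repository's implementation: the `TOY_VECTORS` table is a hypothetical stand-in for a trained word2vec model, averaging word vectors is a simple assumed substitute for the seq2seq encoder, and the function names and sample documents are made up for the example.

```python
import math
import re

# Hypothetical 3-dimensional word vectors standing in for a trained word2vec
# model (assumption: real embeddings would have hundreds of dimensions).
TOY_VECTORS = {
    "aspirin": [0.9, 0.1, 0.0],
    "ibuprofen": [0.8, 0.2, 0.1],
    "pain": [0.7, 0.3, 0.0],
    "insulin": [0.1, 0.9, 0.1],
    "diabetes": [0.0, 0.9, 0.2],
    "glucose": [0.1, 0.8, 0.3],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def embed(text, vectors=TOY_VECTORS):
    """Average the vectors of all known words; return None if none are known."""
    known = [vectors[tok] for tok in tokenize(text) if tok in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, documents, vectors=TOY_VECTORS):
    """Return document titles ordered from most to least similar to the query."""
    query_vec = embed(query, vectors)
    if query_vec is None:
        return []
    scored = []
    for title, text in documents.items():
        doc_vec = embed(text, vectors)
        if doc_vec is not None:
            scored.append((cosine(query_vec, doc_vec), title))
    scored.sort(reverse=True)
    return [title for _, title in scored]


# Made-up documents, used only to exercise the sketch.
DOCUMENTS = {
    "Analgesic study": "Aspirin and ibuprofen for pain relief",
    "Metabolic study": "Insulin response and glucose levels in diabetes",
}

if __name__ == "__main__":
    print(search("pain", DOCUMENTS))
```

Because queries and documents pass through the same `embed` function, they land in the same vector space and can be compared directly with cosine similarity. In a fuller version, the toy table would be replaced by vectors from a trained model and the averaging step by the learned document encoder.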

Technology Used

Python
gensim
NLTK
Gensim: a library for building scalable word2vec and doc2vec models, which we need to create a shared vector space for both the search phrases and the documents. It also provides access to several pretrained word2vec models (through its downloader module), which supply the general vocabulary for our search.

About Us

Omkar Prabhune
Prabhav Pandya
Pritesh Pawar
Pranav Tambaku
Vaidehi Patil
