/Asthma-Clustering

Using NLP techniques in Python to cluster scientific papers based on similar content

Primary language: Jupyter Notebook

Clustering asthma-related papers from the CORD-19 dataset

The idea behind this project is described in my Medium article: https://mpoikaterina.medium.com/exploring-areas-of-investigation-in-covid-19-and-asthma-e5458509efbc

I use Natural Language Processing (NLP) techniques in Python to explore research topics linking asthma and coronaviruses, both before the identification of SARS-CoV-2 and after the outbreak of the pandemic. The analysis clusters scientific publications to form groups of papers with similar topics. Comparing the clusters for "coronaviruses/asthma" with those for "SARS-CoV-2/asthma" gives an overview of the topics the research community focused on before and after the outbreak of the pandemic.
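
To make the idea of "clustering papers with similar topics" concrete, here is a minimal sketch. It is not the code from the notebooks. It assumes TF-IDF term weighting and a small k-means loop with cosine similarity, neither of which is confirmed by this README. The four titles are invented for illustration, and the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `kmeans`) are hypothetical. It uses only the standard library so it runs anywhere; a real analysis would more likely rely on a library such as scikit-learn.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"a", "an", "and", "in", "of", "the", "with", "for", "to", "on"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one L2-normalised {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency, so no weight is zero.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def kmeans(vectors, k, iterations=20):
    """Assign each vector to one of k clusters; returns a list of labels."""
    # Deterministic farthest-first initialisation: start from document 0,
    # then add the document least similar to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    labels = None
    for _ in range(iterations):
        new_labels = [max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                mean = {}
                for v in members:
                    for t, w in v.items():
                        mean[t] = mean.get(t, 0.0) + w / len(members)
                centroids[j] = mean
    return labels


# Invented example titles, not taken from CORD-19.
docs = [
    "Asthma exacerbation in children with human coronavirus infection",
    "Human coronavirus infection and asthma exacerbation in hospitalized children",
    "Inhaled corticosteroids and ACE2 expression in asthma patients",
    "ACE2 expression in asthma patients treated with inhaled corticosteroids",
]
labels = kmeans(tfidf_vectors(docs), k=2)
for label, title in zip(labels, docs):
    print(label, title)
```

In this toy run, the two titles about coronavirus infection in children share one label and the two about ACE2 expression share the other. The word "asthma" appears in every title, so it gets the lowest weight and has little influence on the grouping.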