We are on a journey to uncover hidden facts in the COVID-19 research literature. We are not sure what we will find, but we aim to discover what the existing research has to offer.
Ideas Tested:
- Sample Corpus Collected
- Cleaning Scripts Ready - Reusable and Reproducible
- Tf-Idf Summary
- LDA on unigrams + bigrams + trigrams via Coherence and PMI
- MALLET LDA on the same n-grams, using C_V and UMass coherence scores to find the ideal number of topics
- Applications on Whole Corpus
- Similar Paper Clustering
- Paper Searching Index
- Information Retrieval (IR)
- Research and Findings
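
To make the Tf-Idf Summary and Paper Searching Index ideas above more concrete, here is a minimal sketch in plain Python. It is not our actual pipeline: the three paper titles are made-up placeholders, the helper names (`tokenize`, `build_tfidf`, `top_terms`, `search`) are hypothetical, and the smoothed IDF formula is just one common choice. A real run would use the cleaned corpus and most likely a library implementation.

```python
import math
import re
from collections import Counter

# Placeholder titles for illustration only; not taken from the real corpus.
papers = [
    "Coronavirus transmission in hospital settings",
    "Vaccine trial results for respiratory virus",
    "Transmission dynamics of coronavirus in households",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def weight(tokens, idf):
    """Turn a token list into a sparse {term: tf * idf} vector."""
    known = [t for t in tokens if t in idf]
    counts = Counter(known)
    return {t: (c / len(known)) * idf[t] for t, c in counts.items()}

def build_tfidf(docs):
    """Return one sparse Tf-Idf vector per document plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    n_docs = len(tokenized)
    # Smoothed IDF (an assumption; other variants work too).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [weight(tokens, idf) for tokens in tokenized], idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_terms(vector, k=3):
    """Highest-weighted terms of one document: a crude Tf-Idf summary."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def search(query, vectors, idf, k=3):
    """Rank documents by cosine similarity to the query; drop zero scores."""
    query_vec = weight(tokenize(query), idf)
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k] if score > 0.0]

vectors, idf = build_tfidf(papers)

for index, score in search("vaccine trial", vectors, idf):
    print(index, round(score, 3), papers[index], top_terms(vectors[index]))
```

The same document vectors could also feed the Similar Paper Clustering step, since `cosine(vectors[i], vectors[j])` gives a pairwise similarity between papers.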