Case counts are rising every day, and bulk testing is not keeping pace. This inspired me to build a probabilistic analysis of whether a person is infected, based on the list of symptoms published by the World Health Organization (WHO). 👨🏻⚕️
The research paper can be found here: *Identification of COVID-19 can be quicker through artificial intelligence framework*
Attributes | Description |
---|---|
Age | Patient Age |
Gender | Patient Gender (Female=0/Male=1) |
Location | Patient is from COVID-19 affected area or not (No=0/Yes=1) |
Fever | Patient Fever (No=0/Yes=1) |
Dry Cough | Patient Dry Cough (No=0/Yes=1) |
Fatigue | Patient Fatigue (No=0/Yes=1) |
Pains | Patient Pains (No=0/Yes=1) |
Nasal Congestion | Patient Nasal Congestion (No=0/Yes=1) |
Problem in Breathing | Patient Breathing Problem (No=0/Yes=1) |
Sore Throat | Patient Sore Throat (No=0/Yes=1) |
Headache | Patient Headache (No=0/Yes=1) |
Vomiting | Patient Vomiting (No=0/Yes=1) |
Runny Nose | Patient Runny Nose (No=0/Yes=1) |
Diarrhea | Patient Diarrhea (No=0/Yes=1) |
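
As a rough illustration of how one patient record could be turned into the 14-value input row described by the table above, here is a minimal sketch. The snake_case key names, the `FEATURES` list, and the `encode_patient` helper are assumptions made for this example and are not taken from the notebook; only the attribute order and the 0/1 coding come from the table.

```python
# Column order follows the attribute table; the snake_case names are hypothetical.
FEATURES = [
    "age", "gender", "location", "fever", "dry_cough", "fatigue", "pains",
    "nasal_congestion", "problem_in_breathing", "sore_throat", "headache",
    "vomiting", "runny_nose", "diarrhea",
]


def encode_patient(record):
    """Return the 14 attribute values as a list, in table order.

    Assumption for this sketch: a missing binary attribute means 0 (No),
    while age must always be given.
    """
    if "age" not in record:
        raise ValueError("age is required")
    row = []
    for name in FEATURES:
        value = record.get(name, 0)
        # Every attribute except age is coded as No=0 / Yes=1.
        if name != "age" and value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {value!r}")
        row.append(value)
    return row


# Made-up example record, not real patient data.
patient = {"age": 45, "gender": 1, "location": 1, "fever": 1, "dry_cough": 1}
print(encode_patient(patient))
```

Validating the 0/1 coding up front keeps a mistyped value from silently reaching the model.
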
The model used for the probabilistic analysis of COVID-19 infection is Logistic Regression.
- Why is Logistic Regression preferred over other models?
No single algorithm is best in general, but this problem has a Yes/No output, which makes it a binary classification task, and Logistic Regression (LR) is a natural fit because it returns a probability for the positive class. In comparison, tree-based approaches were much slower and less accurate than LR.
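
To make the idea concrete, here is a minimal sketch of logistic regression for binary classification, written in plain Python with batch gradient descent. It is not the notebook's implementation (which may well use a library); the function names, the learning rate, the epoch count, and the tiny two-feature dataset are all assumptions chosen only to keep the example self-contained.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function: maps any real number into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    # Probability of the positive class (Yes=1) for one feature row.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit(X, y, lr=0.5, epochs=2000):
    # Batch gradient descent on the mean log-loss; lr and epochs are arbitrary.
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic toy data (NOT the project's dataset): two made-up binary features,
# where the label simply equals the first feature.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]

weights, bias = fit(X, y)
p_pos = predict_proba(weights, bias, [1, 0])
p_neg = predict_proba(weights, bias, [0, 1])
print(f"P(yes | [1, 0]) = {p_pos:.3f}, P(yes | [0, 1]) = {p_neg:.3f}")
```

On real data with the 14 attributes, a non-binary column such as Age would likely need scaling before gradient descent behaves well.
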
See the notebook for the detailed analysis.
Follow me on Twitter