Reproducible Paper - Diagnosis of patients with blood cell count for COVID-19: An explainable artificial intelligence approach
Open-source code released to increase reproducibility in academic and professional research.
- Title: Diagnosis of patients with blood cell count for COVID-19: An explainable artificial intelligence approach
- Access Link: here
- Journal: Journal of Health Informatics (JHI)
- ISSN: 2175-4411
- Journal Impact Factor (QUALIS): B5 for Computer Science and Engineering
- Kaike Wesley Reis (corresponding author): LinkedIn and Lattes
- Karla Patricia Oliveira-Esquerre: LinkedIn and Lattes
Disclaimer: The supplementary material in this repository contains additional analyses and discussion that do not appear in the paper. They were left out of the paper to keep its presented results focused and objective.
- AI model: Random Forest
- Overall parameters (including hyperparameters):
{'bootstrap': True, 'ccp_alpha': 0.0, 'class_weight': 'balanced_subsample', 'criterion': 'gini', 'max_depth': 21, 'max_features': 'sqrt', 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 1000, 'n_jobs': None, 'oob_score': False, 'random_state': 1206, 'verbose': 0, 'warm_start': True}
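
The parameter dictionary above can be passed back to scikit-learn to rebuild an unfitted model with the same configuration. The following is a minimal sketch, not the notebooks' actual code: the names `PAPER_PARAMS` and `build_model` are hypothetical, and it assumes the dictionary is a `get_params()` dump from the scikit-learn release listed in the package table. Keys that the installed release no longer accepts (such as `min_impurity_split`, which later releases dropped) are filtered out before construction.

```python
# Values copied from the parameter dictionary above.
PAPER_PARAMS = {
    "bootstrap": True,
    "ccp_alpha": 0.0,
    "class_weight": "balanced_subsample",
    "criterion": "gini",
    "max_depth": 21,
    "max_features": "sqrt",
    "max_leaf_nodes": None,
    "max_samples": None,
    "min_impurity_decrease": 0.0,
    "min_impurity_split": None,
    "min_samples_leaf": 1,
    "min_samples_split": 2,
    "min_weight_fraction_leaf": 0.0,
    "n_estimators": 1000,
    "n_jobs": None,
    "oob_score": False,
    "random_state": 1206,
    "verbose": 0,
    "warm_start": True,
}


def build_model():
    """Return an unfitted RandomForestClassifier configured with PAPER_PARAMS."""
    # Imported here so the dictionary can be reused without scikit-learn installed.
    from sklearn.ensemble import RandomForestClassifier

    # Keep only the keys that the installed scikit-learn release accepts.
    accepted = RandomForestClassifier().get_params()
    params = {key: value for key, value in PAPER_PARAMS.items() if key in accepted}
    return RandomForestClassifier(**params)
```

Calling `build_model().fit(X, y)` would then train the forest, where `X` and `y` stand for the feature matrix and labels produced by the pre-processing notebooks.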
- The `.ipynb` notebooks were used to develop the source material related to this paper.
- The numbers at the beginning of each notebook name represent the pipeline order:
  - `0` and `1` for Pre-processing
  - `2` for Model development
  - `3` for Results evaluation (model selection)
  - `4` for Qualitative analysis
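
Because the pipeline order is encoded in the leading number of each notebook name, the notebooks can be listed in execution order with a few lines of Python. This is only a sketch: the helper `notebooks_in_pipeline_order` is hypothetical, and it assumes every pipeline notebook sits in one folder and starts with its step number.

```python
import re
from pathlib import Path


def notebooks_in_pipeline_order(folder="."):
    """Return the .ipynb files in `folder`, sorted by their leading number."""
    numbered = []
    for path in Path(folder).glob("*.ipynb"):
        match = re.match(r"(\d+)", path.name)
        if match:  # skip notebooks whose name has no numeric prefix
            numbered.append((int(match.group(1)), path.name, path))
    # Sort numerically so that, for example, step 10 comes after step 4.
    return [path for _, _, path in sorted(numbered)]


if __name__ == "__main__":
    for notebook in notebooks_in_pipeline_order():
        print(notebook.name)
```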
- The original dataset can be found here at Kaggle's platform.
The `requirements.txt` file was generated outside a virtual environment, so it lists every package installed on the machine. For that reason, I list below the main packages used for this paper, with their versions:
Package | Version |
---|---|
numpy | 1.18.1 |
pandas | 1.0.5 |
missingno | 0.4.2 |
matplotlib | 3.1.3 |
seaborn | 0.10.0 |
scikit-learn | 0.23.1 |
skopt | 0.8.dev0 |
xgboost | 1.1.1 |
scipy | 1.4.1 |
joblib | 0.14.1 |
shap | 0.38.0 |
umap | 0.4.6 |
imbalanced-learn | 0.7.0 |
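
To check a local environment against the versions in the table above, the installed versions can be compared with the standard library's `importlib.metadata`. This is a sketch under assumptions: the names `EXPECTED` and `compare_versions` are hypothetical, and the keys are distribution names, so `skopt` is assumed to be installed as `scikit-optimize` and `umap` as `umap-learn`.

```python
from importlib import metadata

# Versions copied from the table above, keyed by assumed distribution name.
EXPECTED = {
    "numpy": "1.18.1",
    "pandas": "1.0.5",
    "missingno": "0.4.2",
    "matplotlib": "3.1.3",
    "seaborn": "0.10.0",
    "scikit-learn": "0.23.1",
    "scikit-optimize": "0.8.dev0",  # listed as skopt in the table
    "xgboost": "1.1.1",
    "scipy": "1.4.1",
    "joblib": "0.14.1",
    "shap": "0.38.0",
    "umap-learn": "0.4.6",  # listed as umap in the table
    "imbalanced-learn": "0.7.0",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent the notebooks from running, but matching versions are the safest way to reproduce the reported results.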