This is an open-source deep-learning classification model that aims to identify, with high accuracy, patients with COVID-19 viral infection, patients with non-COVID-19 infection, and patients with no infection by analyzing their chest x-ray scans. This project is part of the COVID-Net open-source initiative. It is a prototype model and is not yet intended for production use. You can reach out to me directly on LinkedIn if you have questions about this model or want help using it as an experimental tool for near real-time COVID-19 screening.
- The architecture was adapted from ShuffleNet v2
- Designed with efficiency in mind to allow near real-time screening with mobile devices
- Experimented with transfer learning and data augmentation techniques to improve generalizability and robustness
- This project is built using open-source software, where PyTorch was used as the main AI framework
- The training and testing datasets are drawn from the COVIDx dataset, which consists of chest x-ray images from 3 publicly available datasets.
- For model evaluation, I relied on the images listed in `test_COVIDx2.txt` as the blind test set.
- `data` directory contains 3 important scripts:
  - `create-db.py` is a slightly modified version of the original instructions that was used to get the main `train` and `test` data folders. Make sure you do this first.
  - `create-trainsets.py` is used to create the `trainset`, which includes `trn` and `val` subfolders with images arranged by class. There is code that can be used to get a balanced version of the data that you can later pass directly to an augmentation method.
  - `create-covidx2-testset.py` is used to get the `COVIDx2_test` dataset, which this and the other COVIDNet models use as a blind test set.
  - `train_split_v3.txt` contains the list of images used to create the `trainset`. The `test_split_v3.txt` was never used since `test_COVIDx2.txt` is a subset of it.
  - For more details on the COVIDx dataset and the original instructions, check out the COVID-Net repo.
- `models` directory contains the pretrained models.
- Project requirements can be found in `requirements.txt`.
- `train.py` contains the entire training pipeline, which includes dataloaders, preprocessing, augmentations, hyperparameters, the main training loop, and the model-saving mechanism.
- `test.py` contains a simple prediction script that reads a directory of images created by `create-covidx2-testset.py`.
- covidnet-cxr-shuffle-e18
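
To make the idea of a "balanced version of the data" more concrete, here is a minimal sketch of one common approach: undersampling every class down to the size of the smallest one. This is not the code from `create-trainsets.py`, which may balance the data differently (for example by oversampling). The function name `balance_by_undersampling`, the `(filename, label)` input format, and the toy filenames are all hypothetical and used only for illustration.

```python
import random
from collections import Counter, defaultdict


def balance_by_undersampling(samples, seed=0):
    """Return a class-balanced subset of (filename, label) pairs.

    Every class is randomly reduced to the size of the smallest class.
    A fixed seed keeps the selection reproducible between runs.
    """
    # Group filenames by their class label.
    by_class = defaultdict(list)
    for filename, label in samples:
        by_class[label].append(filename)
    if not by_class:
        return []

    # The smallest class sets the per-class sample count.
    target = min(len(files) for files in by_class.values())

    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_class):
        # rng.sample draws `target` distinct filenames without replacement.
        for filename in rng.sample(by_class[label], target):
            balanced.append((filename, label))
    return balanced


if __name__ == "__main__":
    # Hypothetical toy data; real filenames would come from the split files.
    toy = [
        ("n1.png", "normal"), ("n2.png", "normal"),
        ("n3.png", "normal"), ("n4.png", "normal"),
        ("p1.png", "pneumonia"), ("p2.png", "pneumonia"),
        ("p3.png", "pneumonia"),
        ("c1.png", "COVID-19"), ("c2.png", "COVID-19"),
    ]
    subset = balance_by_undersampling(toy)
    print(Counter(label for _, label in subset))
```

Undersampling discards images from the larger classes, so it trades data volume for balance; the resulting list can then be copied into per-class folders and passed to an augmentation step.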