This is an R package for training and applying a machine learning classifier that identifies series type (e.g. axial 3D T1, FLAIR) from DICOM header data.
- Make sure the devtools R package is installed.
  install.packages('devtools')
  library(devtools)
- (optional) Install via local git repo. At a shell prompt:
  git clone <repo_URL>
  Enter login/password when prompted. Then from within R:
  install_git('/path/to/repo')
- (optional) Install via devtools with a GitLab API token:
  - Log in to GitLab and get a personal access token by navigating to Settings > Access Tokens.
  - Make the token available to R by adding it to your .Renviron file.
    echo "GITLAB_PAT=<token>" >> ~/.Renviron
  - Restart R and install.
    install_gitlab('jcolby/dcmclass', host='<hostname>')
- (optional) Install via devtools with https:
  install_git('<repo_URL>', git='external')
  Enter login/password when prompted.
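For the GitLab token route above, it can help to confirm that R actually sees the token before calling install_gitlab(). The following is a minimal sketch; has_gitlab_pat() is a hypothetical helper written for illustration and is not part of dcmclass or devtools.

```r
# Hypothetical helper (not part of dcmclass or devtools): returns TRUE when
# the GITLAB_PAT environment variable is set to a non-empty value.
has_gitlab_pat <- function() {
  nzchar(Sys.getenv("GITLAB_PAT", unset = ""))
}

if (!has_gitlab_pat()) {
  message("GITLAB_PAT is not set; check ~/.Renviron and restart R.")
}
```

R reads ~/.Renviron only at startup, so a token added during a running session will not be visible until R is restarted.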
Load R package
library(dcmclass)
import_studies()
This function takes a list of accession numbers and uses the AIR API to download a single representative DICOM file from each series in each study.
Generate a gt_labels.csv file with ground truth manual labels for the training set. For each accession, it should contain the true series number corresponding to FLAIR, T1, T1CE, and T2. For example, it may look like:
AccessionNumber,flair,t1,t1ce,t2
11111111,700,5,11,600
11111112,400,5,9,8
11111113,600,7,9,8
11111114,400,5,10,8
11111115,400,5,10,8
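Before training, it may be worth checking that the labels file parses as expected. The sketch below uses only base R and the column names from the example above. The temporary file path and the specific checks are illustrative assumptions, not something the package requires.

```r
# Write two of the example rows to a temporary file so the sketch is
# self-contained; in practice csv_path would point at your gt_labels.csv.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "AccessionNumber,flair,t1,t1ce,t2",
  "11111111,700,5,11,600",
  "11111112,400,5,9,8"
), csv_path)

# Read accession numbers as character so any leading zeros are preserved;
# the series-number columns are left to be parsed as integers.
gt <- read.csv(csv_path, colClasses = c(AccessionNumber = "character"))

# Sanity checks: expected columns, one row per accession, no missing labels.
expected_cols <- c("AccessionNumber", "flair", "t1", "t1ce", "t2")
stopifnot(
  identical(names(gt), expected_cols),
  !anyDuplicated(gt$AccessionNumber),
  !anyNA(gt)
)
```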
train_model()
This function takes a training set (the DICOM directories generated above) and its ground truth labels (the gt_labels.csv file generated above), and trains a classifier to identify series type. The trained model can then be saved as a .Rdata file for future use.
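Saving and restoring a model as a .Rdata file can be sketched with base R's save() and load(). The object returned by train_model() is not shown in this document, so the example uses a small lm fit on the built-in cars dataset as a stand-in. The file location is also an assumption.

```r
# Stand-in for a fitted classifier; in practice this would be the object
# returned by train_model() (its structure is not shown here).
model <- lm(dist ~ speed, data = cars)

# Save the object under its variable name to an .Rdata file.
model_path <- file.path(tempdir(), "model.Rdata")
save(model, file = model_path)

# load() restores objects under their saved names and returns those names.
# Loading into a fresh environment avoids overwriting workspace objects.
env <- new.env()
loaded_names <- load(model_path, envir = env)
restored <- env[[loaded_names[1]]]
```

Because load() restores objects by name, loading straight into the global environment would silently replace any existing variable called model.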
Because the gt_labels.csv and model.Rdata files contain patient-derived and thus potentially identifiable information, we can't freely distribute them with this package. However, with these tools, you should be able to generate similar data for your needs.
predict_headers()
This function applies our pre-trained model to predict the series type of new/unknown cases.