Analyses related to EASY study at MSKCC
The initial implementations were done in MATLAB due to rapid prototyping demands. The functions are briefly described below.
Next to core MATLAB, the code requires the neuroelf toolbox (available at https://github.com/neuroelf/neuroelf-matlab).
This function will read in the annotation images, which must reside in a single folder and be named like this:
[STUDYTOKEN]_ISIC[NUMBER]_[ANNOTATOR]_[FEATURE].jpg
e.g.
EASYstudy1_ISIC0016092_BestAnnotator_Dots.jpg
The ISIC[NUMBER]
part is used to detect which image the annotation
is related to, and the [ANNOTATOR]
part is used to detect the
number of annotators per image, and finally the [FEATURE]
part is
used to group the data into features.
The first argument to the function is the folder in which easy_annotate
searches for the images. If the argument is not given, it defaults to
the folder in which the easy_annotate.m
file resides.
Next to the annotation (mask) images, the original ISIC images are
required as well, and also a studyfile (CSV) which defaults to
'Annotations/study1.csv'
. These files must be available in
the expected paths.
The study file contains information about the ISIC images included in that particular study. The information is given as one row per image, and the following fields, separated by comma, are present in the file:
[mongodb_ObjectId],ISIC_[NUMBER],[FEATURE_PLAIN],1
The ObjectId (e.g. 58b0a360d831137d0a388356
) allows a unique
addressing (and download) of the image from the ISIC archive.
The ISIC image number is contained in the ISIC_[NUMBER]
string,
allowing to store downloaded files with the correct filename. Please
note that in this format, an underscore separates the string
'ISIC'
from the actual number!
The feature name is given in plain English, and may contain spaces,
colons, slashes, etc., as in Structureless : Milky red areas
.
The trailing 1
is for future extensions.
The file used for the first EASY1 study is included in this repository!
In addition to the folder name (first argument), a second, optional, 1x1 struct argument can be provided that allows the user to override some of the default behavior:
The .colors
field is used for the main feature (which is given
in the study1.csv file for each image!). It must be at least 2 rows
of 1x3 RGB codes (0 to 255 values).
The .colorscale
field allows to scale the colors in a non-linear
way. The default is the expression log(1 + (0:24)) ./ log(25)
.
The .fullconf
field (either true or false, default true), tells
the function whether to consider any value in the mask > 0 as 100 per
cent confidence.
The .maxraters
field allows to override the number of maximum
raters per image (default: detected from all available annotation
mask images).
The .mcolors
field is another set of Cx3 colors (with at least
two additional colors necessasry). These colors will be used to color
the features besides the main features.
The .ofolder
field allows to set a separate folder into which all
output images will be written. The default is to use the same as the
input folder (first argument).
The function will write out montage images that contain (top row, left to right; bottom row, left to right):
- the original ISIC image
- the original ISIC image with the main (canonical) feature heatmap
- the original ISIC image with all features (mixed color heatmap)
- a heatmap color legend, with all levels of overlap, and further features
This function computes both the Dice and (smoothed-image) cross-correlation coefficients for a pair of images. In addition, it can return resampled images, which will speed up the computation for subsequent calls with the same images.
The syntax is:
[d, smcc, sim1, sim2] = easy_dice(im1, im2, sk, sim1, sim2)
whereas
- im1 and im2 are the original (mask) images
- sk is the smoothing kernel
- sim1 and sim2 are the resampled images (output from the function); for the first call, simply provide two empty arrays!
- d and smcc are the Dice and smoothed-image-cross-correlation values
This function uses easy_dice to compute confusion matrices, both using the Dice and SMCC measures, for pairs of images in the annotation dataset.
For now, the repository only contains a test-download function that demonstrates how to access the files from the ISIC archive in principle.
This can be executed in a terminal/console. You will be asked to enter a (valid) ISIC username (email) and password (which will not be shown in the console). It will then download the very first ISIC image and store it in a local file with the correct number.
The code can be used to write additional functions (or classes) that access the ISIC archive programmatically.
This notebook can be used to download the ISIC and annotation images
from one study. The notebook comes preconfigured with the
ISIC Annotation Study - All Features
study, ID
5a32cde91165975cf58a469c
.