Barspoon transformers are a transformer architecture for multi-label prediction tasks, designed for histopathological problems but easily adaptable to other domains. The architecture closely follows the transformer described in Attention Is All You Need, slightly adapted to predict many labels at once without loss of accuracy, even when those labels are potentially noisy. For more detailed information on the architecture, refer to the model's definition.
To install barspoon, run
pip install git+https://github.com/LocalToasty/barspoon-transformer
To properly leverage your GPU, you may need to manually install PyTorch as described on the PyTorch website.
In the following, we give examples of how to use barspoon for some common prediction tasks in histopathology. We assume our dataset consists of multiple patients, each of whom has zero or more histopathological slides assigned to them. For each patient, we have a series of target labels we want to train the network to predict.
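To check whether the installed PyTorch build can actually see a GPU, a quick sanity check along the following lines may help. This is a minimal sketch, not part of barspoon: the helper name `describe_torch_setup` is made up for illustration, and the check only relies on the standard `torch.cuda.is_available()` call.

```python
import importlib.util


def describe_torch_setup() -> str:
    """Summarize whether PyTorch is installed and whether it can use CUDA.

    Hypothetical helper for illustration; it is not provided by barspoon.
    """
    # Avoid an ImportError if PyTorch is missing altogether.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # Report the first visible CUDA device.
        return f"PyTorch {torch.__version__} with CUDA device: {torch.cuda.get_device_name(0)}"
    return f"PyTorch {torch.__version__} without CUDA support (CPU only)"


print(describe_torch_setup())
```

If this reports a CPU-only build on a machine with a GPU, reinstalling PyTorch with the matching CUDA version is the usual remedy.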
We initially need the following:
- A table containing clinical information, henceforth the clini table. This table has to be in either CSV or Excel format. It has to have at least one column, `patient`, which contains an ID identifying each patient, and other columns matching clinical information to that patient.
- Features extracted from each slide, generated using e.g. KatherLab's end-to-end feature extraction pipeline.
- A table matching each patient to their slides, the slide table. The slide table has two columns, `patient` and `filename`. The `patient` column has to contain the same patient IDs found in the clini table. The `filename` column contains the file paths to the features belonging to that patient. Each `filename` has to be unique, but one `patient` can be mapped to multiple `filename`s.
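The constraints on the two tables can be checked before training. The following is a minimal sketch using only the Python standard library; the function `check_tables`, the example patient IDs, and the `.h5` file names are assumptions made for illustration and are not part of barspoon.

```python
import csv
import io
from collections import Counter


def check_tables(clini_rows, slide_rows):
    """Return a list of problems found in the tables (empty if none).

    Both arguments are lists of dicts, e.g. as produced by csv.DictReader.
    Hypothetical helper for illustration; not part of barspoon.
    """
    problems = []
    if any("patient" not in row for row in clini_rows):
        problems.append("clini table lacks a 'patient' column")
    if any("patient" not in row or "filename" not in row for row in slide_rows):
        problems.append("slide table needs 'patient' and 'filename' columns")
    if problems:
        return problems

    # Each filename may appear only once in the slide table.
    counts = Counter(row["filename"] for row in slide_rows)
    for filename, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"filename {filename!r} appears {n} times")

    # Every patient in the slide table must be known to the clini table.
    # Patients without slides are fine, so the reverse is not checked.
    known = {row["patient"] for row in clini_rows}
    unknown = sorted({row["patient"] for row in slide_rows} - known)
    for patient in unknown:
        problems.append(f"slide table patient {patient!r} is not in the clini table")
    return problems


# Made-up example data: P1 has two slides, P2 has none, P3 is unknown.
clini = list(csv.DictReader(io.StringIO("patient,msi\nP1,MSI-H\nP2,MSS\n")))
slides = list(csv.DictReader(io.StringIO(
    "patient,filename\nP1,P1_a.h5\nP1,P1_b.h5\nP3,P3_a.h5\n"
)))
for problem in check_tables(clini, slides):
    print(problem)
```

For real data, the rows could be read from the CSV files with `csv.DictReader` in the same way; Excel tables would need a separate reader.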
barspoon-gen-target-file \
--clini-table path/to/clini.csv \
--category msi --category stage \
--quantize leucocyte-fraction 3 \
--output-file targets.toml
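The `--quantize leucocyte-fraction 3` option above suggests that a continuous column is split into three classes. How barspoon chooses the bin boundaries is not stated here, so the following is only a sketch of one plausible scheme, quantile-based binning, using the Python standard library. The function `quantize` and the example values are assumptions for illustration.

```python
import statistics
from bisect import bisect_right


def quantize(values, n_bins):
    """Assign each value to one of n_bins quantile-based bins (0-indexed).

    Illustrative only: this is an assumed binning scheme, not necessarily
    the one barspoon-gen-target-file applies.
    """
    # n_bins - 1 cut points that split the data into groups of
    # roughly equal size.
    cuts = statistics.quantiles(values, n=n_bins)
    # The bin index is the number of cut points at or below the value.
    return [bisect_right(cuts, v) for v in values]


# Made-up leucocyte fractions for six patients.
fractions = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
print(quantize(fractions, 3))
```

With quantile-based bins, each class receives a similar number of patients, which tends to keep the resulting classification targets balanced.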
barspoon-train \
--output-dir path/to/save/results/to \
--clini-table path/to/clini.csv \
--slide-table path/to/slide.csv \
--feature-dir dir/containing/features \
--target-file path/to/target.toml