Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation.
If you use or extend our work, please cite our paper at ***.
@misc{liu2024factual,
title={Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation},
author={Kang Liu and Zhuoqi Ma and Mengmeng Liu and Zhicheng Jiao and Xiaolu Kang and Qiguang Miao and Kun Xie},
year={2024},
eprint={2405.09586},
archivePrefix={arXiv},
primaryClass={eess.IV}
}
torch==2.1.2+cu118
transformers==4.23.1
torchvision==0.16.2+cu118
- RadGraph requires its own environment; please refer to knowledge_encoder/factual_serialization.py for the environment of the structural entities approach.
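As a quick sanity check, the pinned versions above can be compared against what is installed in the active environment. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `find_mismatches` are illustrative and not part of this repository.

```python
from importlib import metadata

# Pins copied from the requirements listed above.
PINS = {
    "torch": "2.1.2+cu118",
    "transformers": "4.23.1",
    "torchvision": "0.16.2+cu118",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to an (expected, found) tuple. `lookup` is injectable for testing."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINS).items():
        print(f"{package}: expected {expected}, found {found}")
```

An empty result means every pin matches. This check applies to the main environment only; the RadGraph environment is configured separately.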
You can download checkpoints of FSE as follows:
- For MIMIC-CXR, you can download checkpoints from here, and its code is MK13.
- For IU X-Ray, you can download checkpoints from here, and its code is MK13.
We use two datasets (IU X-Ray and MIMIC-CXR) in our paper.
- For IU X-Ray, you can download medical images from here.
- For MIMIC-CXR, you can download medical images from here.
NOTE: The IU X-Ray dataset is small, so the variance of the results is large. Some works use MIMIC-CXR only and treat the whole IU X-Ray dataset as an extra test set.
- Configure the RadGraph environment based on knowledge_encoder/factual_serialization.py. Basic setup (one-time activity):
a. Clone the DYGIE++ repository from here. This repository is managed by Wadden et al., authors of the paper Entity, Relation, and Event Extraction with Contextualized Span Representations.
git clone https://github.com/dwadden/dygiepp.git
b. Navigate to the root of the repo on your system and use the following commands to set up the conda environment:
conda create --name dygiepp python=3.7
conda activate dygiepp
cd dygiepp
pip install -r requirements.txt
conda develop . # Adds DyGIE to your PYTHONPATH
c. Activate the conda environment:
conda activate dygiepp
- Configure radgraph_model_path and ann_path in knowledge_encoder/see.py. The former can be downloaded from here, and the latter, annotation.json, can be obtained from here. Note that you can apply for access with your PhysioNet license.
- Set the local paths in config/finetune_config.yaml for images and checkpoints, such as mimic_cxr_image_dir and chexbert_model_checkpoint.
- Run knowledge_encoder/factual_serialization.py to extract the factual serialization for each sample.

Notably, chexbert.pth can be downloaded from here, distilbert-base-uncased from here, bert-base-uncased from here, radgraph from here, and scibert_scivocab_uncased from here.
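Before running the extraction, it can help to confirm that the configured locations actually exist on disk. The sketch below is a hypothetical pre-flight check, not part of this repository: the dictionary keys mirror the setting names mentioned above, while the path values are placeholders to be replaced with your own.

```python
from pathlib import Path

# Placeholder values (assumptions): replace with the paths you set in
# knowledge_encoder/see.py and config/finetune_config.yaml.
CONFIGURED_PATHS = {
    "radgraph_model_path": "/path/to/radgraph",
    "ann_path": "/path/to/annotation.json",
    "mimic_cxr_image_dir": "/path/to/mimic_cxr/images",
    "chexbert_model_checkpoint": "/path/to/chexbert.pth",
}


def missing_paths(paths):
    """Return the sorted names of entries whose path does not exist on disk."""
    return sorted(name for name, value in paths.items() if not Path(value).exists())


if __name__ == "__main__":
    missing = missing_paths(CONFIGURED_PATHS)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All configured paths exist.")
```

The check only tests for existence; it does not verify that a checkpoint is loadable or that an image directory has the expected layout.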
Run bash pretrain_mimic_cxr.sh to pretrain a model on the MIMIC-CXR data.
- Configure the --load argument in pretrain_inference_mimic_cxr.sh.
- Run bash pretrain_inference_mimic_cxr.sh to retrieve similar historical cases for each sample, forming mimic_cxr_annotation_sen_best_reports_keywords_20.json.
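To illustrate the general idea of retrieving similar historical cases, the sketch below ranks stored cases by cosine similarity to a query embedding and keeps the top k. This is a generic illustration under stated assumptions, not the retrieval procedure implemented in this repository: the embeddings, the similarity measure, and the function names `cosine_similarity` and `retrieve_similar` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(query, history, k):
    """Return the ids of the k historical cases most similar to `query`,
    most similar first. `history` maps a case id to its embedding."""
    ranked = sorted(
        history,
        key=lambda case_id: cosine_similarity(query, history[case_id]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy 2-D embeddings, purely illustrative.
    history = {"case_a": [1.0, 0.0], "case_b": [0.0, 1.0], "case_c": [1.0, 1.0]}
    print(retrieve_similar([1.0, 0.1], history, k=2))
```

In practice the query sample itself would typically be excluded from its own candidate set, and a batched matrix product would replace the Python loop for a dataset of this size.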
- Configure the --load argument in finetune_mimic_cxr.sh.
- Run bash finetune_mimic_cxr.sh to generate reports based on similar historical cases.
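Because report generation builds on the retrieved cases, it may be worth confirming that mimic_cxr_annotation_sen_best_reports_keywords_20.json parses cleanly before starting a long run. The sketch below makes no assumption about the file's schema; it only reports the top-level type and the size of each top-level entry. The helper name `summarize_json` is illustrative and not part of this repository.

```python
import json


def summarize_json(path):
    """Load a JSON file and describe its top-level structure.

    For a dict, report the length of each list/dict value (None otherwise).
    For a list, report its length.
    """
    with open(path, "r", encoding="utf-8") as handle:
        data = json.load(handle)
    summary = {"type": type(data).__name__}
    if isinstance(data, dict):
        summary["sizes"] = {
            key: len(value) if isinstance(value, (list, dict)) else None
            for key, value in data.items()
        }
    elif isinstance(data, list):
        summary["length"] = len(data)
    return summary


if __name__ == "__main__":
    # Adjust the path to wherever the retrieval step wrote its output.
    print(summarize_json("mimic_cxr_annotation_sen_best_reports_keywords_20.json"))
```

A `json.JSONDecodeError` here usually indicates a truncated or partially written file.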
- You must download the medical images, their corresponding reports (i.e., mimic_cxr_annotation_sen_best_reports_keywords_20.json), and checkpoints (i.e., finetune_model_best.pth) in Section Datasets and Section Checkpoints, respectively.
- Configure the --load and --mimic_cxr_ann_path arguments in test_mimic_cxr.sh.
- Run bash test_mimic_cxr.sh to generate reports based on similar historical cases.
- Results (i.e., FSE-5, $M_{gt}=100$) on MIMIC-CXR are presented as follows:
- R2Gen [1]: Some code is adapted from R2Gen.
- R2GenCMN [2]: Some code is adapted from R2GenCMN.
- MGCA [3]: Some code is adapted from MGCA.
[1] Chen, Z., Song, Y., Chang, T.H., Wan, X., 2020. Generating radiology reports via memory-driven transformer, in: EMNLP, pp. 1439–1449.
[2] Chen, Z., Shen, Y., Song, Y., Wan, X., 2021. Cross-modal memory networks for radiology report generation, in: ACL, pp. 5904–5914.
[3] Wang, F., Zhou, Y., Wang, S., Vardhanabhuti, V., Yu, L., 2022. Multigranularity cross-modal alignment for generalized medical visual representation learning, in: NeurIPS, pp. 33536–33549.