Reducing Burden

Data fusion with machine learning is essential for integrating multimodal data. We explored three multimodal fusion methods (early, late, and model fusion) for classifying radiological images and their associated text reports. We investigated how much labeled data multimodal models need to reach performance comparable to unimodal models, compared multimodal models against unimodal models, and compared the three fusion methods with one another. Our experiments showed that multimodal fusion can produce competitive results with less training data than its unimodal counterparts, an effect that was most noticeable with early fusion. Overall, our results suggest that data fusion techniques may decrease the annotation burden on physicians.
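To make the distinction between the fusion strategies concrete, here is a minimal sketch of early and late fusion in plain Python. It is an illustration only, not the code used in this repository: the function names, the `image_weight` parameter, and the choice of a weighted average for late fusion are all assumptions. Early fusion joins the per-modality feature vectors before a single classifier sees them, while late fusion combines the predictions of separately trained image and text classifiers.

```python
from typing import List, Sequence


def early_fusion(image_features: Sequence[float],
                 text_features: Sequence[float]) -> List[float]:
    """Early fusion (sketch): concatenate the image and text feature
    vectors so that one classifier is trained on the joint vector."""
    return list(image_features) + list(text_features)


def late_fusion(image_probs: Sequence[float],
                text_probs: Sequence[float],
                image_weight: float = 0.5) -> List[float]:
    """Late fusion (sketch): combine class probabilities produced by two
    separately trained unimodal classifiers.

    A weighted average is assumed here; other combination rules
    (voting, a learned meta-classifier) are equally possible.
    """
    if len(image_probs) != len(text_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    text_weight = 1.0 - image_weight
    return [image_weight * p + text_weight * q
            for p, q in zip(image_probs, text_probs)]


if __name__ == "__main__":
    # Hypothetical feature vectors and class probabilities.
    joint = early_fusion([0.1, 0.2], [0.3])
    fused = late_fusion([0.2, 0.8], [0.6, 0.4])
    print(joint)
    print(fused)
```

Model fusion, the third strategy, typically merges intermediate representations inside a single network and is not sketched here because it depends on the network architecture.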

Preliminary Results:

Model         N=1                                         N=2                                         N=4
              F1         Precision  Recall     AUC        F1         Precision  Recall     AUC        F1         Precision  Recall     AUC
Early Fusion  0.928      0.93       0.927      0.972      0.893      0.894      0.893      0.952      0.876      0.877      0.876      0.934
Image Only    0.637      0.64       0.636      0.674      0.617      0.619      0.617      0.629      0.616      0.622      0.615      0.639
Late Fusion   0.92       0.925      0.919      0.946      0.86       0.872      0.858      0.884      0.83       0.85       0.827      0.859
Model Fusion  0.88       0.905      0.878      0.982      0.86       0.882      0.858      0.961      0.844      0.869      0.841      0.938
Text Only     0.906      0.916      0.904      0.977      0.866      0.885      0.864      0.954      0.843      0.861      0.841      0.925
