Multi-OpenEA

Repository for the [ICASSP 2023] paper "Vision, Deduction and Alignment: An Empirical Study on Multi-Modal Knowledge Graph Alignment".

LODEME | PyTorch | ICASSP 2023

Entity alignment (EA) for knowledge graphs (KGs) plays a critical role in knowledge engineering. Existing EA methods mostly focus on utilizing graph structures and entity attributes (including literals), but ignore the images that are common in modern multi-modal KGs. In this study, we first constructed Multi-OpenEA, a set of eight large-scale, image-equipped EA benchmarks, and then evaluated several existing embedding-based methods for utilizing images. In view of the complementary nature of visual information and logical deduction, we further developed a new multi-modal EA method named LODEME that combines logical deduction with multi-modal KG embedding, achieving state-of-the-art performance on Multi-OpenEA and other existing multi-modal EA benchmarks.

πŸš€ Code for LODEME

The code is currently being organized and refined. Once it is ready, it will be made available in this repository. Thank you for your patience and understanding.

πŸ“š Dataset (Multi-OpenEA)

We proposed a generic construction process for multi-modal EA benchmarks and built new multi-modal EA benchmarks on top of the eight existing OpenEA benchmarks by adding multiple images to each entity.

Our Multi-OpenEA benchmarks vs. existing multi-modal EA benchmarks: ours are larger in scale (#Entity), have more entities associated with images (Coverage), and have more images per entity (Ratio).

| Benchmark | KG | #Entity | #Images | Ratio | Coverage | Similarity |
|---|---|---|---|---|---|---|
| FB15K-DB15K [chen2020mmea] | FB15K | 14,951 | 13,444 | 0.899 | 90.0% | - |
| | DB15K | 12,842 | 12,837 | 0.999 | 99.9% | |
| DBP-WD (norm) [liu2021visual] | DBP | 15,000 | 8,517 | 0.517 | 57.1% | - |
| | WD | 15,000 | 8,791 | 0.586 | 58.6% | |
| EN-FR-15K-V1 | EN15K(V1) | 15,000 | 44,657 | 2.977 | 99.7% | 0.757 |
| | FR15K(V1) | 15,000 | 42,286 | 2.819 | 94.5% | |
| EN-FR-15K-V2 | EN15K(V2) | 15,000 | 44,932 | 2.995 | 99.9% | 0.767 |
| | FR15K(V2) | 15,000 | 42,622 | 2.841 | 94.5% | |
| EN-FR-100K-V1 | EN100K(V1) | 100,000 | 296,934 | 2.969 | 99.6% | 0.751 |
| | FR100K(V1) | 100,000 | 280,288 | 2.803 | 94.1% | |
| EN-FR-100K-V2 | EN100K(V2) | 100,000 | 299,403 | 2.994 | 99.9% | 0.752 |
| | FR100K(V2) | 100,000 | 282,063 | 2.821 | 94.4% | |
| D-W-15K-V1 | DBP15K(V1) | 15,000 | 44,776 | 2.985 | 99.8% | 0.829 |
| | WD15K(V1) | 15,000 | 44,823 | 2.988 | 99.8% | |
| D-W-15K-V2 | DBP15K(V2) | 15,000 | 44,911 | 2.994 | 99.9% | 0.820 |
| | WD15K(V2) | 15,000 | 44,945 | 2.996 | 99.9% | |
| D-W-100K-V1 | DBP100K(V1) | 100,000 | 296,749 | 2.967 | 99.5% | 0.833 |
| | WD100K(V1) | 100,000 | 297,354 | 2.974 | 99.6% | |
| D-W-100K-V2 | DBP100K(V2) | 100,000 | 299,338 | 2.993 | 99.9% | 0.832 |
| | WD100K(V2) | 100,000 | 299,607 | 2.996 | 99.9% | |

❗NOTE: The organisation of the data is consistent with OpenEA Dataset v1.1, and the text portion can be downloaded directly from OpenEA. Download the CLIP-encoded image embeddings from Baidu Cloud Drive (pass code: tuds) and the raw images from Baidu Cloud Drive (pass code: aoo1).
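
A minimal sketch of how the downloaded CLIP image embeddings might be loaded and pooled into a single visual feature per entity is shown below. The file name `clip_image_emb.pkl`, the pickle format, and the dict-of-arrays layout are assumptions for illustration; check the downloaded archive for the actual structure.

```python
import pickle

import numpy as np
import torch

# Assumed layout: {entity URI: np.ndarray of shape (n_images, dim)}.
with open("clip_image_emb.pkl", "rb") as f:
    img_emb = pickle.load(f)

def entity_visual_feature(ent_uri: str, dim: int = 512) -> torch.Tensor:
    """Mean-pool the CLIP embeddings of an entity's images; zeros if it has none."""
    embs = img_emb.get(ent_uri)
    if embs is None or len(embs) == 0:
        return torch.zeros(dim)
    return torch.from_numpy(np.asarray(embs, dtype=np.float32)).mean(dim=0)
```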

🀝 Cite:

Please consider citing this paper if you use the code or data from our work. Thanks a lot :)

@inproceedings{li2023vision,
  title={Vision, Deduction and Alignment: An Empirical Study on Multi-Modal Knowledge Graph Alignment},
  author={Li, Yangning and Chen, Jiaoyan and Li, Yinghui and Xiang, Yuejia and Chen, Xi and Zheng, Hai-Tao},
  booktitle={ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={1--5},
  year={2023},
  organization={IEEE}
}

πŸ’‘ Acknowledgement

  • We appreciate OpenEA, PRASE, EVA, and many other related works for their open-source contributions.