CSProm-KG

Dipping PLMs Sauce: Bridging Structure and Text for Effective Knowledge Graph Completion via Conditional Soft Prompting

Overview of CSProm-KG ...

This repository contains the source code of the paper "Dipping PLMs Sauce: Bridging Structure and Text for Effective Knowledge Graph Completion via Conditional Soft Prompting", accepted to Findings of ACL 2023.
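As the paper describes, CSProm-KG keeps the PLM frozen and tunes only Conditional Soft Prompts generated from entity and relation representations. The following is a minimal, framework-free sketch of that idea; the shapes, names, and projection setup are illustrative assumptions, not the repository's actual code:

```python
def matvec(W, x):
    """Multiply an (m x n) matrix W, stored as a list of rows, by a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def conditional_soft_prompts(struct_embed, projections):
    """Generate one soft-prompt vector per projection matrix from a structural
    (entity/relation) embedding. In CSProm-KG, only the prompt parameters and
    structural embeddings are tuned; the PLM stays frozen, and the generated
    prompts condition its input."""
    return [matvec(W, struct_embed) for W in projections]

# Toy example: embed_dim = 2, PLM hidden size = 3, prompt_length = 2.
entity_embed = [1.0, 2.0]
projections = [
    [[1, 0], [0, 1], [1, 1]],   # projection for the first prompt position
    [[2, 0], [0, 2], [1, -1]],  # projection for the second prompt position
]
prompts = conditional_soft_prompts(entity_embed, projections)
# prompts -> [[1.0, 2.0, 3.0], [2.0, 4.0, -1.0]]
```

In the real model the prompts are produced per PLM layer and prepended to the frozen transformer's inputs, so the text encoder is steered by structural knowledge without fine-tuning its weights.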

Dependencies

  • Compatible with PyTorch 1.11.0+cu113 and Python 3.x.
  • Dependencies can be installed using requirements.txt.

Dataset:

  • We use the WN18RR, FB15k-237, ICEWS14, ICEWS05-15 and Wikidata5m datasets for knowledge graph link prediction.
  • The preprocessed WN18RR, FB15k-237, ICEWS14 and ICEWS05-15 are included in the ./data/processed/ directory; Wikidata5m is excluded due to its large size. The processed Wikidata5m can be found here. Alternatively, you can download the raw datasets into ./data/raw/ and run the corresponding scripts to generate the processed data. The raw data sources are collected and can be downloaded here.

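If you preprocess from scratch, the expected layout is roughly as follows. This is a sketch only; the preprocessing script name below is an assumption, so check the repository for the actual script:

```shell
# Create the raw and processed data directories used by the repository.
mkdir -p data/raw data/processed
# 1. Put the raw dataset files under data/raw/<dataset>/, e.g. data/raw/WN18RR/
# 2. Run the matching preprocessing script (hypothetical name):
#    python3 preprocess.py -dataset WN18RR
# 3. The processed files should then appear under data/processed/<dataset>/
ls -d data/raw data/processed
```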
Pretrained Checkpoint:

To enable quick evaluation, we provide the trained models. Download the checkpoint folders to ./checkpoint/ and run the evaluation command line for the corresponding dataset.

The results are:

| Dataset    | MRR      | H@1    | H@3    | H@10   |
|------------|----------|--------|--------|--------|
| WN18RR     | 0.572660 | 52.06% | 59.00% | 67.79% |
| FB15k-237  | 0.357701 | 26.90% | 39.07% | 53.55% |
| Wikidata5m | 0.379789 | 34.32% | 39.91% | 44.57% |
| ICEWS14    | 0.627971 | 54.74% | 67.73% | 77.30% |
| ICEWS05-15 | 0.626890 | 54.27% | 67.84% | 78.22% |
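For reference, MRR and Hits@k in the table above follow their standard link-prediction definitions over the (filtered) rank of each gold entity. A minimal sketch of how they are computed, not the repository's evaluation code:

```python
def mrr_and_hits(ranks, ks=(1, 3, 10)):
    """Compute MRR and Hits@k from a list of 1-based gold-entity ranks.

    MRR is the mean of 1/rank; Hits@k is the fraction of ranks <= k.
    """
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n
    hits = {k: sum(r <= k for r in ranks) / n for k in ks}
    return mrr, hits

# Example: three test triples whose gold entities were ranked 1, 2 and 12.
mrr, hits = mrr_and_hits([1, 2, 12])
# mrr -> 19/36 ~= 0.5278; hits -> {1: 1/3, 3: 2/3, 10: 2/3}
```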

Training and testing:

  • Install all the requirements from ./requirements.txt. PyTorch 1.11.0+cu113 can be installed with `pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113`.

  • Commands for reproducing the reported results:

    WN18RR
    python3 main.py -dataset WN18RR \
                    -batch_size 128 \
                    -pretrained_model bert-large-uncased \
                    -desc_max_length 40 \
                    -lr 5e-4 \
                    -prompt_length 10 \
                    -alpha 0.1 \
                    -n_lar 8 \
                    -label_smoothing 0.1 \
                    -embed_dim 144 \
                    -k_w 12 \
                    -k_h 12 \
                    -alpha_step 0.00001
    
    
    # evaluation commandline:
    python3 main.py -dataset WN18RR \
                    -batch_size 128 \
                    -pretrained_model bert-large-uncased \
                    -desc_max_length 40 \
                    -lr 5e-4 \
                    -prompt_length 10 \
                    -alpha 0.1 \
                    -n_lar 8 \
                    -label_smoothing 0.1 \
                    -embed_dim 144 \
                    -k_w 12 \
                    -k_h 12 \
                    -alpha_step 0.00001 \
                    -model_path path/to/trained/model
                    
    FB15k-237
    python3 main.py -dataset FB15k-237 \
                    -batch_size 128 \
                    -pretrained_model bert-base-uncased \
                    -epoch 60 \
                    -desc_max_length 40 \
                    -lr 5e-4 \
                    -prompt_length 10 \
                    -alpha 0.1 \
                    -n_lar 8 \
                    -label_smoothing 0.1 \
                    -embed_dim 156 \
                    -k_w 12 \
                    -k_h 13 \
                    -alpha_step 0.00001 
    
    # evaluation commandline:
    python3 main.py -dataset FB15k-237 \
                    -batch_size 128 \
                    -pretrained_model bert-base-uncased \
                    -desc_max_length 40 \
                    -lr 5e-4 \
                    -prompt_length 10 \
                    -alpha 0.1 \
                    -n_lar 8 \
                    -label_smoothing 0.1 \
                    -embed_dim 156 \
                    -k_w 12 \
                    -k_h 13 \
                    -alpha_step 0.00001 \
                    -model_path path/to/trained/model
    Wikidata5m
    python3 main.py -dataset wikidata5m_transductive \
                    -batch_size 450 \
                    -pretrained_model bert-base-uncased \
                    -epoch 20 \
                    -desc_max_length 40 \
                    -lr 1e-4 \
                    -prompt_length 5 \
                    -label_smoothing 0 \
                    -hid_drop 0.1 \
                    -hid_drop2 0.1 \
                    -feat_drop 0.1 \
                    -embed_dim 180 \
                    -k_w 10 \
                    -k_h 18 
    
    # evaluation commandline:
    python3 main.py -dataset wikidata5m_transductive \
                    -batch_size 450 \
                    -pretrained_model bert-base-uncased \
                    -desc_max_length 40 \
                    -lr 1e-4 \
                    -prompt_length 5 \
                    -label_smoothing 0 \
                    -hid_drop 0.1 \
                    -hid_drop2 0.1 \
                    -feat_drop 0.1 \
                    -embed_dim 180 \
                    -k_w 10 \
                    -k_h 18 \
                    -model_path path/to/trained/model
    ICEWS14
    python3 main.py -dataset ICEWS14 \
                    -batch_size 384 \
                    -pretrained_model bert-base-uncased \
                    -epoch 300 \
                    -desc_max_length 40 \
                    -lr 5e-4 \
                    -prompt_length 5 \
                    -alpha 0.1 \
                    -n_lar 8 \
                    -label_smoothing 0.1 \
                    -gamma 0 \
                    -embed_dim 144 \
                    -k_w 12 \
                    -k_h 12 
    
    
    # evaluation commandline:
    python3 main.py -dataset ICEWS14 \
                    -batch_size 384 \
                    -pretrained_model bert-base-uncased \
                    -desc_max_length 40 \
                    -lr 5e-4 \
                    -prompt_length 5 \
                    -alpha 0.1 \
                    -n_lar 8 \
                    -label_smoothing 0.1 \
                    -gamma 0 \
                    -embed_dim 144 \
                    -k_w 12 \
                    -k_h 12 \
                    -model_path path/to/trained/model
    ICEWS05-15
    python3 main.py -dataset ICEWS05-15 \
                    -batch_size 384 \
                    -pretrained_model bert-base-uncased \
                    -desc_max_length 40 \
                    -lr 1e-4 \
                    -prompt_length 5 \
                    -label_smoothing 0.1 \
                    -hid_drop 0.2 \
                    -hid_drop2 0.2 \
                    -feat_drop 0.2 \
                    -embed_dim 180 \
                    -k_w 10 \
                    -k_h 18
    
    
    
    # evaluation commandline:
    python3 main.py -dataset ICEWS05-15 \
                    -batch_size 384 \
                    -pretrained_model bert-base-uncased \
                    -desc_max_length 40 \
                    -lr 1e-4 \
                    -prompt_length 5 \
                    -label_smoothing 0.1 \
                    -hid_drop 0.2 \
                    -hid_drop2 0.2 \
                    -feat_drop 0.2 \
                    -embed_dim 180 \
                    -k_w 10 \
                    -k_h 18 \
                    -model_path path/to/trained/model

Citation

If you use our work or find it helpful, please cite:

@inproceedings{chen-etal-2023-dipping,
    title = "Dipping {PLM}s Sauce: Bridging Structure and Text for Effective Knowledge Graph Completion via Conditional Soft Prompting",
    author = "Chen, Chen  and
      Wang, Yufei  and
      Sun, Aixin  and
      Li, Bing  and
      Lam, Kwok-Yan",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-acl.729",
    pages = "11489--11503",
    abstract = "Knowledge Graph Completion (KGC) often requires both KG structural and textual information to be effective. Pre-trained Language Models (PLMs) have been used to learn the textual information, usually under the fine-tune paradigm for the KGC task. However, the fine-tuned PLMs often overwhelmingly focus on the textual information and overlook structural knowledge. To tackle this issue, this paper proposes CSProm-KG (Conditional Soft Prompts for KGC) which maintains a balance between structural information and textual knowledge. CSProm-KG only tunes the parameters of Conditional Soft Prompts that are generated by the entities and relations representations. We verify the effectiveness of CSProm-KG on three popular static KGC benchmarks WN18RR, FB15K-237 and Wikidata5M, and two temporal KGC benchmarks ICEWS14 and ICEWS05-15. CSProm-KG outperforms competitive baseline models and sets new state-of-the-art on these benchmarks. We conduct further analysis to show (i) the effectiveness of our proposed components, (ii) the efficiency of CSProm-KG, and (iii) the flexibility of CSProm-KG.",
}