zjunlp/Relphormer

About structure-enhanced self-attention

mqEpiphany opened this issue · 20 comments

I saw in your paper that an adjacency matrix generated from the center triple is used as structural information and added to self-attention. Your code is hard to follow and I can't find the corresponding code. Can you tell me where this part is implemented?

Hi, in this code version we directly put the structural information into the self-attention module as an adjacency matrix, for simplicity.
It is easy to add the structural encoder (an extra linear layer that encodes powers of the attention matrix).
The structural encoder is actually more like a hyper-parameter: whether to use this module depends on the actual knowledge graph you choose.
We will complete this part in a future version and also release the code for relation prediction.
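As a rough illustration of that optional structural encoder, here is a minimal sketch; the module name, shapes, and number of powers are assumptions, not the repo's code:

import torch
import torch.nn as nn

class StructuralEncoder(nn.Module):
    # Hypothetical sketch: learn a linear combination of powers of the
    # adjacency matrix A (A^1, ..., A^K) and use the result as an additive
    # bias on the attention scores.
    def __init__(self, num_powers=3):
        super().__init__()
        self.combine = nn.Linear(num_powers, 1, bias=False)
        self.num_powers = num_powers

    def forward(self, adj):
        # adj: (batch, seq_len, seq_len) adjacency of the contextual sub-graph
        powers = [adj]
        for _ in range(self.num_powers - 1):
            powers.append(torch.bmm(powers[-1], adj))
        stacked = torch.stack(powers, dim=-1)      # (batch, L, L, K)
        return self.combine(stacked).squeeze(-1)   # (batch, L, L) attention bias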

You can find the implementation in the BertSelfAttention module (around line 321) in huggingface_relformer.py.
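For intuition, the score computation in a BERT-style self-attention layer with such an additive structural mask looks roughly like this simplified sketch (the real module also handles multiple heads, dropout, and head masking):

import math
import torch

def structure_enhanced_attention(query, key, value, structural_mask):
    # query/key/value: (batch, heads, seq_len, head_dim)
    # structural_mask: (batch, 1, seq_len, seq_len), 0 for allowed pairs and a
    # large negative value (e.g. -10000.0) for pairs outside the sub-graph.
    scores = torch.matmul(query, key.transpose(-1, -2)) / math.sqrt(query.size(-1))
    scores = scores + structural_mask  # structural information enters here
    probs = torch.softmax(scores, dim=-1)
    return torch.matmul(probs, value)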

Hi, thank you for your patient reply. I also want to ask why I can't step into BertSelfAttention in huggingface_relformer.py when I single-step debug starting from the main function.

Did you first pre-train the model for initialization and then load the pre-trained model?
The hyper-parameters of the two stages are different, and the second stage imports huggingface_relformer.py.

When performing pre-training or entity prediction, some files need to be downloaded, but the download is very slow, so I downloaded them from https://huggingface.co/bert-base-uncased. Is that right? The following are the hyperparameter settings I used for entity prediction.
[screenshot of hyperparameter settings]

Yes, that is right. I think I have found the issue.
You should run "main.py" in the "Relphormer" directory instead of the file with the same name in the "pretrain" directory.

Sorry, I didn't express myself clearly. I just put the pre-trained model bert-base-uncased in the "pretrain" directory; I am running "main.py" in the "Relphormer" directory.

OK, can you check the program by putting a breakpoint on the line "from models.huggingface_relformer import BertForMaskedLM"?
It imports the "models.huggingface_relformer" file.

Sorry, I can only find the class BertForMaskedLM, but I can't find the line "from models.huggingface_relformer import BertForMaskedLM".

Can you directly import the huggingface_relformer.py file? We rewrote the BertForMaskedLM class in this file.

Do you mean to create a new file and try to import the huggingface_relformer.py file?

Yes, you can try doing that in the second training stage and it should also work.

When I single-step debug the main function down to trainer.fit(lit_model, datamodule=data), training starts, but I am still unable to step into huggingface_relformer.py.

You can find the line which imports the model class.

model_class = _import_class(f"models.{temp_args.model_class}")
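
For context, _import_class is a small helper that resolves the dotted name at runtime; it typically looks like the following (a sketch of the usual pattern, the repo's version may differ slightly):

import importlib

def _import_class(module_and_class_name):
    # Resolve a string such as "models.BertKGC" to the actual class object.
    module_name, class_name = module_and_class_name.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)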

The hyperparameter setting of args.model_class is BertKGC. Will the model in huggingface_relformer.py be called during the execution of BertKGC?

# from transformers.models.bert.modeling_bert import BertForMaskedLM
from models.huggingface_relformer import BertForMaskedLM

class BertKGC(BertForMaskedLM):

    @staticmethod
    def add_to_argparse(parser):
        parser.add_argument("--pretrain", type=int, default=0, help="")
        return parser

The model class BertKGC inherits from BertForMaskedLM in huggingface_relformer.py.
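If you want to confirm this at runtime, you can inspect the class hierarchy; for example (the import path of BertKGC below is an assumption, adjust it to wherever the class is defined in your checkout):

import inspect
from models import BertKGC  # hypothetical import path; adjust to your checkout

# The method resolution order shows that BertKGC's parent is the rewritten
# BertForMaskedLM from models.huggingface_relformer, not the transformers one.
print(BertKGC.__mro__)
print(inspect.getsourcefile(BertKGC.__mro__[1]))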

Where is the attention_mask used in the BertSelfAttention module (line 321) of huggingface_relformer.py actually built? I saw that the attention_mask first comes from the code in the screenshot below:
[screenshot of code]

Hi, we generate the attention mask for each center triple, and you can find it in the input of the module.

Can you tell me exactly where in the module's input it is built?

# Collect the entity/relation tokens of the masked head and tail contexts.
masked_head_seq = set()
masked_head_seq_id = set()
masked_tail_seq = set()
masked_tail_seq_id = set()
# Sample at most max_triplet neighbor triples around the (head, relation) and
# (tail, relation) pairs; these neighbors form the contextual sub-graph of the
# center triple.
masked_tail_graph_list = masked_tail_neighbor["\t".join([line[0], line[1]])] \
    if len(masked_tail_neighbor["\t".join([line[0], line[1]])]) < max_triplet \
    else random.sample(masked_tail_neighbor["\t".join([line[0], line[1]])], max_triplet)
masked_head_graph_list = masked_head_neighbor["\t".join([line[2], line[1]])] \
    if len(masked_head_neighbor["\t".join([line[2], line[1]])]) < max_triplet \
    else random.sample(masked_head_neighbor["\t".join([line[2], line[1]])], max_triplet)
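
To illustrate what "generating the attention mask for each center triple" from such sampled neighbors could look like, here is a simplified, hypothetical sketch (not the repo's exact code):

import torch

def build_subgraph_attention_mask(num_center_tokens, neighbor_spans, seq_len):
    # Hypothetical sketch: 1 means a position may attend to another position.
    # Center-triple tokens attend to (and are attended by) everything; tokens
    # inside one sampled neighbor triple only see each other and the center.
    mask = torch.zeros(seq_len, seq_len, dtype=torch.long)
    mask[:num_center_tokens, :] = 1
    mask[:, :num_center_tokens] = 1
    for start, end in neighbor_spans:
        mask[start:end, start:end] = 1
    return mask

# Example: a 3-token center triple plus two sampled neighbor triples.
mask = build_subgraph_attention_mask(3, [(3, 6), (6, 9)], seq_len=9)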