UKPLab/emnlp2017-relation-extraction

Evaluation problem, need help!

DeligientSloth opened this issue · 9 comments

I am confused about your code: empty_label=keras_models.p0_index and p0_index = 1.

Does P0 mean that there is no relationship between the entity pair?

A sample from your dataset:
{"kbID": "P0", "right": [15], "left": [9]}]

Yes, exactly!

Thanks for your reply! Furthermore, how do you construct the entity pairs that have no relationship? Do you randomly sample from entities that have relationships? Do those unrelated entity pairs actually exist in Wiki? Or do you just combine the entities in a sentence into pairs?

Just combine the entities in the sentence into pairs and check whether there is a relation between them in the KB. If not, label the pair as "no relation". Then randomly subsample the negative examples to balance the overall class distribution.
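The procedure described above can be sketched roughly as follows. This is only an illustration, not the repository's actual code: the function name `build_entity_pairs`, the `kb_relations` dictionary standing in for the KB, the use of ordered pairs, and the `negative_ratio` parameter are all assumptions, and the subsampling here is done per sentence rather than over the whole dataset.

```python
import itertools
import random


def build_entity_pairs(entities, kb_relations, negative_ratio=1.0, seed=0):
    """Label every ordered entity pair of one sentence and subsample negatives.

    entities: entity ids found in the sentence (hypothetical format).
    kb_relations: dict mapping (left, right) -> property id; a stand-in for the KB.
    negative_ratio: negatives kept per positive (assumed knob, not from the repo).
    """
    positives, negatives = [], []
    for left, right in itertools.permutations(entities, 2):
        kb_id = kb_relations.get((left, right))
        if kb_id is None:
            # No relation in the KB: label the pair as P0 ("no relation").
            negatives.append({"left": left, "right": right, "kbID": "P0"})
        else:
            positives.append({"left": left, "right": right, "kbID": kb_id})

    # Randomly keep only as many negatives as the ratio allows.
    rng = random.Random(seed)
    keep = min(len(negatives), int(round(negative_ratio * len(positives))))
    return positives + rng.sample(negatives, keep)
```

For a sentence with three entities and one known relation, this yields six candidate pairs, of which one positive and one sampled P0 pair are kept at the default ratio.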

Thanks a lot! Your open-source code helps me a lot!

I have another question. When I train the CNN model, an error occurs. My command is:
model_CNN train ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-training.02_06.json ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-validation.02_06.json

The error is shown below:
super(MaskedConvolution1D, self).__init__(**kwargs)
Traceback (most recent call last):
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1659, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes must be equal rank, but are 3 and 0 for 'masked_global_max_pooling1d_1/Select' (op: 'Select') with input shapes: [?,36,1], [?,36,256], [].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/EMNLP2017/relation_extraction/model_train.py", line 138, in <module>
model = getattr(keras_models, model_name)(keras_models.model_params, embedding_matrix, max_sent_len, n_out)
File "\EMNLP2017\relation_extraction\core\keras_models.py", line 183, in model_CNN
sentence_vector = MaskedGlobalMaxPooling1D()(x)
File "C:\Program\anaconda3\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "\EMNLP2017\relation_extraction\core\keras_models.py", line 544, in call
return K.max(tf.where(mask[:,:,np.newaxis], x, -np.inf ), axis = 1)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3204, in where
return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 8568, in select
"Select", condition=condition, t=x, e=y, name=name)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1823, in __init__
control_input_ops)
File "C:\Program\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1662, in _create_c_op
raise ValueError(str(e))
ValueError: Shapes must be equal rank, but are 3 and 0 for 'masked_global_max_pooling1d_1/Select' (op: 'Select') with input shapes: [?,36,1], [?,36,256], [].

Process finished with exit code 1
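The message says that the Select op received three inputs of different shapes ([?,36,1], [?,36,256] and a scalar) and did not broadcast them. As an illustration only, and not a fix confirmed anywhere in this thread, the masked max pooling that the failing line computes can be sketched in NumPy with all three arguments expanded to the same shape. The function name `masked_global_max_pool` is hypothetical; the analogous expansion in TensorFlow is an untested assumption.

```python
import numpy as np


def masked_global_max_pool(x, mask):
    """Max over the time axis, ignoring masked-out steps.

    x: float array of shape (batch, steps, channels).
    mask: boolean array of shape (batch, steps); True marks valid steps.
    """
    # Expand the mask to the full shape of x so that the condition,
    # the values and the fill array all have identical shapes.
    full_mask = np.broadcast_to(mask[:, :, np.newaxis], x.shape)
    fill = np.full_like(x, -np.inf)
    return np.where(full_mask, x, fill).max(axis=1)
```

With every operand at shape (batch, steps, channels), no implicit broadcasting is needed, which is what the error message appears to be asking for.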

I notice that some entities (left and right) consist of non-contiguous tokens, such as "left": [0, 1, 18, 19]. It seems that tokens 0 and 1 represent "The", "game", and tokens 18 and 19 represent "the", "game". They are the same entity, so "left" or "right" in the JSON file contains the indices of all occurrences of the same entity. Is that right?

An example:
"edgeSet": [{"left": [0, 1, 18, 19], "right": [27], "kbID": "P279"}], "tokens": ["The", "game", "has", "a", "unique", "engine", "in", "which", "the", "protagonist", ",", "Lester", ",", "is", "easily", "frightened", "early", "in", "the", "game", "and", "will", "act", "reluctantly", "when", "faced", "with", "animal", "s", ",", "height", "s", ",", "etc", "."]}

Hi,

Yes, the token ids may include multiple occurrences of the same entity.
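If the individual occurrences are needed, one way to recover them is to group the indices into runs of consecutive positions. This is a minimal sketch under the assumption that separate mentions are never directly adjacent; `split_mentions` is a hypothetical helper, not part of the repository. The tokens are the first 20 from the example above.

```python
def split_mentions(token_ids):
    """Group token indices into runs of consecutive positions."""
    mentions = []
    for idx in sorted(token_ids):
        if mentions and idx == mentions[-1][-1] + 1:
            # Index continues the current run.
            mentions[-1].append(idx)
        else:
            # Gap found: start a new mention.
            mentions.append([idx])
    return mentions


tokens = ["The", "game", "has", "a", "unique", "engine", "in", "which",
          "the", "protagonist", ",", "Lester", ",", "is", "easily",
          "frightened", "early", "in", "the", "game"]

# Each mention is rendered as the text of its tokens.
mention_texts = [" ".join(tokens[i] for i in mention)
                 for mention in split_mentions([0, 1, 18, 19])]
```

Here `split_mentions([0, 1, 18, 19])` separates the two occurrences of the entity, which correspond to "The game" and "the game" in the sentence.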

As for your error message, I will have to look into it next week. Could you please try to train another type of model and see if it works?

Best wishes
Daniil

Thanks. Your MaskedConvolution1D class doesn't implement a call method. I am not familiar with Keras; will the parent class's call method be invoked automatically?

class MaskedConvolution1D(layers.Convolution1D):
    def __init__(self, **kwargs):
        self.supports_masking = True
        super(MaskedConvolution1D, self).__init__(**kwargs)

    def compute_mask(self, x, mask=None):
        return mask

Hi! Yes, the call method of the parent is used in this case.
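This is ordinary Python inheritance rather than anything Keras-specific. The sketch below uses hypothetical stand-in classes (`Convolution` and `MaskedConvolution`, not the real Keras layers) to show that a subclass overriding only `__init__` and `compute_mask` still uses the parent's `call`.

```python
class Convolution(object):
    """Hypothetical stand-in for a parent layer; not layers.Convolution1D."""

    def __init__(self, scale=2):
        self.scale = scale

    def call(self, x):
        # Placeholder computation standing in for the real convolution.
        return [value * self.scale for value in x]

    def compute_mask(self, x, mask=None):
        # The parent drops the mask.
        return None


class MaskedConvolution(Convolution):
    """Overrides only __init__ and compute_mask, like the class above."""

    def __init__(self, **kwargs):
        self.supports_masking = True
        super(MaskedConvolution, self).__init__(**kwargs)

    def compute_mask(self, x, mask=None):
        # Pass the incoming mask through unchanged.
        return mask


layer = MaskedConvolution(scale=3)
result = layer.call([1, 2])              # resolved to Convolution.call
mask_out = layer.compute_mask(None, mask=[True, False])
```

Attribute lookup finds `call` on the parent class because the subclass does not define it, while `compute_mask` resolves to the subclass version.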