An academic question: how to estimate the entity type distributions when the relation class is not known
typhoonlee opened this issue · 8 comments
According to your paper, you estimate the prior distributions over the candidate sets C_sub and C_obj of potential entity types for a given relation class, and these prior distributions are estimated from frequency statistics. But how do you estimate the prior distributions when the relation in the instance is unknown? It looks like a "chicken or the egg" problem.
For example, the relation "per:country_of_birth" indicates that the subject entity belongs to "person" and the object entity belongs to "country". The prior distribution for C_sub can be counted as {"person": 1}, but we would need to know in advance that this instance contains the relation "per:country_of_birth" before we could estimate the prior distributions over the candidate sets.
In your example, we indeed don't know in advance that the instance contains the relation "per:country_of_birth". Here we only initialize the virtual type words according to the pre-defined relation classes; we do not use the prior distributions to estimate the prediction. So there is no "chicken or the egg" problem here. You can refer to issue #1 ("遇到问题求助", i.e. "Asking for help with a problem") for specific examples of the prior distributions over the candidate set.
Thanks for your reply!
In your paper, you obtain the scope of the potential entity types from the prior knowledge contained in a specific relation. So if we don't know which relation the instance contains, how can we get the scope of the potential entity types and then estimate the prior distributions over the candidate set? I'm confused about this...
The calculation is based on the pre-defined categories. For example, suppose there are only two categories, "per:birth_of_place" and "per:birth_of_data". Then for [sub], p("person") = 1; for [obj], p("place") = 0.5 and p("data") = 0.5. You can read the code for a deeper understanding.
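For readers following along, the two-category example above can be sketched as a frequency count over the pre-defined relation classes. This is only an illustration and not the repository's code: the `RELATION_TYPES` mapping and the `type_priors` function are hypothetical names, and the sketch assumes each relation class contributes equally to the count.

```python
from collections import Counter

# Hypothetical mapping from each pre-defined relation class to its
# (subject type, object type) pair, using the names from the example above.
RELATION_TYPES = {
    "per:birth_of_place": ("person", "place"),
    "per:birth_of_data": ("person", "data"),
}


def type_priors(relation_types):
    """Return (subject prior, object prior) as dicts of type -> probability.

    Assumption: every relation class is weighted equally, so each prior is
    just the relative frequency of a type across all relation classes.
    """
    n = len(relation_types)
    sub_counts = Counter(sub for sub, _ in relation_types.values())
    obj_counts = Counter(obj for _, obj in relation_types.values())
    sub_prior = {t: c / n for t, c in sub_counts.items()}
    obj_prior = {t: c / n for t, c in obj_counts.items()}
    return sub_prior, obj_prior


# The priors depend only on the label set, never on a particular instance.
sub_prior, obj_prior = type_priors(RELATION_TYPES)
print(sub_prior)
print(obj_prior)
```

Note that no instance and no gold relation label appears anywhere in the computation, which is why the unknown relation of an individual instance does not matter here.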
Thanks for your reply!
That's how I understood it before. So does it mean that the representation of the virtual type words would be the same in every instance at the start? This seems different from the claim in your paper that you obtain the scope of the potential entity types from the prior knowledge contained in a specific relation.
The representations of the virtual type words are initialized from these statistics at the start, and what the model learns during training is the latent virtual type.
It means that the representation of virtual type words in each instance would be the same at the start, right?
Yes.
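A small sketch of what "the same at the start" could look like in practice. It assumes, purely for illustration, that a virtual type word is initialized as the prior-weighted average of the candidate type words' embeddings; the function `init_virtual_type_embedding` and the toy 2-dimensional embeddings are hypothetical and are not taken from the repository.

```python
def init_virtual_type_embedding(prior, type_embeddings):
    """Prior-weighted average of candidate type embeddings.

    Assumption (not confirmed by the thread): the virtual type word starts
    as sum_t p(t) * embedding(t) over the candidate types t.
    """
    dim = len(next(iter(type_embeddings.values())))
    vec = [0.0] * dim
    for type_word, prob in prior.items():
        for i, value in enumerate(type_embeddings[type_word]):
            vec[i] += prob * value
    return vec


# Toy embeddings for the candidate object types (made-up values).
toy_embeddings = {"place": [1.0, 0.0], "data": [0.0, 1.0]}
obj_prior = {"place": 0.5, "data": 0.5}

# The function takes no instance as input, so every instance shares
# this starting vector; only training moves it afterwards.
obj_init = init_virtual_type_embedding(obj_prior, toy_embeddings)
print(obj_init)
```

Under this assumption the initialization is a function of the label set and the embedding table only, which matches the "Yes" above.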
Thanks for your patience!