AskNowQA/LC-QuAD

190 predicates found in dataset missing in predicates.txt

I found 190 unique predicates (1931 occurrences in total) in the dataset that are not listed in resources/predicates.txt. See not_in_predicates.txt.

Also, 21 predicates listed in resources/predicates.txt do not occur in the dataset at all. See not_in_dataset.txt.

My bad. I got confused between relations and classes (the objects of rdf:type), which share the same namespace. Ignore the "190 predicates" part.
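
For anyone repeating this check, here is a minimal sketch of how the mix-up can be avoided. It assumes the queries in the dataset are SPARQL strings with full URIs in angle brackets and that rdf:type is written out as its full URI; the function name `split_predicates_and_classes` and the namespace tuple are made up for illustration and are not part of the LC-QuAD code.

```python
import re

# Assumption: rdf:type appears in the queries as its full URI.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumption: predicates live in these two DBpedia namespaces.
# Classes share the first one, which is what caused the confusion.
PREDICATE_NAMESPACES = (
    "http://dbpedia.org/ontology/",
    "http://dbpedia.org/property/",
)

URI_RE = re.compile(r"<([^<>\s]+)>")
# Matches "<rdf:type> <SomeClass>" and captures the class URI.
TYPE_OBJECT_RE = re.compile(r"<" + re.escape(RDF_TYPE) + r">\s*<([^<>\s]+)>")


def split_predicates_and_classes(sparql):
    """Return (predicates, classes) found in one SPARQL query string.

    A URI that directly follows rdf:type is treated as a class;
    every other URI in the predicate namespaces is treated as a predicate.
    """
    classes = set(TYPE_OBJECT_RE.findall(sparql))
    # Drop the "rdf:type <Class>" pairs so classes are not counted as predicates.
    remainder = TYPE_OBJECT_RE.sub(" ", sparql)
    predicates = {
        uri for uri in URI_RE.findall(remainder)
        if uri.startswith(PREDICATE_NAMESPACES)
    }
    return predicates, classes
```

This is a regex heuristic, not a SPARQL parser, so it would miss prefixed names or the `a` shorthand for rdf:type if the dataset used them.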

I presume that the latter (21 predicates from resources/predicates.txt ...) is still valid?
I'll take a look some time next week nonetheless.

@geraltofrivia yes, those 21 don't occur in the dataset (at least not in train-data.json, and possibly in neither train nor test).

Re predicates that are in the dataset but NOT in the predicates list - found these 3 (instead of 190):

(base) MacBook-Pro:lcquad nilesh$ grep -c 'http://dbpedia.org/property/portrayer' t*data.json 
test-data.json:0
train-data.json:3
(base) MacBook-Pro:lcquad nilesh$ grep -c 'http://dbpedia.org/ontology/superintendent' t*data.json 
test-data.json:0
train-data.json:2
(base) MacBook-Pro:lcquad nilesh$ grep -c 'http://dbpedia.org/property/influencedBy' t*data.json 
test-data.json:0
train-data.json:1
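
The same check can be sketched in Python. This is only an illustration under a few assumptions: the helper names `count_lines_containing` and `compare_predicates` are hypothetical, and the predicate list is assumed to be one URI per line. Like `grep -c`, the first helper counts matching lines rather than individual occurrences, and it matches substrings, so a predicate whose URI is a prefix of another one would be counted as well.

```python
def count_lines_containing(lines, needle):
    """Count lines that contain `needle` as a plain substring.

    Mirrors what grep -c reports: the number of matching lines,
    not the number of occurrences within those lines.
    """
    return sum(1 for line in lines if needle in line)


def compare_predicates(listed, used):
    """Compare the predicate list against the predicates seen in the dataset.

    Returns two sorted lists:
      - predicates used in the dataset but missing from the list
      - predicates in the list that never occur in the dataset
    """
    listed, used = set(listed), set(used)
    return sorted(used - listed), sorted(listed - used)
```

An open file handle can be passed as `lines`, for example `count_lines_containing(open("train-data.json", encoding="utf-8"), "http://dbpedia.org/property/portrayer")`.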