The dbo:spouse / dbp:spouse information should be extracted as an array
Issue validity
See:
http://dief.tools.dbpedia.org/server/extraction/en/extract?title=Joe+Biden&revid=&format=trix&extractors=custom
and
http://dbpedia.org/resource/Joe_Biden
Error Description
Looking at http://dbpedia.org/resource/Joe_Biden we can see several bad triple patterns:
dbo:spouse
dbr:Jill_Biden
dbr:1972_United_States_Senate_election_in_Delaware
dbr:Neilia_Hunter_Biden
dbp:spouse
1966-08-27 (xsd:date)
1972-12-18 (xsd:date)
1977-06-17 (xsd:date)
dbr:Jill_Biden
dbr:Neilia_Hunter_Biden(en)
died (en)
It looks like the extractor cartridge for Person does not parse the spouse information as an array.
The dbr:1972_United_States_Senate_election_in_Delaware value under dbo:spouse also indicates bad parsing, since it is not a spouse at all.
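To illustrate what "parsing the spouse information as an array" could mean, here is a minimal Python sketch that groups each marriage into one record instead of flattening all values onto the subject. Everything in it is an assumption made for illustration: the sample wikitext is simplified (the real infobox markup may differ), and `Marriage` and `parse_marriages` are hypothetical names, not part of the extraction framework. It also does not handle templates nested inside the marriage template.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed, simplified wikitext for illustration only.
SPOUSE_WIKITEXT = """
{{marriage|[[Neilia Hunter Biden]]|1966-08-27|1972-12-18|end=died}}
{{marriage|[[Jill Biden|Jill Jacobs]]|1977-06-17}}
"""


@dataclass
class Marriage:
    spouse: str
    start: Optional[str] = None
    end: Optional[str] = None
    end_reason: Optional[str] = None


# Matches one {{marriage|...}} template (no nested templates supported).
MARRIAGE_RE = re.compile(r"\{\{marriage\|(.*?)\}\}", re.IGNORECASE | re.DOTALL)
# Matches [[Target]] or [[Target|label]] and captures the link target.
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")
# Splits on pipes that are not inside a [[...]] link.
PIPE_RE = re.compile(r"\|(?![^\[]*\]\])")


def parse_marriages(wikitext: str) -> List[Marriage]:
    """Return one Marriage record per marriage template found."""
    records = []
    for match in MARRIAGE_RE.finditer(wikitext):
        positional, named = [], {}
        for part in PIPE_RE.split(match.group(1)):
            part = part.strip()
            if "=" in part and not part.startswith("[["):
                key, _, value = part.partition("=")
                named[key.strip()] = value.strip()
            else:
                positional.append(part)
        if not positional:
            continue
        link = LINK_RE.search(positional[0])
        records.append(
            Marriage(
                spouse=link.group(1).strip() if link else positional[0],
                start=positional[1] if len(positional) > 1 else None,
                end=positional[2] if len(positional) > 2 else None,
                end_reason=named.get("end"),
            )
        )
    return records
```

With records like these, the dates and the "died" marker stay attached to the marriage they belong to, rather than appearing as separate dbp:spouse values.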
Pinpointing the source of the error
Details
I believe the code should be changed to use the same pattern as for dbo:termPeriod, e.g.:
dbo:spouse
dbr:Joe_Biden__Spouse__1
dbr:Joe_Biden__Spouse__2
dbr:Joe_Biden__Spouse__3
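A rough sketch of how such intermediate nodes might be generated, one per marriage, is shown here in Python. This is not the extractor's actual code: `spouse_node_triples` is a hypothetical helper, the `ex:` property names are placeholders rather than ontology terms, and the numbering by chronological order is an assumption.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


def spouse_node_triples(
    subject: str, marriages: List[Dict[str, Optional[str]]]
) -> List[Triple]:
    """Build triples using one intermediate node per marriage.

    The ex:* predicates are placeholders for illustration only.
    """
    triples: List[Triple] = []
    for index, marriage in enumerate(marriages, start=1):
        # Node name follows the <subject>__Spouse__<n> pattern from the issue.
        node = f"{subject}__Spouse__{index}"
        triples.append((subject, "dbo:spouse", node))
        triples.append((node, "ex:person", marriage["spouse"]))
        start = marriage.get("start")
        end = marriage.get("end")
        if start:
            triples.append((node, "ex:marriageStart", f'"{start}"^^xsd:date'))
        if end:
            triples.append((node, "ex:marriageEnd", f'"{end}"^^xsd:date'))
    return triples


# Example input, ordered chronologically (an assumption).
BIDEN_MARRIAGES = [
    {"spouse": "dbr:Neilia_Hunter_Biden", "start": "1966-08-27", "end": "1972-12-18"},
    {"spouse": "dbr:Jill_Biden", "start": "1977-06-17", "end": None},
]

for triple in spouse_node_triples("dbr:Joe_Biden", BIDEN_MARRIAGES):
    print(" ".join(triple))
```

The point of the intermediate node is that each date is attached to a specific marriage, so the dates no longer need to be emitted as bare values of the spouse property.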
It is not completely clear to me how the triples should look. Should we keep these triples:
dbr:Joe_Biden dbo:spouse dbr:Jill_Biden
dbr:Joe_Biden dbo:spouse dbr:Neilia_Hunter_Biden
?
Also, as I understand it, we should add triples of this kind for each of Joe_Biden's spouses:
dbr:Joe_Biden dbo:spouse dbr:Joe_Biden__Spouse__1
dbr:Joe_Biden dbo:spouse dbr:Joe_Biden__Spouse__2
And do we need to remove:
dbp:spouse
1966-08-27 (xsd:date)
1972-12-18 (xsd:date)
1977-06-17 (xsd:date)
died (en)
and
dbo:spouse
dbr:1972_United_States_Senate_election_in_Delaware
?
And we should not use dbp:spouse at all, am I right?