Semantic reasoning over Wikidata.
Wikidata doesn't do reasoning/inference.
clone this repo.
build this docker image:
docker build --build-arg=uid=`id -u` --build-arg=gid=`id -g` -t justin2004/wikidata_reasoning .
start the docker container:
docker run --rm -it -v `pwd`:/mnt justin2004/wikidata_reasoning
then you should find output.csv.
edit the SPARQL queries, save them, and watch the results propagate to output.csv.
e.g. see this row in output.csv
s,p,o
http://www.wikidata.org/entity/Q76,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.wikidata.org/entity/Q5
which says that: president_obama rdf:type human
which is something that Wikidata does not say explicitly but which is derivable.
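To make the derivation concrete, here is a minimal sketch of one RDFS entailment rule (rdfs7, the subPropertyOf rule) applied to the row above. It assumes the rdf:type triple comes from an axiom stating that wdt:P31 is a subproperty of rdf:type; that axiom and the function name `apply_rdfs7` are illustrative assumptions, not something taken from this repo's queries.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
WDT_P31 = "http://www.wikidata.org/prop/direct/P31"
Q76 = "http://www.wikidata.org/entity/Q76"  # Barack Obama
Q5 = "http://www.wikidata.org/entity/Q5"    # human


def apply_rdfs7(triples):
    """Close a set of (s, p, o) triples under rdfs7:
    (p subPropertyOf q) and (s p o) entail (s q o)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect every (subproperty, superproperty) pair currently known.
        sub_props = [(s, o) for (s, p, o) in closed if p == RDFS_SUBPROP]
        for sub, sup in sub_props:
            for s, p, o in list(closed):
                if p == sub and (s, sup, o) not in closed:
                    closed.add((s, sup, o))
                    changed = True
    return closed


asserted = {
    (Q76, WDT_P31, Q5),                 # what Wikidata states
    (WDT_P31, RDFS_SUBPROP, RDF_TYPE),  # assumed axiom (hypothetical)
}
derived = apply_rdfs7(asserted) - asserted
print(derived)
```

Under that assumption, the only new triple is the one shown in the output.csv row above.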
read the Makefile to see what is going on.
notice that the log output shows 0/3, 1/3, 2/3, or 3/3 in balloon letters.
these indicate how far back in the dependency graph (which the Makefile encodes) we had to go to produce the output.csv file.
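As a rough illustration of "how far back" in a dependency graph, the sketch below mimics make's staleness check on a linear chain of files. The intermediate file names, the chain shape, and the function `steps_back` are hypothetical; the real Makefile may be structured differently, so read it for the actual targets.

```python
def steps_back(chain, mtimes):
    """Return how many targets in a linear dependency chain must be rebuilt.

    chain  -- file names ordered from the original source to the final target
    mtimes -- maps file name to modification time, or None if the file is missing
    The first file (the source) is assumed to exist.
    """
    for i in range(1, len(chain)):
        target, prereq = chain[i], chain[i - 1]
        # Like make: a target is stale if it is missing or older than its prerequisite.
        if mtimes.get(target) is None or mtimes[target] < mtimes[prereq]:
            # This target and everything after it must be rebuilt.
            return len(chain) - i
    return 0


# Hypothetical chain; only output.csv is a name taken from this README.
chain = ["query.rq", "raw.nt", "inferred.nt", "output.csv"]

up_to_date = {"query.rq": 10, "raw.nt": 20, "inferred.nt": 30, "output.csv": 40}
query_edited = {"query.rq": 50, "raw.nt": 20, "inferred.nt": 30, "output.csv": 40}
output_missing = {"query.rq": 10, "raw.nt": 20, "inferred.nt": 30, "output.csv": None}

print(steps_back(chain, up_to_date))      # nothing to rebuild
print(steps_back(chain, query_edited))    # rebuild every target
print(steps_back(chain, output_missing))  # rebuild only the last target
```

In this toy chain, editing the query file forces all three targets to be rebuilt, while deleting only output.csv forces one.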
Although this repo uses Wikidata it could be adapted to other SPARQL endpoints.
Only RDFS reasoning is used currently but this could be extended to include OWL reasoning.
I think this particular set of queries helps address the criticism of Wikidata's Q numbers and P numbers:
Wikidata is essentially a giant obfuscated graph of nodes and edges. You have no idea of the semantic meaning of an edge without dereferencing the edge via a query or the wiki.
https://datalanguage.com/blog/wikidata-q41483
I think this reasoning helps because wdt:P31 (which people consider to be an obfuscated predicate) is rendered as rdf:type.
Look in output.csv to see other non-P-number predicates.
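One way to list those predicates is sketched below. It assumes output.csv keeps the s,p,o header shown earlier and that P-number predicates use the wdt: namespace (http://www.wikidata.org/prop/direct/); the function name `non_p_predicates` and the second sample row are illustrative only.

```python
import csv
import io

# Wikidata "direct" properties (the wdt: prefix) all start with this string.
WDT_PREFIX = "http://www.wikidata.org/prop/direct/P"


def non_p_predicates(csv_file):
    """Return the sorted distinct predicates in an s,p,o CSV that are not wdt: P numbers."""
    reader = csv.DictReader(csv_file)
    return sorted({row["p"] for row in reader if not row["p"].startswith(WDT_PREFIX)})


# Small in-memory sample; the second row is illustrative, not copied from output.csv.
sample = io.StringIO(
    "s,p,o\n"
    "http://www.wikidata.org/entity/Q76,"
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type,"
    "http://www.wikidata.org/entity/Q5\n"
    "http://www.wikidata.org/entity/Q76,"
    "http://www.wikidata.org/prop/direct/P31,"
    "http://www.wikidata.org/entity/Q5\n"
)
result = non_p_predicates(sample)
print(result)
```

To run it against the real file, open it with `open("output.csv", newline="")` and pass the file object to `non_p_predicates`.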