--sparql-endpoint / -e for http://localhost:8989/bigdata/sparql (custom Wikibase instance) returns data from Wikidata
dbs opened this issue · 4 comments
Running Wikibase locally, I can generate results via curl:
curl 'http://localhost:8989/bigdata/sparql?query=SELECT%20DISTINCT%20%3Fp%20WHERE%20%7B%20%3Fs%20%3Fp%20%3Fo%20%7D'
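For anyone reproducing this from Node rather than curl: the SPARQL protocol passes the query text as a URL-encoded `query` parameter on a GET request. Below is a minimal sketch, assuming Node 18+ (for the global `fetch`) and the endpoint from the command above. `buildSparqlUrl` and `runQuery` are hypothetical helper names, not part of wikidata-taxonomy.

```javascript
// Hypothetical helpers (not part of wikidata-taxonomy).

// Build a SPARQL GET URL; URLSearchParams takes care of the encoding.
function buildSparqlUrl(endpoint, query) {
  const url = new URL(endpoint);
  url.searchParams.set('query', query);
  return url.toString();
}

// Send the query and return the result bindings (assumes Node 18+ for fetch).
async function runQuery(endpoint, query) {
  const response = await fetch(buildSparqlUrl(endpoint, query), {
    headers: { Accept: 'application/sparql-results+json' }
  });
  if (!response.ok) {
    throw new Error(`SPARQL request failed with HTTP ${response.status}`);
  }
  const body = await response.json();
  return body.results.bindings;
}
```

Calling `runQuery('http://localhost:8989/bigdata/sparql', 'SELECT DISTINCT ?p WHERE { ?s ?p ?o }')` should then return the same data as the curl request, which makes it easy to check which endpoint is actually answering.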
But trying to query the same endpoint with wikidata-taxonomy returns data from Wikidata instead:
node wdtaxonomy.js Q3 --sparql-endpoint http://localhost:8989/bigdata/sparql
life (Q3) •188 ↑↑↑
├──extraterrestrial life (Q181508) •81 ×1 ↑
│ ├──life on Mars (Q601319) •34 ×1
│ ├──Martian (Q913850) •25 ×4
│ ├──Life on Titan (Q2591050) •15
│ └──extraterrestrial intelligence (Q15107669) •7
├──personal life (Q2867027) •20
└──human life (Q19771042) •3
I get the same result if I install wikidata-taxonomy globally with npm install -g.
It's late at night, so I'll toss out a theory: does it implicitly depend on properties such as P279 existing in the target endpoint, and fall back to Wikidata if the query to the specified endpoint doesn't return the expected data?
Answering my own question, it does indeed rely on properties, but we are given options for mapping those properties to our own instances:
- -m/--mappings allows you to specify the property that corresponds to P1709 ("equivalent class") in your Wikibase instance
- -P/--property allows you to specify the properties that correspond to P279 ("subclass of") and P31 ("instance of") in your Wikibase instance (and both are required)
And these are required if you're using a Wikibase instance.
However, this still doesn't resolve my problem - I'm still getting results back from Wikidata instead of the Wikibase instance.
So on my Wikibase instance, where the WD property P279 maps to P297, and WD P31 maps to P28, and WD P1709 maps to P251, the following request:
wdtaxonomy --sparql-endpoint http://localhost:9292/bigdata/sparql -P P297,P28 -m P251 -s Q46
generates the following query:
SELECT ?item ?broader ?itemLabel ?instances ?sites ?mapping ?mappingProperty WITH {
  SELECT DISTINCT ?item { ?item wdt:P297* wd:Q46 }
} AS %items WHERE {
  INCLUDE %items .
  OPTIONAL { ?item wdt:P297 ?broader } .
  {
    SELECT ?item (count(distinct ?element) as ?instances) {
      INCLUDE %items.
      OPTIONAL { ?element wdt:P28 ?item }
    } GROUP BY ?item
  }
  {
    SELECT ?item (count(distinct ?site) as ?sites) {
      INCLUDE %items.
      OPTIONAL { ?site schema:about ?item }
    } GROUP BY ?item
  }
  {
    SELECT ?item ?mapping ?mappingProperty {
      INCLUDE %items .
      OPTIONAL {
        { ?item wdt:P251 ?mapping . BIND('P251' AS ?mappingProperty) }
      }
    }
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en"
  }
}
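To make the relationship between the command-line options and this query explicit, here is a rough sketch of how such a query could be templated from the root item, the two `-P` properties and the `-m` mapping properties. This is an illustration only, not the actual template in wikidata-taxonomy. `buildTaxonomyQuery` and its parameter names are hypothetical.

```javascript
// Hypothetical sketch: NOT the actual query template used by wikidata-taxonomy.
function buildTaxonomyQuery({ root, subclassOf, instanceOf, mappings = [], language = 'en' }) {
  // Reject anything that is not a plain P/Q identifier before interpolating it.
  for (const id of [root, subclassOf, instanceOf, ...mappings]) {
    if (!/^[PQ][1-9][0-9]*$/.test(id)) {
      throw new Error(`Invalid entity id: ${id}`);
    }
  }

  // One alternative per mapping property, combined with UNION.
  const alternatives = mappings
    .map(p => `{ ?item wdt:${p} ?mapping . BIND('${p}' AS ?mappingProperty) }`)
    .join(' UNION ');

  // The mapping subquery is only emitted when mapping properties were given.
  const mappingBlock = mappings.length === 0 ? '' : `
  {
    SELECT ?item ?mapping ?mappingProperty {
      INCLUDE %items .
      OPTIONAL {
        ${alternatives}
      }
    }
  }`;

  return `SELECT ?item ?broader ?itemLabel ?instances ?sites ?mapping ?mappingProperty WITH {
  SELECT DISTINCT ?item { ?item wdt:${subclassOf}* wd:${root} }
} AS %items WHERE {
  INCLUDE %items .
  OPTIONAL { ?item wdt:${subclassOf} ?broader } .
  {
    SELECT ?item (count(distinct ?element) as ?instances) {
      INCLUDE %items.
      OPTIONAL { ?element wdt:${instanceOf} ?item }
    } GROUP BY ?item
  }
  {
    SELECT ?item (count(distinct ?site) as ?sites) {
      INCLUDE %items.
      OPTIONAL { ?site schema:about ?item }
    } GROUP BY ?item
  }${mappingBlock}
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "${language}"
  }
}`;
}

// The options from the command above: -P P297,P28 -m P251 Q46
const exampleQuery = buildTaxonomyQuery({
  root: 'Q46',
  subclassOf: 'P297',
  instanceOf: 'P28',
  mappings: ['P251']
});
```

Note that the query only uses the `wd:` and `wdt:` prefixes, so which entities it matches depends entirely on how the endpoint resolves those prefixes.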
And if I run that query against my instance, I get the expected results for the partial import of human/person on my instance (just the top few lines included for space):
item | broader | itemLabel | instances | sites | mapping | mappingProperty
-- | -- | -- | -- | -- | -- | --
http://wikibase.svc/entity/Q46 | http://wikibase.svc/entity/Q421 | human | 1 | 0 | http://schema.org/Person | P251
http://wikibase.svc/entity/Q46 | http://wikibase.svc/entity/Q421 | human | 1 | 0 | http://dbpedia.org/ontology/Person | P251
http://wikibase.svc/entity/Q421 | http://wikibase.svc/entity/Q425 | person | 0 | 0 | http://xmlns.com/foaf/0.1/Person | P251
http://wikibase.svc/entity/Q421 | http://wikibase.svc/entity/Q425 | person | 0 | 0 | http://id.loc.gov/ontologies/bibframe/Person | P251
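Each item appears once per mapping in these results, so whatever consumes them has to fold the rows back into one node per item. A minimal sketch of that step follows, assuming the rows have already been flattened into plain objects keyed by column name and that entities live under `http://wikibase.svc/entity/`. `groupRows` is a hypothetical name, not a function in wikidata-taxonomy.

```javascript
// Hypothetical sketch: fold one-row-per-mapping results into one node per item.
function groupRows(rows, entityPrefix) {
  // Turn a full entity URI into its local id (e.g. "Q46"); leave other URIs alone.
  const localId = uri =>
    uri.startsWith(entityPrefix) ? uri.slice(entityPrefix.length) : uri;

  const nodes = new Map();
  for (const row of rows) {
    const id = localId(row.item);
    if (!nodes.has(id)) {
      nodes.set(id, {
        id,
        label: row.itemLabel,
        instances: Number(row.instances),
        sites: Number(row.sites),
        broader: new Set(),
        mappings: new Set()
      });
    }
    const node = nodes.get(id);
    if (row.broader) node.broader.add(localId(row.broader));
    if (row.mapping) node.mappings.add(row.mapping);
  }
  return nodes;
}

// The four rows shown in the table above.
const E = 'http://wikibase.svc/entity/';
const sampleRows = [
  { item: E + 'Q46', broader: E + 'Q421', itemLabel: 'human', instances: '1', sites: '0', mapping: 'http://schema.org/Person' },
  { item: E + 'Q46', broader: E + 'Q421', itemLabel: 'human', instances: '1', sites: '0', mapping: 'http://dbpedia.org/ontology/Person' },
  { item: E + 'Q421', broader: E + 'Q425', itemLabel: 'person', instances: '0', sites: '0', mapping: 'http://xmlns.com/foaf/0.1/Person' },
  { item: E + 'Q421', broader: E + 'Q425', itemLabel: 'person', instances: '0', sites: '0', mapping: 'http://id.loc.gov/ontologies/bibframe/Person' }
];
const sampleNodes = groupRows(sampleRows, E);
```

The entity prefix is passed in rather than hardcoded, since that is exactly the part that differs between Wikidata and a local Wikibase.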
However, if I drop the -s parameter to run that query against my Wikibase instance:
wdtaxonomy --sparql-endpoint http://localhost:9292/bigdata/sparql -P P297,P28 -m P251 Q46
Instead of seeing the hierarchy for Q46 ('human') from my instance, I am shown the taxonomy for Q46 ('Europe') drawn from Wikidata, evidently queried with my property mappings, which break the hierarchy there:
Europe (Q46) •350
So I'm still trying to figure out why the --sparql-endpoint parameter appears to be ignored. Maybe a louder warning or error message would help identify whatever I'm still doing wrong?
Thanks for the notification. I had actually never tested the --sparql-endpoint option, so it was broken; sorry for that. Can you please check out the latest version from source and try again?
Thanks, that appears to connect properly to the SPARQL endpoint, so this specific issue can be closed.
However, it never returns any results, which should probably be the subject of a new issue. My suspicion is that this is because the default namespace for Wikibase is http://wikibase.svc/ but lib/query.js hardcodes http://www.wikidata.org/. Similarly, lib/query.js has a hardcoded reference to P31, lib/serialize.js refers to http://www.wikidata.org/entity/P31 for occurrence counts, and lib/mappings.js has a number of hardcoded relational P-ids.
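As an illustration of the namespace point, the prefixes could be derived from a configurable base URI instead of being hardcoded. This sketch assumes the custom instance uses the same URI layout as Wikidata (`/entity/` for items and `/prop/direct/` for direct claims), which should be verified against the instance's RDF output. `sparqlPrefixes` is a hypothetical name, not something in lib/query.js.

```javascript
// Hypothetical sketch: derive wd:/wdt: prefix declarations from a base URI.
// Assumes the instance follows Wikidata's layout: <base>/entity/ and <base>/prop/direct/.
function sparqlPrefixes(conceptBase) {
  const base = conceptBase.replace(/\/+$/, ''); // drop trailing slashes
  return [
    `PREFIX wd: <${base}/entity/>`,
    `PREFIX wdt: <${base}/prop/direct/>`
  ].join('\n');
}

const wikidataPrefixes = sparqlPrefixes('http://www.wikidata.org/');
const localPrefixes = sparqlPrefixes('http://wikibase.svc/');
```

Prepending the result to a generated query would make `wd:Q46` resolve to the local `http://wikibase.svc/entity/Q46` rather than to whatever the endpoint or the tool assumes by default.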
It seems like adapting this wonderful tool to truly support Wikibase would require a fair bit of change to support all of the required mappings dynamically (either at the command line or by passing in a file of the mappings). Perhaps that's not feasible in the short term.
Thanks again. I really should try this with a Wikibase instance of my own. Your analysis is correct; fixing this requires the tool to:
- configure query
- send query to custom endpoint
- process result as configured for custom Wikibase
- serialize result as configured for custom Wikibase
Options would also be better read from a config file (#46).
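The four steps above could be driven by a single configuration object, roughly as sketched below. Everything here is hypothetical: the config keys, the function names and the simplified query are illustrations, not the structure of wikidata-taxonomy or the format proposed in #46. The sender is injected so the same pipeline can target any endpoint.

```javascript
// Hypothetical sketch of a config-driven pipeline; none of these names exist in wikidata-taxonomy.
const wikidataConfig = {
  endpoint: 'https://query.wikidata.org/sparql',
  entityPrefix: 'http://www.wikidata.org/entity/',
  directPrefix: 'http://www.wikidata.org/prop/direct/',
  subclassOf: 'P279'
};

// Overrides for the local instance discussed in this issue.
const localConfig = {
  ...wikidataConfig,
  endpoint: 'http://localhost:9292/bigdata/sparql',
  entityPrefix: 'http://wikibase.svc/entity/',
  directPrefix: 'http://wikibase.svc/prop/direct/',
  subclassOf: 'P297'
};

// 1. Configure the query (full IRIs, so no prefix is implied by the endpoint).
function buildQuery(config, root) {
  const p = `<${config.directPrefix}${config.subclassOf}>`;
  const r = `<${config.entityPrefix}${root}>`;
  return `SELECT ?item ?broader WHERE { ?item ${p}* ${r} . OPTIONAL { ?item ${p} ?broader } }`;
}

// 3. Process SPARQL JSON bindings using the configured entity namespace.
function processBindings(config, bindings) {
  const strip = uri =>
    uri.startsWith(config.entityPrefix) ? uri.slice(config.entityPrefix.length) : uri;
  return bindings.map(b => ({
    item: strip(b.item.value),
    broader: b.broader ? strip(b.broader.value) : null
  }));
}

// 4. Serialize the edges as plain text.
function serialize(edges) {
  return edges.map(e => (e.broader ? `${e.item} -> ${e.broader}` : e.item)).join('\n');
}

// 2. Sending is delegated to sendQuery(endpoint, query), which returns a promise of bindings.
async function taxonomy(root, config, sendQuery) {
  const bindings = await sendQuery(config.endpoint, buildQuery(config, root));
  return serialize(processBindings(config, bindings));
}
```

Reading `localConfig` from a JSON file instead of defining it inline would cover the config-file suggestion without changing any of the four functions.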