siznax/wptools

Give page wikidata more structure

siznax opened this issue · 0 comments

Currently, our page.data['wikidata'] dictionary is flat. For example:

{u'taxon name (P225)': u'Amanita',
 u'taxon rank (P105)': u'genus (Q34740)',
 u'taxonomic type (P427)': u'Fly agaric (Q131227)'}

We should make it less flat and more like Wikidata statements:

[{'property': 'P225',
  'label': u'taxon name',
  'value': u'Amanita'},
 {'property': 'P105',
  'label': u'taxon rank',
  'value': u'genus (Q34740)'},
 {'property': 'P427',
  'label': u'taxonomic type',
  'value': u'Fly agaric (Q131227)'}]
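A minimal sketch of that conversion, assuming every flat key follows the "label (Pnnn)" pattern seen above. The name `to_statements` is hypothetical, not an existing wptools function:

```python
import re

# Assumption: flat keys look like "<label> (P<digits>)", as in the example above.
_KEY_RE = re.compile(r'^(?P<label>.*) \((?P<property>P\d+)\)$')


def to_statements(flat):
    """Turn a flat {'label (Pnnn)': value} dict into a list of statement dicts."""
    statements = []
    for key, value in flat.items():
        match = _KEY_RE.match(key)
        if match is None:
            # Key does not follow the assumed pattern: keep it whole as the label.
            statements.append({'property': None, 'label': key, 'value': value})
            continue
        statements.append({'property': match.group('property'),
                           'label': match.group('label'),
                           'value': value})
    return statements
```

Values are passed through untouched here, so a list or dictionary value stays exactly as it was in the flat form.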

Note that 'value' can hold a string, a list, or a dictionary, so expanding values in the same way may complicate the structure further.

We should also provide methods for querying wikidata that return an entire statement or a list of statements:

>>> page.wikidata('P2')
{'property': 'P225',
 'label': u'taxon name',
 'value': u'Amanita'}

>>> page.wikidata('rank')
{'property': 'P105',
 'label': u'taxon rank',
 'value': u'genus (Q34740)'}

>>> page.wikidata('agaric')
{'property': 'P427',
 'label': u'taxonomic type',
 'value': u'Fly agaric (Q131227)'}

>>> page.wikidata('taxon')
[{'property': 'P225',
  'label': u'taxon name',
  'value': u'Amanita'},
 {'property': 'P105',
  'label': u'taxon rank',
  'value': u'genus (Q34740)'},
 {'property': 'P427',
  'label': u'taxonomic type',
  'value': u'Fly agaric (Q131227)'}]
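The lookups above could be sketched roughly as follows. This is only an illustration: `find_statements` is a hypothetical standalone helper rather than the proposed `page.wikidata()` method, and it assumes case-insensitive substring matching against the property, label, and value of each statement, returning `None` when nothing matches:

```python
STATEMENTS = [
    {'property': 'P225', 'label': u'taxon name', 'value': u'Amanita'},
    {'property': 'P105', 'label': u'taxon rank', 'value': u'genus (Q34740)'},
    {'property': 'P427', 'label': u'taxonomic type',
     'value': u'Fly agaric (Q131227)'},
]


def find_statements(statements, term):
    """Return the statement(s) whose property, label, or value contains term.

    One match returns a single dict, several matches return a list,
    and no match returns None (an assumption, not specified in the issue).
    """
    needle = term.lower()
    hits = []
    for statement in statements:
        # str() is a crude way to search list or dict values as well.
        fields = (statement['property'], statement['label'], statement['value'])
        if any(needle in str(field).lower() for field in fields):
            hits.append(statement)
    if not hits:
        return None
    if len(hits) == 1:
        return hits[0]
    return hits
```

With this matching rule, 'P2', 'rank', and 'agaric' each select one statement and 'taxon' selects all three. Returning either a dict or a list mirrors the examples, though always returning a list might be easier for callers to handle.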