the wikidata version and file
acadTags opened this issue · 2 comments
acadTags commented
Hi,
Many thanks for the repository. I am looking for the wikidata dump file you used for the annotation. Is it the same as in HIPE 2020 (https://files.ifi.uzh.ch/cl/siclemat/impresso/clef-hipe-2020/)?
Is there a link to the Wikidata file (or entity catalogue) with the version specified?
Was the same Wikidata dump version used for both HIPE2020 and topRes19th?
Best regards,
A
simon-clematide commented
Hi,
thanks for your interest. I'll try to answer your questions:
- The topRes19th annotations were not done by us. You need to check with the corresponding project people ( https://bl.iro.bl.uk/concern/datasets/f3686eb9-4227-45cb-9acb-0453d35e6a03 ) and their documentation. For the HIPE 2022 edition of topRes19th, we used wikimapper to map their Wikipedia URLs to Wikidata QIDs. We created a fresh mapping file for this from the latest Wikidata dump available at the end of January 2022. The corresponding wikimapper index file is available for download at https://files.ifi.uzh.ch/cl/siclemat/hipe-2022/data/wikimapper/index_enwiki-latest.db
- For the HIPE 2020 annotations, we used the Wikidata dump with download date 2019-11-13. You can download it here (80GB): https://files.ifi.uzh.ch/cl/siclemat/impresso/clef-hipe-2020/
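
In case it helps, here is a minimal sketch of how the wikimapper index could be used to turn Wikipedia URLs into Wikidata QIDs. It assumes the `wikimapper` Python package is installed and that the index file above has been downloaded locally as `index_enwiki-latest.db`. The helper names `load_mapper` and `link_urls` are made up for illustration and are not part of our pipeline.

```python
from typing import Dict, Iterable, Optional


def load_mapper(index_path: str):
    """Open a wikimapper index file (assumed to be downloaded beforehand)."""
    # Imported lazily so that link_urls can be used without the package,
    # e.g. with any object that offers a url_to_id(url) method.
    from wikimapper import WikiMapper

    return WikiMapper(index_path)


def link_urls(mapper, urls: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each Wikipedia URL to a Wikidata QID.

    URLs that are missing from the index map to None, which is what
    WikiMapper.url_to_id returns when it finds no match.
    """
    return {url: mapper.url_to_id(url) for url in urls}
```

With the index in place, something like `link_urls(load_mapper("index_enwiki-latest.db"), ["https://en.wikipedia.org/wiki/London"])` should return a dictionary from each URL to its QID. The result reflects the dump the index was built from, so a QID that was merged or redirected later may differ from what a newer dump gives.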
HTH, Simon and Maud
acadTags commented
That's very helpful, thanks.