hdaSprachtechnologie/odenet

Add dc:subject to synsets

Closed this issue · 4 comments

In en-word.net, each synset has an ILI and an attribute called dc:subject. The subjects correspond to the lexicographer files in Princeton WordNet and are broad semantic categories. In GermaNet, they are called "Semantische Felder" (semantic fields).

I propose to add dc:subject to OdeNet. It can easily be done by matching OdeNet with en-word.net via ILI. IMO it would be a great help.
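To illustrate the matching step, here is a minimal sketch, not the project's actual tooling. It assumes both wordnets are in WN-LMF XML, that synsets carry an `ili` attribute, and that the `dc:` prefix is bound to the namespace URI shown below (please check the header of the real files). The function names and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the dc: prefix in the wordnet files.
DC_NS = "https://globalwordnet.github.io/schemas/dc/"
DC_SUBJECT = f"{{{DC_NS}}}subject"


def subjects_by_ili(root):
    """Map each ILI in the English wordnet tree to its dc:subject."""
    mapping = {}
    for synset in root.iter("Synset"):
        ili = synset.get("ili")
        subject = synset.get(DC_SUBJECT)
        if ili and subject:
            mapping[ili] = subject
    return mapping


def add_subjects(target_root, mapping):
    """Copy dc:subject onto target synsets with a known ILI.

    Returns the ids of synsets that could not be matched.
    """
    unmatched = []
    for synset in target_root.iter("Synset"):
        subject = mapping.get(synset.get("ili", ""))
        if subject is None:
            unmatched.append(synset.get("id"))
        else:
            synset.set(DC_SUBJECT, subject)
    return unmatched


# Made-up sample data standing in for the real files.
ENGLISH = f"""<LexicalResource xmlns:dc="{DC_NS}">
  <Lexicon id="en">
    <Synset id="en-1" ili="i1" dc:subject="noun.animal"/>
    <Synset id="en-2" ili="i2" dc:subject="verb.motion"/>
  </Lexicon>
</LexicalResource>"""

ODENET = """<LexicalResource>
  <Lexicon id="odenet">
    <Synset id="odenet-1-n" ili="i1"/>
    <Synset id="odenet-2-v" ili=""/>
  </Lexicon>
</LexicalResource>"""

english_root = ET.fromstring(ENGLISH)
odenet_root = ET.fromstring(ODENET)

mapping = subjects_by_ili(english_root)
unmatched = add_subjects(odenet_root, mapping)
```

Synsets without an ILI end up in `unmatched` and would need a subject assigned by hand. Note that a synset with a wrong ILI would inherit a wrong subject this way, so the result is only as good as the ILI mapping.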

One use case is checking ILIs. Detecting a wrong ILI is one thing; finding and assigning a correct replacement can be difficult or even impossible. A first step should be to make sure that the synset is assigned to the appropriate dc:subject, which would be more helpful than just deleting the incorrect ILI.

The attached Excel file contains a table with the columns ILI, subject, and more.
It is the result of a join between the following datasets:

Attachment:
ILI mapping enriched with dc_subject OdeNet 1.3.xlsx

Johann Bergh and I are currently correcting the ILIs automatically. Once that is finished, it will be a good idea to add subjects as well. Please give me a moment.

Thanks for the feedback and good luck with the ILIs!