Supersenses extra
Expanding on #138 now that I have met with Bolette and Sussi about it.
- Comparing with the English WordNet, which uses `adj` rather than `adjective` for the adjective grouping, we should switch to match theirs.
- Certain food-related words are tagged as `verb.creation` whereas they should be `verb.consumption` like in the OEWN. Special rules for food may apply, perhaps in other cases too.
- Once the supersenses are cleaned up and officially in DanNet, I can begin to attach supersenses to the Semdax corpus (CoNLL-U Format). About 7700 senses exist in this corpus that come from DanNet, while the rest are mostly from DDO (not DanNet) and can't be annotated using the Supersenses from DanNet. These will have to be done manually or using data from DDO.
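The prefix change in the first point could be done with a small helper along these lines. This is only a sketch: the function name is made up, and it assumes the current labels have the form `adjective.<something>`, which I have not verified against the actual DanNet data.

```python
def normalize_prefix(label: str) -> str:
    """Rewrite an 'adjective.*' label to the 'adj.*' form used by the English WordNet.

    Assumption: labels are '<pos>.<category>' strings; anything that does not
    start with the 'adjective' prefix is returned unchanged.
    """
    prefix, sep, rest = label.partition(".")
    if prefix == "adjective":
        return "adj" + sep + rest
    return label


if __name__ == "__main__":
    # Hypothetical input labels, for illustration only.
    for label in ["adjective.all", "verb.creation", "noun.food"]:
        print(label, "->", normalize_prefix(label))
```

The food-related remapping from `verb.creation` to `verb.consumption` is deliberately left out, since the rules for it still have to be decided.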
Current status: I have binned the `verb.creation` synsets into 14 separate groups based on their hypernyms. These and the remaining 9 synsets need to be checked by Sussi/Bolette.
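For reference, grouping by hypernym can be sketched roughly as below. The synset IDs and the `hypernym_of` mapping are placeholders rather than real DanNet data, and the sketch assumes one hypernym per synset, which may not match how the actual binning was done.

```python
from collections import defaultdict


def bin_by_hypernym(synsets, hypernym_of):
    """Group synset IDs by their hypernym.

    `hypernym_of` is assumed to map a synset ID to a single hypernym ID.
    Synsets without an entry end up in the bin keyed by None, so they can
    be reviewed separately.
    """
    bins = defaultdict(list)
    for synset in synsets:
        bins[hypernym_of.get(synset)].append(synset)
    return dict(bins)


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    synsets = ["s1", "s2", "s3", "s4"]
    hypernym_of = {"s1": "h1", "s2": "h1", "s3": "h2"}
    for hypernym, members in bin_by_hypernym(synsets, hypernym_of).items():
        print(hypernym, members)
```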
I have remapped all of the `verb.creation` synsets with Bolette. They will need to be imported into the graph and added to the dataset in the next release.
The next step is to annotate the CoNLL-U file for Danish found at: https://www.clarin.si/repository/xmlui/handle/11356/1842
Most of these are mapped to DanNet senses. Perhaps the supersense can be added at the end of each line? I also need to figure out a smart way to tag many of the remaining ~2500 senses, since they are not directly linked to DanNet. Some of them could perhaps be tagged via the lemma.
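One possible shape for this is to append a `Supersense=...` entry to the MISC column (the tenth and last CoNLL-U column), with the lemma as a fallback when there is no DanNet sense. This is a rough sketch under several assumptions: the `SenseId` key is hypothetical (I have not checked how the corpus actually encodes its sense links), `Supersense` is a made-up output key, and the two lookup tables are placeholders.

```python
def annotate_line(line, by_sense, by_lemma,
                  sense_key="SenseId", out_key="Supersense"):
    """Append a supersense to the MISC column of one CoNLL-U token line.

    Assumptions: the sense ID is stored in MISC under `sense_key`
    (hypothetical), `by_sense` maps sense IDs to supersenses, and
    `by_lemma` maps lemmas to supersenses for the fallback case.
    Lines are handled without their trailing newline.
    """
    line = line.rstrip("\n")
    # Comment lines and blank sentence separators pass through unchanged.
    if not line.strip() or line.startswith("#"):
        return line
    cols = line.split("\t")
    if len(cols) != 10:  # not a regular 10-column token line
        return line
    lemma, misc = cols[2], cols[9]
    items = [] if misc == "_" else misc.split("|")
    sense_id = None
    for item in items:
        key, _, value = item.partition("=")
        if key == sense_key:
            sense_id = value
    # Prefer the sense link; fall back to the lemma if there is none.
    supersense = by_sense.get(sense_id) or by_lemma.get(lemma)
    if supersense is None:
        return line
    items.append(f"{out_key}={supersense}")
    cols[9] = "|".join(items)
    return "\t".join(cols)


if __name__ == "__main__":
    # Made-up token line and lookup table, for illustration only.
    token = "1\tspiser\tspise\tVERB\t_\t_\t0\troot\t_\tSenseId=s1"
    print(annotate_line(token, {"s1": "verb.consumption"}, {}))
```

The lemma fallback is only safe where a lemma has a single supersense, so in practice `by_lemma` should probably be limited to unambiguous lemmas, possibly keyed by part of speech as well.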