frankier/finn-wsd-eval

Could keep more of the frame-derived clustering for synsets in multiple frames instead of just deleting those synsets entirely

frankier opened this issue · 0 comments

E.g. if we have

laskea.02,00948071-v
laskea.02,00712556-v
laskea.02,02731632-v
laskea.02,00685081-v
laskea.03,02645839-v
laskea.03,02731632-v
laskea.03,00685081-v
laskea.03,00950431-v
laskea.06,01938426-v

we get a valid clustering by deleting the duplicated synsets (those listed under more than one frame), as we do now

laskea.02,00948071-v
laskea.02,00712556-v
laskea.03,02645839-v
laskea.03,00950431-v
laskea.06,01938426-v
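
A minimal sketch of the current behaviour as described above, not the actual code in this repo. The `RAW` string and the `drop_duplicated_synsets` name are made up for illustration; the only assumption is that the input is `frame,synset` lines like the ones in this issue.

```python
from collections import defaultdict

# The example mapping from this issue, one "frame,synset" pair per line.
RAW = """\
laskea.02,00948071-v
laskea.02,00712556-v
laskea.02,02731632-v
laskea.02,00685081-v
laskea.03,02645839-v
laskea.03,02731632-v
laskea.03,00685081-v
laskea.03,00950431-v
laskea.06,01938426-v
"""
pairs = [tuple(line.split(",")) for line in RAW.splitlines()]


def drop_duplicated_synsets(pairs):
    """Keep only the pairs whose synset is listed under exactly one frame."""
    frames_of = defaultdict(set)
    for frame, synset in pairs:
        frames_of[synset].add(frame)
    return [(f, s) for f, s in pairs if len(frames_of[s]) == 1]


for frame, synset in drop_duplicated_synsets(pairs):
    print(f"{frame},{synset}")
```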

but we could get an additional clustering by keeping only the duplicated synsets, deleting the rest of the contents of the affected clusters, and merging those clusters

laskea.02or03,02731632-v
laskea.02or03,00685081-v
laskea.06,01938426-v
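
A rough sketch of how the merged clustering could be computed. Everything here is hypothetical (`merge_on_duplicates`, the union-find helper, the `02or03` label format copied from the example), and it assumes that all frames merged into one group share the same lemma. Frames that share a duplicated synset are unioned; affected groups keep only their duplicated synsets, and unaffected frames are passed through unchanged.

```python
from collections import defaultdict

RAW = """\
laskea.02,00948071-v
laskea.02,00712556-v
laskea.02,02731632-v
laskea.02,00685081-v
laskea.03,02645839-v
laskea.03,02731632-v
laskea.03,00685081-v
laskea.03,00950431-v
laskea.06,01938426-v
"""
pairs = [tuple(line.split(",")) for line in RAW.splitlines()]


def merge_on_duplicates(pairs):
    frames_of = defaultdict(set)
    for frame, synset in pairs:
        frames_of[synset].add(frame)

    # Union-find over frames: frames sharing a synset end up in one group.
    parent = {frame: frame for frame, _ in pairs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for frames in frames_of.values():
        first, *rest = sorted(frames)
        for other in rest:
            parent[find(other)] = find(first)

    groups = defaultdict(set)
    for frame in parent:
        groups[find(frame)].add(frame)

    def label(frame):
        members = sorted(groups[find(frame)])
        if len(members) == 1:
            return members[0]
        # Assumes every member has the form "<lemma>.<id>" with the same lemma.
        lemma = members[0].rsplit(".", 1)[0]
        ids = [m.rsplit(".", 1)[1] for m in members]
        return f"{lemma}.{'or'.join(ids)}"

    out, seen = [], set()
    for frame, synset in pairs:
        if synset in seen:
            continue
        duplicated = len(frames_of[synset]) > 1
        untouched = len(groups[find(frame)]) == 1
        if duplicated or untouched:
            out.append((label(frame), synset))
            seen.add(synset)
    return out


for frame, synset in merge_on_duplicates(pairs):
    print(f"{frame},{synset}")
```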

Or a more principled approach could be to take all possible choices of cluster for each duplicated synset, convert them to graphs, find and remove contradictions, then find cliques. I think this is a bit different, since it might refuse to say whether 02731632-v and 00685081-v are in the same cluster, so it would generate two clusterings

laskea.02or03,02731632-v
laskea.06,01938426-v

and

laskea.02or03,00685081-v
laskea.06,01938426-v