Can use more of a frame-derived clustering with multiple synsets than just deleting the synsets entirely
frankier opened this issue · 0 comments
E.g. if we have:
laskea.02,00948071-v
laskea.02,00712556-v
laskea.02,02731632-v
laskea.02,00685081-v
laskea.03,02645839-v
laskea.03,02731632-v
laskea.03,00685081-v
laskea.03,00950431-v
laskea.06,01938426-v
we get a valid clustering by deleting the duplicated synsets (02731632-v and 00685081-v, which each appear under both laskea.02 and laskea.03), as we do now:
laskea.02,00948071-v
laskea.02,00712556-v
laskea.03,02645839-v
laskea.03,00950431-v
laskea.06,01938426-v
but we could get an additional clustering by instead deleting the rest of the contents of the affected clusters and merging them:
laskea.02or03,02731632-v
laskea.02or03,00685081-v
laskea.06,01938426-v
Or a more principled approach could be to take all possible choices of cluster for each duplicated synset, convert them to graphs, find and remove contradictions, then find cliques. I think this is a bit different, since it might refuse to say whether 02731632-v and 00685081-v are in the same cluster, and so would generate two clusterings:
laskea.02or03,02731632-v
laskea.06,01938426-v
and
laskea.02or03,00685081-v
laskea.06,01938426-v
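
A rough sketch of the first two strategies above (dropping the duplicated synsets vs. merging the affected clusters), just to pin down what I mean. The function names and the `02or03` label format are made up for illustration and are not from the existing code. It also assumes every duplicated synset is shared by the same set of frames, as in this example; overlapping but different sets of frames would need a proper transitive merge. The graph/clique approach is not covered.

```python
from collections import defaultdict

# (frame, synset) rows copied from the example in this issue.
ROWS = [
    ("laskea.02", "00948071-v"),
    ("laskea.02", "00712556-v"),
    ("laskea.02", "02731632-v"),
    ("laskea.02", "00685081-v"),
    ("laskea.03", "02645839-v"),
    ("laskea.03", "02731632-v"),
    ("laskea.03", "00685081-v"),
    ("laskea.03", "00950431-v"),
    ("laskea.06", "01938426-v"),
]


def synset_frames(rows):
    """Map each synset to the set of frames it is listed under."""
    frames = defaultdict(set)
    for frame, synset in rows:
        frames[synset].add(frame)
    return frames


def drop_duplicates(rows):
    """Keep only synsets that are listed under exactly one frame."""
    frames = synset_frames(rows)
    return [(f, s) for f, s in rows if len(frames[s]) == 1]


def merged_label(frames):
    """Hypothetical label for merged frames, e.g. laskea.02or03."""
    ordered = sorted(frames)
    lemma = ordered[0].rsplit(".", 1)[0]
    senses = [f.rsplit(".", 1)[1] for f in ordered]
    return lemma + "." + "or".join(senses)


def merge_on_duplicates(rows):
    """Merge frames that share synsets, keeping only the shared synsets
    in the merged cluster; unaffected frames are kept unchanged."""
    frames = synset_frames(rows)
    # Frames that contain at least one duplicated synset.
    affected = {f for fs in frames.values() if len(fs) > 1 for f in fs}
    out, seen = [], set()
    for frame, synset in rows:
        if len(frames[synset]) > 1:
            row = (merged_label(frames[synset]), synset)
            if row not in seen:  # emit each shared synset once
                seen.add(row)
                out.append(row)
        elif frame not in affected:
            out.append((frame, synset))
    return out


if __name__ == "__main__":
    for name, fn in [("drop", drop_duplicates), ("merge", merge_on_duplicates)]:
        print(name)
        for frame, synset in fn(ROWS):
            print(f"{frame},{synset}")
```

Both clusterings could be emitted side by side, since they make different trade-offs: the first keeps the synsets that distinguish 02 from 03, while the second keeps the synsets the two frames share.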