Duplicate forms returned for name entries
As reported by email:
When looking up 堀口大學, 東京藝術大学 or 慶應大学 (all in JMnedict), both the 新字体 and 旧字体 forms appear twice in the 10ten popup.
What these terms have in common is that the 旧字体 forms were originally separate entries (before being deleted and merged with the 新字体 forms earlier this year). Not sure why this would cause a duplication, though.
I was able to reproduce this in both the preview and the names tab.
For names, we combine entries with matching readings and translations, so I guess when we process the 旧字体 variant we match the same entry again and then merge it with itself.
10ten-ja-reader/src/background/name-search.ts
Lines 76 to 78 in c8b464d
We could either just make sure we take the unique set of kanji readings when we merge them:
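A minimal sketch of that first option, assuming a simplified entry shape. The `NameEntry` type, its `k`/`r` fields and the `mergeKanji` helper are hypothetical names for illustration, not the actual code in `name-search.ts`:

```typescript
// Hypothetical, simplified entry shape; the real record type has more fields.
type NameEntry = {
  id: number;
  k: string[]; // kanji forms
  r: string[]; // kana readings
};

// Merge `source` into `target`, keeping each kanji form only once.
// A Set preserves insertion order, so the original ordering is retained.
function mergeKanji(target: NameEntry, source: NameEntry): NameEntry {
  const uniqueKanji = Array.from(new Set(target.k.concat(source.k)));
  return { id: target.id, k: uniqueKanji, r: target.r };
}

// The same entry matched twice (once per variant) no longer duplicates forms.
const horiguchi: NameEntry = {
  id: 1,
  k: ['堀口大学', '堀口大學'],
  r: ['ほりぐちだいがく'],
};
const mergedEntry = mergeKanji(horiguchi, horiguchi);
console.log(mergedEntry.k);
```

This hides the symptom but still does the redundant merge, which may be acceptable if the merge is cheap.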
Or we could track the IDs of the entries we've already matched and skip them.
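The ID-tracking option could look roughly like the following. This is only a sketch: `NameResult`, its `id` field and `collectUnique` are assumed names, and it presumes each entry carries a stable numeric ID.

```typescript
// Hypothetical result shape; assumes every entry has a stable numeric id.
type NameResult = {
  id: number;
  k: string[]; // kanji forms
  r: string[]; // kana readings
};

// Keep only the first occurrence of each entry, identified by id.
function collectUnique(matches: NameResult[]): NameResult[] {
  const seenIds = new Set<number>();
  const unique: NameResult[] = [];
  for (const match of matches) {
    if (seenIds.has(match.id)) {
      continue; // Already matched via another variant; skip it.
    }
    seenIds.add(match.id);
    unique.push(match);
  }
  return unique;
}

// Looking up the 新字体 and 旧字体 variants both return the same merged entry.
const geidai: NameResult = {
  id: 42,
  k: ['東京芸術大学', '東京藝術大学'],
  r: ['とうきょうげいじゅつだいがく'],
};
const uniqueMatches = collectUnique([geidai, geidai]);
console.log(uniqueMatches.length);
```

Compared with de-duplicating the kanji forms at merge time, this avoids processing the same entry twice in the first place.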