Umlauts within phrase are causing odd intent matches
Corasonn opened this issue · 1 comments
Some of my entity values contain umlauts. When I try to recognize them with a specific intent, snips matches the phrase to some other intent that also contains this entity, even though the right intent would fit 100%. With any value that has no umlaut, snips matches the right intent with a score of 1.0.
Expected:
Intents with entities with umlauts are matched correctly.
Environment:
- OS: OSX 10.15.5
- python version: 2.7
- snips-nlu version: 0.20.1
I found the problem. When there are more than 10000 entity values, snips skips building some entity variations to improve build performance.
PR was: #804
Unfortunately, this seems to break umlauts when the "case" variation is missing. I forked the project and hardcoded the change (https://github.com/Corasonn/snips-nlu).
I'm not a Python developer, so if someone knows how to set this via a flag, that would be great!
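To illustrate what I mean by a flag, here is a rough sketch of the idea. It is not snips-nlu code: `case_variations`, `build_variations`, `force_case` and `MAX_VALUES_FOR_FULL_VARIATIONS` are all hypothetical names I made up, and only the 10000 threshold comes from what I observed above. It only shows case variations being skipped for large entities unless a flag forces them; it does not reproduce the umlaut-specific behavior, whose exact cause I have not tracked down.

```python
# -*- coding: utf-8 -*-
# Hypothetical sketch; none of these names come from snips-nlu.

MAX_VALUES_FOR_FULL_VARIATIONS = 10000  # threshold reported in this issue


def case_variations(value):
    """Return the value together with its lower-cased and title-cased forms."""
    return {value, value.lower(), value.title()}


def build_variations(values, force_case=False):
    """Map every generated variation to its canonical entity value.

    Case variations are skipped for large entities unless force_case is set.
    """
    full = force_case or len(values) <= MAX_VALUES_FOR_FULL_VARIATIONS
    lookup = {}
    for value in values:
        variations = case_variations(value) if full else {value}
        for variation in variations:
            # Keep the first canonical value seen for a given variation.
            lookup.setdefault(variation, value)
    return lookup
```

With a small entity, `build_variations([u"Müller"])` contains the key `u"müller"`. With more than 10000 values that key is missing unless `force_case=True` is passed, which is the kind of opt-in switch I am hoping for.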