vhf/confusable_homoglyphs

python2: is_confusable cannot handle unicode preferred_aliases

muusik opened this issue · 2 comments

(Thanks for creating such a useful tool as confusable_homoglyphs.)
Since categories.alias() returns unicode when running under Python 2, it would make sense for the preferred_aliases argument of confusables.is_confusable() to accept a list of unicode values as well. Instead, it requires a list of str and raises a TypeError otherwise. An example follows:

>>> from confusable_homoglyphs import confusables
>>> confusables.is_confusable('', preferred_aliases=[u'LATIN'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/confusable_homoglyphs/confusables.py", line 92, in is_confusable
    preferred_aliases = list(map(str.upper, preferred_aliases))
TypeError: descriptor 'upper' requires a 'str' object but received a 'unicode'
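
The traceback points at map(str.upper, ...): under Python 2, the unbound str.upper descriptor rejects unicode objects. One possible way around this is to call each alias's own upper() method. The sketch below only illustrates that idea and is not necessarily the fix that was merged; normalize_aliases is a hypothetical helper name, not part of confusable_homoglyphs.

```python
def normalize_aliases(preferred_aliases):
    """Uppercase each alias by calling its own .upper() method.

    Hypothetical helper for illustration only; it is not part of
    confusable_homoglyphs. Because the method is looked up on each
    element, it works for both str and unicode under Python 2
    (and for str under Python 3).
    """
    return [alias.upper() for alias in preferred_aliases]


# Mixed input: a unicode literal and a plain string literal.
mixed = [u'latin', 'Cyrillic']
print(normalize_aliases(mixed))
```

Looking the method up on each element avoids tying the call to one concrete string type, which is what map(str.upper, ...) does.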

vhf commented

Nice, thanks for the issue and the fix!

I'm glad this lib is useful to you. Care to share what your use case for it is?

Sure, our use case is closing a potential loophole in a plagiarism detection system. Thanks for pushing the update in such a timely fashion.