CTPUG/wafer

Talks with non-Latin titles generate empty slugs

Opened this issue · 6 comments

When I create a talk with a non-Latin title, for example Тестовый доклад, the generated slug is an empty string, which prevents the user submitting the talk from saving it:

NoReverseMatch at /talks/new/

Reverse for 'wafer_talk' with keyword arguments '{'pk': 2, 'slug': ''}' not found. 1 pattern(s) tried: ['talks/(?P<pk>\\d+)(?:-(?P<slug>[\\w-]+))?/$']

You may want to use e.g. the transliterate module to transliterate the title when generating the slug.

Alternatively the pattern can be modified to allow an empty slug.
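To illustrate that second option, here is a minimal sketch using only Python's `re` module. `CURRENT` is the pattern copied from the error message above; `RELAXED` is a hypothetical variant (not taken from wafer) in which the slug group uses `*` instead of `+`. It only shows how the regex itself behaves, and assumes Django's URL reversing would follow the same matching rules.

```python
import re

# Pattern copied from the NoReverseMatch message above.
CURRENT = re.compile(r"talks/(?P<pk>\d+)(?:-(?P<slug>[\w-]+))?/$")

# Hypothetical relaxed variant: "+" becomes "*", so the slug may be empty.
RELAXED = re.compile(r"talks/(?P<pk>\d+)(?:-(?P<slug>[\w-]*))?/$")

# The path that results from pk=2 and an empty slug.
empty_slug_path = "talks/2-/"

rejected = CURRENT.match(empty_slug_path)   # no match: "-" must be followed by a slug
accepted = RELAXED.match(empty_slug_path)   # matches, with slug == ""

# Ordinary slugs behave the same under both patterns.
normal = RELAXED.match("talks/2-my-talk/")
```

Note that the relaxed pattern leaves a dangling hyphen in the URL (`talks/2-/`), so it avoids the crash but does not make the URL any more meaningful.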

drnlm commented

Crashing on an empty slug is something we should fix.

We also should do something better with unicode talk titles. I'm not sure what the best approach is though.

People can add arbitrary unicode characters in the talk title anyway, so I don't think transliterate is a good solution here, given its design goals.

We could use the allow_unicode option of django.utils.text.slugify and create URLs with unicode characters. That should work, although the percent-encoded URL isn't going to be particularly readable.
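A rough sketch of what that trade-off looks like. `approx_slugify` is a hypothetical stdlib-only approximation of a slugify function with an `allow_unicode` switch, written here so the example runs without Django; it is not Django's implementation and may differ from it in edge cases.

```python
import re
import unicodedata
from urllib.parse import quote

def approx_slugify(value, allow_unicode=False):
    """Stdlib approximation of a slugify with an allow_unicode switch."""
    if allow_unicode:
        value = unicodedata.normalize("NFKC", value)
    else:
        # Decompose, then drop every character with no ASCII representation.
        value = (
            unicodedata.normalize("NFKD", value)
            .encode("ascii", "ignore")
            .decode("ascii")
        )
    # Remove anything that is not a word character, whitespace or hyphen.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")

title = "Тестовый доклад"
ascii_slug = approx_slugify(title)                        # nothing survives
unicode_slug = approx_slugify(title, allow_unicode=True)  # Cyrillic letters kept
encoded = quote(unicode_slug)                             # percent-encoded form
```

In this sketch the ASCII-only slug is empty, the unicode slug is readable, and the percent-encoded form spends six characters on every Cyrillic letter, which is the readability cost mentioned above.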

Something more like text-unidecode is an alternative, which will produce more ASCII-friendly URLs, although it will also produce character strings that aren't particularly meaningful for many languages.

Why do you think transliteration isn’t a good solution? A lot of CMS do exactly that.

drnlm commented

I mean that the "transliterate" module (https://pypi.org/project/transliterate/) isn't a good solution: it supports a very limited set of languages, requires knowing the language of the title ahead of time to get good results, and doesn't handle arbitrary non-language unicode characters sensibly.

Nothing prevents someone from creating a talk titled "♠♡♢", for example, and we need to handle that with whatever solution we come up with.
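One possible way to handle that case, whatever slug scheme is chosen, is to leave the slug out of the URL when it comes back empty, since the slug group in the URL pattern is already optional. A minimal sketch follows; `talk_path` is a hypothetical helper for illustration, not a function from wafer, and the pattern is the one quoted in the error message.

```python
import re

# Pattern copied from the NoReverseMatch message in the issue description.
TALK_PATTERN = re.compile(r"talks/(?P<pk>\d+)(?:-(?P<slug>[\w-]+))?/$")

def talk_path(pk, slug):
    """Build the talk path, omitting the slug part when the slug is empty."""
    if slug:
        return f"talks/{pk}-{slug}/"
    return f"talks/{pk}/"

# A title such as "♠♡♢" is assumed to have produced an empty slug.
fallback = talk_path(2, "")
regular = talk_path(2, "my-talk")

# Both forms are accepted by the existing pattern.
fallback_ok = TALK_PATTERN.match(fallback) is not None
regular_ok = TALK_PATTERN.match(regular) is not None
```

With this approach the existing URL pattern would not need to change; only the code that builds the URL has to cope with an empty slug.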

Talks now have a language attribute. We could transliterate from that language.
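A toy sketch of that idea, to show the shape of a per-language lookup and where it still falls short. Everything here is hypothetical: `TOY_TABLES` is a deliberately incomplete hand-written table covering only the letters of the example title, the Latin equivalents are one arbitrary choice among several romanization conventions, and `slug_for_language` is not part of wafer or of any library.

```python
import re

# Toy, incomplete table keyed by language code. A real solution would need a
# full table per supported language, or a library that provides one.
TOY_TABLES = {
    "ru": {
        "а": "a", "в": "v", "д": "d", "е": "e", "й": "j", "к": "k",
        "л": "l", "о": "o", "с": "s", "т": "t", "ы": "y",
    },
}

def slug_for_language(title, language):
    """Transliterate using the table for `language`, then build an ASCII slug."""
    table = TOY_TABLES.get(language, {})
    # Characters missing from the table are passed through unchanged.
    text = "".join(table.get(ch, ch) for ch in title.lower())
    # Drop whatever is still not ASCII alphanumeric, whitespace or hyphen.
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    return re.sub(r"[-\s]+", "-", text).strip("-")

russian = slug_for_language("Тестовый доклад", "ru")
symbols = slug_for_language("♠♡♢", "ru")            # still empty
unknown = slug_for_language("Тестовый доклад", "xx")  # no table for this language
```

Even with the language known, symbol-only titles and languages without a table still end up with an empty slug in this sketch, so the empty-slug case would need handling either way.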