fedwiki/wiki

Punycode subdomains

almereyda opened this issue · 5 comments

If we use internationalized subdomains, as for the test wiki of our friend François,

who is supposed to be the source of the term degrowth,
they are parsed to Punycode and only then passed to wiki, leading to the ugly display of

within the address bar of our companion's browser.

Should we use some sort of JavaScript URL rewriting, so that users at least get the impression that they can use non-ASCII characters in the address of a wiki they are about to create?

Just to stress the importance of internationalization efforts: reading those domains can get really ugly.

http://réponse.à.françois.tries.fed.wiki/

will turn into

http://xn--rponse-bva.xn--0ca.xn--franois-xxa.tries.fed.wiki/

making me wonder if I'm not actually looking at pr0n. weird
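For anyone who wants to reproduce the conversion above outside a browser, here is a minimal sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The helper names `to_punycode` and `to_unicode` are made up for illustration and are not part of wiki.

```python
def to_punycode(hostname: str) -> str:
    """Encode each label of a hostname to its ASCII (xn--) form."""
    # The built-in "idna" codec works label by label and returns bytes.
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Decode an ASCII (xn--) hostname back to its Unicode form."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_name = to_punycode("réponse.à.françois.tries.fed.wiki")
    print(ascii_name)
    # xn--rponse-bva.xn--0ca.xn--franois-xxa.tries.fed.wiki
    print(to_unicode(ascii_name))
    # réponse.à.françois.tries.fed.wiki
```

The round trip shows that the server only ever sees the ASCII form; how the name is displayed is decided separately, by the browser.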

I suggest that you look at "IDN in Google Chrome"; it also covers the other browsers.

This is the browser protecting you from a homograph attack: it displays the address in Punycode when the address contains characters that are not used in any of the languages your browser is configured for.

For example, take http://françois.wiki.allmende.io/ in Chrome: if my browser does not have French in its language list, or any other language that uses ç, the address will be displayed in Punycode. Add French to the language list, and it is displayed as an IDN.

Thank you for bringing this to our attention and researching the details. I learn something every day.

The wiki code protects itself with a number of Unicode-intolerant regular expressions. Relaxing these conservative decisions will require care and invention on our part.
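To make the trade-off concrete, here is a rough sketch of what an ASCII-only check and a relaxed, Unicode-aware check for a single subdomain label might look like. Both patterns and both function names are hypothetical; they are not the regular expressions wiki actually uses.

```python
import re

# Hypothetical conservative pattern: ASCII letters, digits and inner hyphens only.
ASCII_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")

# Hypothetical relaxed pattern: any Unicode letter or digit, plus inner hyphens.
# [^\W_] means "a word character that is not an underscore".
UNICODE_LABEL = re.compile(r"[^\W_](?:(?:[^\W_]|-)*[^\W_])?")


def is_ascii_label(label: str) -> bool:
    """True if the whole label matches the conservative ASCII pattern."""
    return ASCII_LABEL.fullmatch(label) is not None


def is_unicode_label(label: str) -> bool:
    """True if the whole label matches the relaxed Unicode pattern."""
    return UNICODE_LABEL.fullmatch(label) is not None


if __name__ == "__main__":
    for label in ["xn--franois-xxa", "françois", "-bad", "a_b"]:
        print(label, is_ascii_label(label), is_unicode_label(label))
```

Note that the relaxed pattern also admits look-alike characters from other scripts, which is exactly the homograph risk browsers guard against, so a change like this would likely need additional checks.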

@almereyda Are you aware of a way to get the browser to behave differently, or is this just browser behavior? My guess is that this behavior applies to all websites, and to anyone who visits an IDN website in a script that is not in their language list. I think @paul90 may be right, and we may be able to just close this one.

You are probably right. Thanks for digging so deep into this.