binder-project/binder

Unicode not supported in filenames

jankoslavic opened this issue · 1 comment

It looks like Unicode characters are not supported in filenames.

To reproduce the issue, open this binder:
http://mybinder.org/repo/jankoslavic/pypinm

It is built from this repository:
https://github.com/jankoslavic/pypinm

Files whose names contain non-ASCII characters are missing in the binder.

minrk commented

Jupyter will list, open, and otherwise handle files with non-ASCII names as long as the locale is set correctly. It looks like the binder image doesn't set a locale, so you get:

main@frontend-server:~/notebooks$ locale
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

which means non-ASCII characters won't work (in notebooks or the terminal).
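For anyone who wants to check this from inside a notebook or a Python prompt, here is a minimal sketch. It is not part of Jupyter or binder: the helper names and the sample filename are made up for illustration. It reports the encodings Python picked up from the environment and whether a given name can be represented in the filesystem encoding. Note that newer Python versions may report UTF-8 even under a POSIX locale, so treat the result as a hint rather than proof.

```python
import locale
import sys


def can_encode_filename(name: str, encoding: str) -> bool:
    """Return True if `name` can be represented in `encoding`."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def report(name: str) -> str:
    """Summarise the encodings Python derived from the environment."""
    fs_encoding = sys.getfilesystemencoding()
    # False = do not call setlocale(); just report what is configured.
    preferred = locale.getpreferredencoding(False)
    ok = can_encode_filename(name, fs_encoding)
    return f"filesystem={fs_encoding} preferred={preferred} encodable={ok}"


# Hypothetical filename containing a non-ASCII character.
print(report("predavanje-č.ipynb"))
```

If the reported filesystem encoding is an ASCII variant and `encodable` is False, the process is running with a locale that cannot represent the name.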

The Jupyter docker-stacks deal with this by calling locale-gen.

I opened a PR on the binder images to do the same.
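Separately from the image fix, a process can be handed a UTF-8 locale through its environment, provided that locale has already been generated on the system (which is what locale-gen is for). A rough sketch, not taken from the PR: the function name is hypothetical, and `en_US.UTF-8` is an assumed locale name that may not exist on a given image.

```python
import os


def with_utf8_locale(env=None, locale_name="en_US.UTF-8"):
    """Return a copy of `env` (default: os.environ) with LANG and LC_ALL set.

    `locale_name` is an assumption; it must match a locale that is
    actually installed, otherwise programs fall back to POSIX.
    """
    base = dict(os.environ if env is None else env)
    base["LANG"] = locale_name
    base["LC_ALL"] = locale_name  # LC_ALL overrides every LC_* category
    return base


# The caller's own environment is left untouched; only the copy changes.
example = with_utf8_locale({"LANG": "", "PATH": "/usr/bin"})
print(example["LANG"], example["LC_ALL"])
```

The returned dict can be passed as `env=` to `subprocess.run` to start a child process (for example the `locale` command) under the chosen locale.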