Failed when formatting UTF-8 characters
brunoprietog opened this issue · 5 comments
Hi,
When I try to format a file that contains UTF-8 characters such as á, I get this error:
failed with exit code: 1. '/home/bruno/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/htmlbeautifier-1.4.2/bin/htmlbeautifier:12:in `rescue in beautify': Error parsing standard input: invalid byte sequence in US-ASCII on line 1 (RuntimeError)
from /home/bruno/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/htmlbeautifier-1.4.2/bin/htmlbeautifier:9:in `beautify'
from /home/bruno/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/htmlbeautifier-1.4.2/bin/htmlbeautifier:111:in `<top (required)>'
from /home/bruno/.rbenv/versions/3.1.2/bin/htmlbeautifier:25:in `load'
from /home/bruno/.rbenv/versions/3.1.2/bin/htmlbeautifier:25:in `<main>'
/home/bruno/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/htmlbeautifier-1.4.2/lib/htmlbeautifier/parser.rb:37:in `rescue in dispatch': invalid byte sequence in US-ASCII on line 1 (RuntimeError)
from /home/bruno/.rbenv/versions/3.1.2/lib/ruby/gems/3.1.0/gems/htmlbeautifier-1.4.2/lib/htmlbeautifier/parser.rb:31:in `dispatch'
from /home/bruno/.rbenv/versio...
This happens because the extension sets LC_ALL=en_US.UTF-8 when it runs the htmlbeautifier command.
If I run
htmlbeautifier file.html.erb file.html.erb
it works fine, but if I run
LC_ALL=en_US.UTF-8 htmlbeautifier file.html.erb file.html.erb
it fails with the same error and an additional warning:
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
Thanks
This can be worked around temporarily by running sudo locale-gen en_US.UTF-8. Still, that should not be necessary, and changing locales manually via environment variables is not recommended.
@brunoprietog could you please also provide the file that has this issue so that I can reproduce it locally?
Any file that contains a character such as á, for example:
<p>
Cómo estás?
</p>
You will only be able to reproduce the problem if you don't have the en_US.utf8 locale installed, which was the case for me:
bruno@DellBruno:~$ locale -a
C
C.utf8
POSIX
So when the extension tried to switch to en_US.utf8 through the environment variable, it could not, because that locale was not installed.
After running sudo locale-gen en_US.UTF-8, the output of locale -a is now:
bruno@DellBruno:~$ locale -a
C
C.utf8
POSIX
en_US.utf8
And everything works fine.
Maybe one could check if the en_US.utf8 locale is available before using it?
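A rough sketch of what such a check could look like in shell, in case it helps. It only assumes that locale -a is present, as on my machine. The function names (normalize_locale, locale_in_list, first_available_locale) are made up for this example and are not part of the extension or of htmlbeautifier. The normalization step is there because locale -a prints en_US.utf8 while the environment variable uses en_US.UTF-8.

```shell
# Hypothetical helpers, not part of the extension or of htmlbeautifier.

# Normalize the codeset part of a locale name so that "en_US.UTF-8"
# and "en_US.utf8" compare equal (locale -a prints the latter form).
normalize_locale() {
  name="$1"
  base="${name%%.*}"
  if [ "$base" = "$name" ]; then
    # No codeset part, e.g. "C" or "POSIX": leave the name unchanged.
    printf '%s\n' "$name"
    return 0
  fi
  codeset="${name#*.}"
  # Lowercase the codeset and drop hyphens: "UTF-8" becomes "utf8".
  codeset="$(printf '%s' "$codeset" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')"
  printf '%s.%s\n' "$base" "$codeset"
}

# Succeed if the wanted locale appears in the list read from stdin
# (one name per line, in the format printed by `locale -a`).
locale_in_list() {
  wanted="$(normalize_locale "$1")"
  while IFS= read -r candidate; do
    if [ "$(normalize_locale "$candidate")" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# Print the first locale among the arguments that is installed on this
# machine; return non-zero if none of them is.
first_available_locale() {
  installed="$(locale -a 2>/dev/null)"
  for choice in "$@"; do
    if printf '%s\n' "$installed" | locale_in_list "$choice"; then
      printf '%s\n' "$choice"
      return 0
    fi
  done
  return 1
}
```

With something like this, the caller could run first_available_locale en_US.UTF-8 C.UTF-8 and only set LC_ALL when a name is printed, leaving the environment untouched otherwise. Falling back to C.UTF-8 is my own assumption here; on my machine C.utf8 was installed even before en_US.utf8 was generated.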
Hi @brunoprietog, we released a new version that removes this locale environment variable. Instead, I added a new config property for setting custom environment variables (if necessary).
See: https://github.com/aliariff/vscode-erb-beautify/releases/tag/v0.4.0