carp-lang/Carp

On some systems Carp's repl doesn't print non-ASCII graphemes correctly

IrrenWirr opened this issue · 6 comments

Reproducible every time on my 64-bit Gentoo Linux.

Given a file “file.carp”:

(IO.print "ŋóŧ α∫çᵢᵢ λ 🐂 🎲 : ASCII-COMPATIBLE string")

When I send it to Carp's repl:

> (IO.print "ŋóŧ α∫çᵢᵢ λ 🐂 🎲 : ASCII-COMPATIBLE string")
������ ������������� �� ���� ���� : ASCII-COMPATIBLE string>

But when compiled, it works fine:

$ carp -b file.carp
ŋóŧ α∫çᵢᵢ λ 🐂 🎲 : ASCII-COMPATIBLE string

Another misbehavior that may be related to this issue:
when I use the linux64 binary release, I get this warning:

/PathTo/carp: /lib64/libtinfo.so.6: no version information available (required by /PathTo/carp)
Welcome to Carp 0.5.4

even though /lib64/libtinfo.so.6 exists on my system.
stack-build.log
stack-version.log

I've tested this on macOS (12.0.1) and there it works correctly. So your theory
regarding the libtinfo warning seems plausible. Perhaps some Haskell/Stack expert knows more?

Re: the libtinfo message, the recent issue #1360 is the same. From some cursory research, the problem seems to relate to the way ncurses was built on the end user's system. ncurses can be built with or without version symbol information (see the with-versioned-syms option in https://github.com/mirror/ncurses/blob/master/INSTALL). It seems that some Linux distros ship an ncurses that was not compiled with this option enabled.

For example, there is a known issue on Fedora whereby ncurses is not compiled with versioned symbols: see https://bugzilla.redhat.com/show_bug.cgi?id=1875587 and https://stackoverflow.com/questions/63730439/lib64-libtinfo-so-5-no-version-information-available. The issue might be the same on Gentoo. The fix seems to be to build ncurses with versioned symbols enabled and to add that build of the libraries to your library search path.

The repl issue is strange. When executing IO.print the repl basically does the same thing as calling carp -b file.carp and then executes the resulting binary, so the output should be equivalent. This probably means the issue stems from the communication between the repl and the terminal. Again, this might be an ncurses thing, but I don't know for sure. ncurses basically has two libraries, one that supports "wide" Unicode characters and one that doesn't. This can cause problems if, for example, a program somehow ends up using the library that does not support the extended character sets.

Update:
The warning /PathTo/carp: /lib64/libtinfo.so.6: no version information available (required by /PathTo/carp) from issue #1360 is unrelated to this non-ASCII issue (#1366).

The warning that appears when using carp's binary from releases is indeed fixed by using an ncurses that was compiled with --with-versioned-syms (#1360).

Issue #1366 in carp's repl was triggered by the environment variable LC_ALL, which was set to C.

$ export LC_ALL=en_US.UTF-8
$ carp

does solve the non-ASCII grapheme issue.
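To illustrate why LC_ALL=C could produce exactly that kind of output, here is a small Python sketch (not Carp code). It assumes, without having checked Carp's source, that somewhere on the repl's output path the UTF-8 bytes are treated as ASCII and every byte that does not fit is replaced by one "�". The helper name `decode_as_ascii` is made up for this example.

```python
import unicodedata

# The string from file.carp, normalized to NFC so that accented letters
# are single code points regardless of how this snippet was copied.
TEXT = unicodedata.normalize(
    "NFC", "ŋóŧ α∫çᵢᵢ λ 🐂 🎲 : ASCII-COMPATIBLE string"
)


def decode_as_ascii(s: str) -> str:
    """Encode to UTF-8, then decode as if only ASCII were understood.

    Every byte outside the ASCII range becomes U+FFFD (the replacement
    character). That this mirrors the repl under LC_ALL=C is an
    assumption, not something taken from Carp's source.
    """
    return s.encode("utf-8").decode("ascii", errors="replace")


garbled = decode_as_ascii(TEXT)

# Number of replacement characters in each of the first five "words",
# i.e. the number of UTF-8 bytes each word occupies.
counts = [word.count("\ufffd") for word in garbled.split(" ")[:5]]
print(counts)  # prints [6, 13, 2, 4, 4]
```

Those byte counts seem to line up with the groups of "�" in the repl output above, and the pure-ASCII tail of the string survives unchanged, which is consistent with a byte-level decoding problem rather than a problem in the compiled program.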

Fix ideas:
I think we could tell the user to use an LC_ALL value that supports Unicode, in the same way we tell the user to export CARP_DIR (in the docs, plus a warning in the executable when LC_ALL does not support Unicode).
Or just change LC_ALL to a default Unicode value inside Carp's executable.
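As a rough idea of what the "warning in the executable" option could check, here is a minimal Python sketch. It is not how Carp does or would do it (Carp itself is not written in Python), and the function names `effective_ctype_locale`, `looks_like_utf8` and `unicode_warning` are hypothetical. It relies only on the POSIX rule that LC_ALL overrides LC_CTYPE, which overrides LANG, and on the usual `language_TERRITORY.codeset@modifier` shape of locale names.

```python
from typing import Mapping, Optional


def effective_ctype_locale(env: Mapping[str, str]) -> str:
    """Return the locale name that governs character handling.

    POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Unset or empty variables are skipped; the fallback is "C".
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"


def looks_like_utf8(locale_name: str) -> bool:
    """Heuristic: does the codeset part of the locale name say UTF-8?

    Names without an explicit codeset (such as "C" or "en_US") are
    treated as non-UTF-8, which is deliberately conservative.
    """
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"


def unicode_warning(env: Mapping[str, str]) -> Optional[str]:
    """Return a warning message, or None if the locale looks fine."""
    name = effective_ctype_locale(env)
    if looks_like_utf8(name):
        return None
    return (
        f"locale '{name}' does not look UTF-8 capable; non-ASCII output "
        "may be garbled (try: export LC_ALL=en_US.UTF-8)"
    )
```

A real program would pass its actual environment (in Python, `unicode_warning(os.environ)`). The check only inspects names, so it cannot tell whether the named locale is actually installed on the system.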

@IrrenWirr Great job figuring it all out! I think either of the proposed fixes is fine, but perhaps just documenting it is best (?)

Do you have time to fix this yourself in a PR? Otherwise I'll do it but I'm working on another feature right now so I don't know when I'll get to it.

@eriksvedang I have more than enough spare time to propose a PR. Always glad to help 😊.
But I don't have the knowledge to change LC_ALL to a default Unicode value inside Carp's executable.
So my PR will be limited to the documentation. I suppose it's good enough ¯\(ツ)/¯ and it can always be thrown away if we choose to fix it without relying on the user's environment variables.
The PR should arrive in less than 2 days 🐌.

Fixed by #1367.