Unexpected additional characters in Unicode output with GHC 9.0.1 / nightly
Closed this issue · 6 comments
I'm seeing some strange Unicode behavior in my Haskell package when it builds under GHC 9.0.1. I understand that solving this may involve checking for changes in other Haskell packages, but my question here is whether the unexpected output I'm seeing rings any Unicode bells (Haskell or otherwise), so that I can begin to track down the reasons for the unexpected output. Perhaps there's a known issue with a dependency that affects Unicode output? Something with the testing package dependencies?
Where I expect to see β (or \946) and γ (or \947), respectively, I instead see β?KQHTLXOCBJSPDZRAMEWNIUYGV and γ?EYJVCNIXWPBQMDRTAKZGFUHOS.
This output also has some frustrating properties that make it hard to sort out what's going on:
- The garbage letters following the Greek character, though always the same on my local machine, are not the same as those I see on builds on other platforms (e.g. on Travis CI Focal I get β?SOVPZJAYQUIRHXLNFTGKDCMB).
- What I see and what I get when I paste what I see are different. Typically the leading and trailing garbage characters are truncated, so I assume the ? is actually some special character.
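One way to find out what the ? really is would be to print the code points of the suspicious string instead of relying on the terminal or the clipboard. The following is a minimal sketch: codePoints is a hypothetical helper, and the suspect value is a made-up stand-in (the NUL in it is an assumption chosen for illustration, not something observed in the actual output).

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- | Render each character of a string as a U+xxxx code point label.
-- (Hypothetical helper, not part of crypto-enigma.)
codePoints :: String -> [String]
codePoints = map fmt
  where
    fmt c = "U+" ++ pad (showHex (ord c) "")
    pad s = replicate (4 - length s) '0' ++ s

main :: IO ()
main = do
  -- Made-up stand-in for a corrupted name; the \0 is an assumption.
  -- In practice, pass an element of the list that shows the garbage.
  let suspect = "\946\0KQ"
  print (length suspect)
  mapM_ putStrLn (codePoints suspect)
```

Applied to each affected name, this shows both the true length of the string and the exact character hiding behind the ?.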
Critically, none of this was happening with pre-GHC 9 nightly resolvers.
Do the unexpected patterns of characters following the Greek characters correspond to anything that would help track down the source of my error? Is there something about how GHC 9 or the packages in the latest nightly Stackage resolvers are handling Unicode that could be causing this?
To replicate:
stack update
stack unpack crypto-enigma-0.1.1.6
cd crypto-enigma-0.1.1.6
rm -f stack.yaml && stack init --resolver nightly
stack build --resolver nightly --haddock --test --bench --no-run-benchmarks
If you look at the snapshot diff, did any of your dependencies get upgraded and look suspicious? https://www.stackage.org/diff/nightly-2021-06-14/nightly-2021-06-20 => a dependency may have changed
Can you reproduce this on GHC 8.10 with the same dependencies as in nightly-2021-06-20? => might be a change in GHC
Can you reproduce it with cabal-install? => might be an issue with stack
@bergmark It doesn't look like the dependencies have changed, and it seems to work on GHC 8.10 with the same dependencies as nightly (I think; still working on it), but it looks like something really weird is going on with the use of a Unicode character as a key.
type Name = String
type Wiring = Mapping
type Turnovers = String
data Component = Component {
        name :: !Name,          -- ^ The component's 'Name'.
        wiring :: !Wiring,      -- ^ The component's 'Wiring'.
        turnovers :: !Turnovers -- ^ The component's 'Turnovers'.
        }
-- Definitions of rotor Components; people died for this information
rots_ :: M.Map Name Component
rots_ = M.fromList $ (name &&& id) <$> [
        -- rotors
        Component "I"    "EKMFLGDQVZNTOWYHXUSPAIBRCJ" "Q",
        Component "II"   "AJDKSIRUXBLHWTMCQGZNPYFVOE" "E",
        Component "III"  "BDFHJLCPRTXVZNYEIWGAKMUSQO" "V",
        Component "IV"   "ESOVPZJAYQUIRHXLNFTGKDCMWB" "J",
        Component "V"    "VZBRGITYUPSDNHLXAWMJQOFECK" "Z",
        Component "VI"   "JPGVOUMFYQBENHZRDKASXLICTW" "ZM",
        Component "VII"  "NZJHGRCXMYSWBOUFAIVLPEKQDT" "ZM",
        Component "VIII" "FKQHTLXOCBJSPDZRAMEWNIUYGV" "ZM",
        Component "β"    "LEYJVCNIXWPBQMDRTAKZGFUHOS" "",
        Component "γ"    "FSOKANUERHMBTIYCWLQPZXVGJD" ""]
and
rotors :: [Name]
rotors = M.keys rots_
and somehow, only since GHC 9, when the name for a Component is a Greek character, keys, rather than returning just the Greek character, also picks up other text. What that text is varies by context. On my local machine, it is always the wiring for the previous Component in rots_ (which is more than weird enough!), but on Travis CI β appends the wiring for IV and γ appends just an X.
If I had to guess, this suggests that there is something going on with respect to how Unicode is actually stored by the compiler that's causing M.keys applied to a Component to pick up something nearby that shouldn't actually be part of keys (or name).
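To narrow this down, the pattern could be cut to a standalone program that checks the length of every key. This is only a sketch under some assumptions: Wiring and Turnovers are simplified to plain String, keyLengths is a hypothetical helper, and nothing in this thread confirms that so small a program triggers the behavior under GHC 9.0.1.

```haskell
import Control.Arrow ((&&&))
import qualified Data.Map as M

type Name = String

-- Simplified stand-in for the package's Component (wiring assumed to be a String).
data Component = Component
  { name      :: !Name
  , wiring    :: !String
  , turnovers :: !String
  }

-- A cut-down version of the rotor table with one ASCII and two Greek names.
rots :: M.Map Name Component
rots = M.fromList $ (name &&& id) <$>
  [ Component "VIII" "FKQHTLXOCBJSPDZRAMEWNIUYGV" "ZM"
  , Component "β"    "LEYJVCNIXWPBQMDRTAKZGFUHOS" ""
  , Component "γ"    "FSOKANUERHMBTIYCWLQPZXVGJD" ""
  ]

-- | Pair every key with its length (hypothetical helper).
keyLengths :: M.Map Name a -> [(Name, Int)]
keyLengths = map (id &&& length) . M.keys

main :: IO ()
main = mapM_ print (keyLengths rots)
```

Each Greek name should be reported with length 1. A larger number on one compiler version but not another would point at the compiler rather than at the package or its dependencies.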
This one really has me stumped and is way above my Haskell skill level. Any help is much appreciated.
Always a good sign when the issue name contains "Mysterious"!
Yeah. That's certainly how it's felt here!
Looks like this will be fixed in GHC 9.0.2; we'll upgrade nightly when it arrives.