orome/crypto-enigma-hs

Unexpected additional characters in Unicode output with GHC 9.0.1 / nightly

Opened this issue · 3 comments

orome commented

Strange output where Greek characters are expected:

Where I expect to see, respectively

  • β (or \946) and
  • γ (or \947)

I instead see

  • β?KQHTLXOCBJSPDZRAMEWNIUYGV and
  • γ?EYJVCNIXWPBQMDRTAKZGFUHOS
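
The two notations in the lists above are the same character seen two ways: printing the string directly renders β, while show (and therefore print, or GHCi's echo of a String value) escapes non-ASCII characters as decimal code points. A small sketch of that relationship, where shownBeta is a purely illustrative name and not part of the library:

```haskell
import Data.Char (ord)

-- 'show' escapes non-ASCII characters as decimal code points,
-- which is why a correct name can appear as \946 instead of β.
shownBeta :: String
shownBeta = show "β"

main :: IO ()
main = do
  putStrLn shownBeta     -- the escaped form, including the quotes
  print (map ord "βγ")   -- the code points of the two rotor names
```

Either form is fine; what matters for this issue is that nothing follows the single Greek character.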

See

orome commented

It looks like something really weird is going on with the use of a Unicode character as a key.

I have

type Name = String
type Wiring = Mapping
type Turnovers = String

data Component = Component {
        name :: !Name,              -- ^ The component's 'Name'.
        wiring :: !Wiring,          -- ^ The component's 'Wiring'.
        turnovers :: !Turnovers     -- ^ The component's 'Turnovers'.
}

-- Definitions of rotor Components; people died for this information
rots_ :: M.Map Name Component
rots_ = M.fromList $ (name &&& id) <$> [
        -- rotors
        Component "I"    "EKMFLGDQVZNTOWYHXUSPAIBRCJ" "Q",
        Component "II"   "AJDKSIRUXBLHWTMCQGZNPYFVOE" "E",
        Component "III"  "BDFHJLCPRTXVZNYEIWGAKMUSQO" "V",
        Component "IV"   "ESOVPZJAYQUIRHXLNFTGKDCMWB" "J",
        Component "V"    "VZBRGITYUPSDNHLXAWMJQOFECK" "Z",
        Component "VI"   "JPGVOUMFYQBENHZRDKASXLICTW" "ZM",
        Component "VII"  "NZJHGRCXMYSWBOUFAIVLPEKQDT" "ZM",
        Component "VIII" "FKQHTLXOCBJSPDZRAMEWNIUYGV" "ZM",
        Component "β"    "LEYJVCNIXWPBQMDRTAKZGFUHOS" "",
        Component "γ"    "FSOKANUERHMBTIYCWLQPZXVGJD" ""]

and

rotors :: [Name]
rotors = M.keys rots_

and somehow, only since GHC 9, when the name of a Component is a Greek character, M.keys does not return just the Greek character but also picks up other text. What that text is varies by context. On my local machine, it is always the wiring of the previous Component in rots_ (which is more than weird enough!), but on Travis CI β appends the wiring for IV and γ appends just an X.

If I had to guess, this suggests that something in how the compiler actually stores Unicode is causing M.keys, applied to the map of Components, to pick up something nearby that shouldn't be part of the keys (or of name).
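
One way to check this guess is to inspect the keys as code points rather than as rendered text. The following is only a minimal sketch, not the library's code: it assumes the wiring can be treated as a plain String, uses a reduced three-entry table, and introduces the hypothetical names rots and keyCodePoints.

```haskell
import Control.Arrow ((&&&))
import Data.Char (ord)
import qualified Data.Map as M

type Name = String
type Wiring = String  -- assumption: the real Mapping type is treated as String here

data Component = Component
  { name      :: !Name
  , wiring    :: !Wiring
  , turnovers :: !String
  }

-- Reduced table (hypothetical): one Latin-named rotor and the two Greek-named ones.
rots :: M.Map Name Component
rots = M.fromList $ (name &&& id) <$>
  [ Component "VIII" "FKQHTLXOCBJSPDZRAMEWNIUYGV" "ZM"
  , Component "β"    "LEYJVCNIXWPBQMDRTAKZGFUHOS" ""
  , Component "γ"    "FSOKANUERHMBTIYCWLQPZXVGJD" ""
  ]

-- Length and code points of every key, in the map's key order.
keyCodePoints :: [(Int, [Int])]
keyCodePoints = [ (length k, map ord k) | k <- M.keys rots ]

main :: IO ()
main = mapM_ print keyCodePoints
```

On an unaffected compiler each Greek key should have length 1 and a single code point (946 for β, 947 for γ). A longer list would show exactly which extra characters have leaked into the key.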

This one really has me stumped and is way above my Haskell skill level. Any help is much appreciated.

orome commented

A fix for the suspected GHC issue has now been merged into 9.0.2. Hopefully this will fix it!