cdepillabout/pretty-simple

pPrint prints unexpected characters for unprintable string contents

expipiplus1 opened this issue · 3 comments

For example

pPrint ("\DC1\205N\237\232s\225\232N\147K\173\RSE\201\EM" :: ByteString)
"ÍNíèsáèN�K­EÉ"

I would expect pPrint (x :: ByteString) to print x back as written.

@expipiplus1 I think that some escaped characters in a string get replaced with the literal character.

This is good for things like \n, where you want to see the literal newline, but less helpful for data like this, where you ideally want the escape sequences.

When I tried this in the terminal with the latest version of pretty-simple (3.2.2.0), I got the following:

> import Text.Pretty.Simple
> import Data.ByteString
> :set -XOverloadedStrings
> pPrint ("\DC1\205N\237\232s\225\232N\147K\173\RSE\201\EM" :: ByteString)
"\x11ÍNíèsáèN\x93K\xad\x1eEÉ\x19"

This looks roughly correct to me. It appears to print all printable characters as-is and to use escape sequences for non-printable characters.
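The behaviour described here can be approximated with a few lines of base-only Haskell. This is a minimal sketch, not pretty-simple's actual implementation: the name escapeNonPrintable and the choice of Data.Char.isPrint as the printability test are assumptions made for illustration, and String stands in for ByteString.

```haskell
import Data.Char (isPrint, ord)
import Numeric (showHex)

-- Illustrative only: a guess at the general approach, not the library's code.
-- Printable characters pass through; everything else becomes a \x hex escape.
escapeNonPrintable :: String -> String
escapeNonPrintable = concatMap escape
  where
    escape c
      | isPrint c = [c]
      | otherwise = "\\x" ++ showHex (ord c) ""

main :: IO ()
main =
  -- ASCII-only sample so the output does not depend on the terminal encoding.
  -- Prints: \x11N\x93K\x1eE\x19
  putStrLn (escapeNonPrintable "\DC1N\147K\RSE\EM")
```

One caveat with this style of output: a sequence such as \x1eE is not safe to paste back into Haskell source, because a hex escape in a Haskell string literal consumes every following hex digit, so it would be read as the single character 0x1EE rather than \RS followed by E.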


I'm going to close this, since pretty-simple appears to be working correctly here. If you think it is not, or that there is a better way for it to work, I could be convinced. Feel free to comment below or send a PoC PR.

3.2.2.0 does look better than what I'm observing with 2.2.0.1 (didn't realise I was so far behind!)

My issue is that in my case the ByteString is just some binary blob data and \n means nothing special in it. (Ideally I'd show this as some base64-encoded string, but then the Read instance needs changing too, and it is just a little surprising).

More generally though, the current behaviour disallows a valid workflow of

  • pPrint some data
  • copy it into a Haskell program
  • compile the program

Currently the programmer has to manually undo any newlines after copying.
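The property this workflow relies on is that the printed text is itself a valid literal for the value. A minimal sketch of that round trip, using plain show and read from base; String stands in for ByteString here (an assumption to keep the example dependency-free), and roundTrips is a hypothetical helper name:

```haskell
-- String from base stands in for ByteString (assumption for this sketch).
-- show renders an escaped literal; read parses that literal back.
roundTrips :: String -> Bool
roundTrips s = read (show s) == s

main :: IO ()
main = do
  let blob    = "\DC1\205N\n\EM"
      literal = show blob
  putStrLn literal              -- the text one would copy into a program
  print ('\n' `elem` literal)   -- False: the newline stays escaped as \n
  print (roundTrips blob)       -- True: the literal denotes the same value
```

A pretty-printer that replaces the escaped \n with a real line break gives up this property, which is why the pasted text has to be repaired by hand before it compiles.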

Love the library in general BTW :)

@expipiplus1 Ah, I think I can understand what you're asking here.

Unfortunately, pretty-simple doesn't currently have any options for what to do with unprintable characters and control characters.

I'm not sure exactly how this should be handled, but if you had some good ideas for how to handle this, please feel free to open a new issue.

If you're interested, maybe you could add an option about whether or not to show whitespace like \t and \n?

Also, there is an option called outputOptionsEscapeNonPrintable:

> pPrintOpt CheckColorTty (defaultOutputOptionsDarkBg { outputOptionsEscapeNonPrintable = False })  ("\DC1\205N\237\232s\225\232N\147K\173\RSE\201\EM" :: ByteString)
"ÍNíèsáèNK­EÉ"
> pPrintOpt CheckColorTty (defaultOutputOptionsDarkBg { outputOptionsEscapeNonPrintable = True })  ("\DC1\205N\237\232s\225\232N\147K\173\RSE\201\EM" :: ByteString)
"\x11ÍNíèsáèN\x93K\xad\x1eEÉ\x19"

Note, though, that characters like \n and \t are always treated as printable, so they are never escaped.
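For anyone picking up the whitespace option floated above, here is one possible shape for it. Everything in this sketch is hypothetical: WhitespaceMode, its constructors, and escapeWith do not exist in pretty-simple, and the printability test (Data.Char.isPrint) is an assumption.

```haskell
import Data.Char (isPrint, ord)
import Numeric (showHex)

-- Hypothetical option, not part of pretty-simple's API.
data WhitespaceMode = KeepWhitespace | EscapeWhitespace
  deriving (Eq, Show)

-- Escape non-printable characters; newline and tab are either passed
-- through literally or rendered as \n and \t, depending on the mode.
escapeWith :: WhitespaceMode -> String -> String
escapeWith mode = concatMap escape
  where
    escape c
      | c == '\n', mode == EscapeWhitespace = "\\n"
      | c == '\t', mode == EscapeWhitespace = "\\t"
      | c == '\n' || c == '\t'              = [c]
      | isPrint c                           = [c]
      | otherwise                           = "\\x" ++ showHex (ord c) ""

main :: IO ()
main =
  -- Prints: a\nb\t\x19
  putStrLn (escapeWith EscapeWhitespace "a\nb\t\EM")
```

With KeepWhitespace the same input keeps its literal newline and tab, which matches the current behaviour described in this thread; EscapeWhitespace is the mode a binary-blob use case would want.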