Changing how Haskell `show` and `print` escape Unicode characters

The show and print functions (and therefore the results displayed in GHCi) replace every non-ASCII character¹ with an escape code. For example:

ghci> '÷'
'\247'
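The same escaping applies to strings. As a small illustration that uses only the Prelude (the name examples is made up for this sketch), the following pairs a few strings with the text that show produces for them. Note the \& that show inserts so that a digit following a numeric escape is not read as part of it.

```haskell
-- Each pair is (value, text produced by show).
-- The character ÷ is U+00F7, decimal 247, written here as "\247".
examples :: [(String, String)]
examples =
  [ (s, show s)
  | s <- [ "abc"        -- ASCII only: shown verbatim inside quotes
         , "\247"       -- non-ASCII: shown as the decimal escape \247
         , "\247\&1"    -- ÷ followed by the digit 1: shown as \247\&1
         , "\n"         -- ASCII control character: shown as \n
         ]
  ]
```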

This is useful in some cases: perhaps you are printing user-entered text into a logfile, and want to make sure it is free of aberrations like Zalgo text. But what if you have strings or objects containing accented characters, symbols, emoji or non-Latin scripts that you want to print verbatim?

This module provides convenience functions for replacing show and print so that these characters are not escaped.

Comments?

Please feel free to submit issues or pull requests with criticism, alternative proposals, suggestions for name changes or any other comments.

Choosing which function to use

You have three choices to make:

Which characters should be escaped?

All non-ASCII characters (the same as show and print)
Only unprintable characters

What format should be used for escape codes?

Decimal (like "\247")
Hexadecimal (like "\xf7")

Which function are you replacing?

print: The function used to output results in GHCi
show: The function that converts objects from a variety of types to the String type
Other functions: If there is a function that behaves like show, or is implemented using show, you can change its escape characters by applying one of the functions listed as "conversion functions" to its output.

Depending on the choices above, select the function to use from the following table:

Characters escaped?	All non-ASCII	All non-ASCII	Only unprintable	Only unprintable
Escape code format?	Decimal	Hexadecimal	Decimal	Hexadecimal
Replacement for `print`	`print`	`printWithHex`	`printUnescapePrintable`	`printUnescapePrintableWithHex`
Replacement for `show`	`show`	`showWithHex`	`showUnescapePrintable`	`showUnescapePrintableWithHex`
Conversion functions	`id`	`withHex`	`unescapePrintable`	`unescapePrintableWithHex`

Of course, the functions in the leftmost column are the standard functions, and id indicates that no conversion is necessary because show already does what you want.

Implementation

The conversion functions are functions of type String -> String that simply parse their input to find escape codes and replace them if necessary. (Escape codes for ASCII characters, such as '\NUL' and '\n', are left unchanged.)
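To make the idea concrete, here is a minimal sketch of such a String -> String pass. It is not the module's actual source: the names unescapePrintableSketch and decodePrintable are hypothetical, and the sketch assumes its input came from show, so that non-ASCII characters only ever appear as decimal escapes.

```haskell
import Data.Char (chr, isDigit, isPrint)

-- Hypothetical stand-in for a conversion function such as unescapePrintable.
-- Assumes the input was produced by show, so non-ASCII characters appear
-- only as decimal escapes like \247.
unescapePrintableSketch :: String -> String
unescapePrintableSketch ('\\' : rest)
  -- A backslash followed by at least one digit is a decimal escape.
  | (digits@(_ : _), rest') <- span isDigit rest =
      case decodePrintable digits of
        Just c  -> c : unescapePrintableSketch rest'
        Nothing -> '\\' : digits ++ unescapePrintableSketch rest'
-- Any other escape (\\, \", \n, \NUL, \&, ...) is copied unchanged. Consuming
-- the character after the backslash keeps "\\1" from being misread as \1.
unescapePrintableSketch ('\\' : c : rest) = '\\' : c : unescapePrintableSketch rest
unescapePrintableSketch (c : rest) = c : unescapePrintableSketch rest
unescapePrintableSketch [] = []

-- Decode the digits of an escape, but only when they name a printable
-- non-ASCII character; everything else stays escaped.
decodePrintable :: String -> Maybe Char
decodePrintable digits
  | length digits <= 7, n >= 128, n <= 0x10FFFF, isPrint c = Just c
  | otherwise = Nothing
  where
    n = read digits :: Int
    c = chr n
```

For example, applying this sketch to the output of show "÷" gives "÷" with its surrounding quotes, while the output of show "\n" comes back unchanged. A \& left over after an unescaped character is harmless, because read ignores it inside string literals.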

The other functions are defined using the conversion functions, for example:

showUnescapePrintable = unescapePrintable . show

printUnescapePrintable = putStrLn . unescapePrintable . show
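The hexadecimal variants can be pictured the same way. The sketch below is an assumption about how such a conversion could work, not the module's code; withHexSketch, separator and showWithHexSketch are hypothetical names. The one subtle point is that a hexadecimal escape followed directly by a character such as a or f would be read as a longer escape, so the sketch inserts \& in that case.

```haskell
import Data.Char (isDigit, isHexDigit)
import Numeric (showHex)

-- Hypothetical stand-in for a conversion function such as withHex.
-- Rewrites decimal escapes of non-ASCII characters (as produced by show)
-- into hexadecimal escapes, for example \247 becomes \xf7.
withHexSketch :: String -> String
withHexSketch ('\\' : rest)
  | (digits@(_ : _), rest') <- span isDigit rest
  , let n = read digits :: Integer
  , n >= 128 =
      "\\x" ++ showHex n (separator rest' ++ withHexSketch rest')
-- Other escapes are copied unchanged, two characters at a time.
withHexSketch ('\\' : c : rest) = '\\' : c : withHexSketch rest
withHexSketch (c : rest) = c : withHexSketch rest
withHexSketch [] = []

-- Emit \& when the next character would otherwise extend the hex escape.
separator :: String -> String
separator (c : _) | isHexDigit c = "\\&"
separator _ = ""

-- Composed with show in the same way as the definitions above.
showWithHexSketch :: Show a => a -> String
showWithHexSketch = withHexSketch . show
```

The explicit type signature on showWithHexSketch keeps the composed function usable at every Show type.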

Compatibility with `Read`

The output of all the functions should be compatible with read. That is, in the case of unescapePrintable, for example, for all values a and all strings s:

read (showUnescapePrintable a) == read (show a)
read (unescapePrintable s) == read s
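One way to spot-check this property is a small helper like the one below. It is only a sketch: readCompatible is a hypothetical name, and it assumes the type's Read and Show instances round-trip (read (show x) == x), so that comparing against x is the same as comparing against read (show x).

```haskell
-- Check that a conversion function keeps the output of show readable and
-- that reading it back yields the original value.
readCompatible :: (Read a, Show a, Eq a) => (String -> String) -> a -> Bool
readCompatible convert x =
  case reads (convert (show x)) of
    [(y, "")] -> y == x      -- exactly one parse, no leftover input
    _         -> False       -- no parse, ambiguous parse, or trailing text
```

Passing id as the conversion function checks plain show. A conversion that damages escapes, such as filter (/= '\\'), fails the check for a string containing a newline.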

Replacing `print`

Functions whose name starts with print are intended for use with the interactive-print option of GHCi, as explained in the GHC User's Guide. For example, you can use the following command-line option to set the print function:

ghci -interactive-print=printUnescapePrintableWithHex Text.Show.Unescaped

You can also set this option by adding the following line to your .ghci configuration file:

import Text.Show.Unescaped
:set -interactive-print=printUnescapePrintableWithHex

Replacing `show`

Functions whose name starts with show are replacements for the show function. You can use them directly, or you can choose one to replace show with, for example by adding the following to the import section of a source file:

import Text.Show.Unescaped (showUnescapePrintableWithHex)
import Prelude hiding (show)
import qualified Prelude as P (show)

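and then add the following definition to the body of the file (the type signature is needed so that the monomorphism restriction does not pin show to a single type):

show :: Show a => a -> String
show = showUnescapePrintableWithHex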

This replaces show but keeps the original show function available, renamed as P.show. To derive or implement the Show class for your own objects, you will need to use P.show.

Note on terminology: What are ASCII and non-ASCII characters?

In this explanation, "ASCII characters" refers to what official Unicode terminology calls the Basic Latin block, which consists of the first 128 Unicode characters (officially called "code points"); these are identical to the ASCII character set. "Non-ASCII characters" means all characters outside this block.

xogcox/show-unescaped
