Double quotation marks don't render properly
MVE:
parseDoc (ParseOptions NoSourcePos) "Hello \"Djot\" World" <&> renderHtml (RenderOptions False)
Output:
Right "<p>Hello \226\128\156Djot\226\128\157 World</p>\n"
Interestingly, if I run djot with cabal run it works fine:
~/projects/djoths > cabal run
Hello "Djot"
<p>Hello “Djot”</p>
I don't think this is a problem with the input string, as the AST looks fine:
λ: parseDoc (ParseOptions NoSourcePos) "Hello \"Djot\" World"
Right (Doc {docBlocks = Many {unMany = fromList [Node NoPos (Attr []) (Para (Many {unMany = fromList [Node NoPos (Attr []) (Str "Hello "),Node NoPos (Attr []) (Quoted DoubleQuotes (Many {unMany = fromList [Node NoPos (Attr []) (Str "Djot")]})),Node NoPos (Attr []) (Str " World")]}))]}, docFootnotes = NoteMap {unNoteMap = fromList []}, docReferences = ReferenceMap {unReferenceMap = fromList []}, docAutoReferences = ReferenceMap {unReferenceMap = fromList []}, docAutoIdentifiers = fromList []})
(You can see the Quoted DoubleQuotes (...) node in there.)
I'm not clear why one works and the other doesn't, though, given that app/Main.hs seems to do basically the same thing.
Why do you say it isn't working? It is producing a bytestring with the UTF-8 encoding of the curly quotes.
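To make that concrete, here is a minimal sketch (assuming the bytestring and text packages; the names leftQuoteBytes and leftQuote are made up for illustration) showing that the escaped bytes in the output above are just how Show prints a ByteString, and that decoding them as UTF-8 gives back the curly quote:

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- The three bytes that appear as \226\128\156 in the rendered output.
-- (Hypothetical name, used only for this illustration.)
leftQuoteBytes :: B.ByteString
leftQuoteBytes = B.pack [226, 128, 156]

-- Decoding those bytes as UTF-8 yields a single character, U+201C
-- (the left double quotation mark).
leftQuote :: String
leftQuote = T.unpack (TE.decodeUtf8 leftQuoteBytes)
```

In GHCi, evaluating leftQuoteBytes shows the escaped form because Show escapes every non-ASCII byte, while writing the same bytes to a UTF-8 terminal or file displays the quote character itself.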
Same for an em-dash:
λ: parseDoc (ParseOptions NoSourcePos) "Hello — world" <&> renderHtml (RenderOptions False)
Right "<p>Hello \DC4 world</p>\n"
~/projects/djoths > echo "Hello — world" | cabal run
<p>Hello — world</p>
Ah ok, so I say "isn't working" because it comes out weirdly in my browser, which may be my error in understanding the usage of renderHtml.
I suppose the problem is that hPutBuilder, which is what Main.hs uses, automatically handles the escape sequences and converts them to UTF-8, but it's not obvious that one has to do that when using renderHtml.
Anyway, I guess this is really a problem of how I handle the output, so I'll close the issue.
renderHtml gives you a Builder, which you can convert to a lazy bytestring using toLazyByteString from Data.ByteString.Builder.
If you need a Text or a String, you'd need to call another function to convert from this lazy bytestring.
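A minimal sketch of those conversions, assuming the bytestring and text packages. Here sampleHtml is a hypothetical stand-in for the Builder that renderHtml returns, and the helper names are invented for illustration; they are not part of djot:

```haskell
import Data.ByteString.Builder (Builder, stringUtf8, toLazyByteString)
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- Hypothetical stand-in for the result of renderHtml: a Builder holding
-- UTF-8 encoded HTML with curly quotes (U+201C and U+201D).
sampleHtml :: Builder
sampleHtml = stringUtf8 "<p>Hello \8220Djot\8221 World</p>\n"

-- Builder -> lazy ByteString (raw UTF-8 bytes).
builderToBytes :: Builder -> BL.ByteString
builderToBytes = toLazyByteString

-- Lazy ByteString -> lazy Text, interpreting the bytes as UTF-8.
-- decodeUtf8 throws on invalid input, which is acceptable for this sketch.
builderToText :: Builder -> TL.Text
builderToText = TLE.decodeUtf8 . builderToBytes

-- Lazy Text -> String, if a plain String is needed.
builderToString :: Builder -> String
builderToString = TL.unpack . builderToText
```

The byte form is what should go over the wire or into a file; the Text or String form is for further processing inside the program.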
Thanks. It was less a problem of that and more of handling the Unicode escape sequences. I realised I had misunderstood how this was supposed to work, and the fix turned out to be incredibly basic: adding a charset to the HTTP headers made the browser render it correctly.