pandoc/dockerfiles

Latex error with emojis

cmahnke opened this issue · 4 comments

LaTeX doesn't seem to be configured properly:

docker run --rm        --volume "$(pwd):/data"        --user $(id -u):$(id -g)        pandoc/latex:2.6 protokoll.md -o outfile.pdf
Error producing PDF.
! Package inputenc Error: Unicode character 😉 (U+1F609)
(inputenc)                not set up for use with LaTeX.

When running with --pdf-engine=xelatex:

[WARNING] Missing character: There is no 😉 in font [lmroman10-regular]:mapping=tex-text;!

Hi @cmahnke !

Emojis are not supported by the pandoc/extra variant.

Adding emojis to the variant is not a simple task. It would require adding a font family that contains emojis, such as Noto, but these kinds of fonts are HUGE and would probably double the size of the Docker image.

https://fonts.google.com/noto/specimen/Noto+Emoji

Thanks, I also thought about it, and it might be related to #29?
How about updating the entrypoint so that, when a parameter is provided, it checks whether the required files are present and downloads them if they are not?
Example:

docker run  --volume "$(pwd):/data"  --user $(id -u):$(id -g)   pandoc/latex:2.6 --fonts=emoji,cjk protokoll.md -o outfile.pdf
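To make the proposal concrete, here is a rough sketch of what such entrypoint logic might look like. Nothing in it exists in the pandoc images: the `--fonts` option, the `FONT_DIR` location and the function names are all hypothetical, and the sketch only reports missing font sets where a real implementation would download them.

```shell
#!/bin/sh
# Hypothetical entrypoint helper; not part of any pandoc image.

# Made-up location where one directory per font set would live.
FONT_DIR="${FONT_DIR:-/usr/share/fonts/extra}"

# Print the value of --fonts=... from the given arguments (nothing if absent).
requested_fonts() {
    for arg in "$@"; do
        case $arg in
            --fonts=*) printf '%s\n' "${arg#--fonts=}" ;;
        esac
    done
}

# Print each name from a comma-separated list that has no directory in FONT_DIR.
missing_fonts() {
    old_ifs=$IFS
    IFS=,
    for name in $1; do
        [ -d "$FONT_DIR/$name" ] || printf '%s\n' "$name"
    done
    IFS=$old_ifs
}

main() {
    fonts=$(requested_fonts "$@")
    # Drop --fonts=... so that pandoc only sees options it understands.
    for arg in "$@"; do
        shift
        case $arg in
            --fonts=*) ;;
            *) set -- "$@" "$arg" ;;
        esac
    done
    # A real implementation would download here; this sketch only warns.
    for name in $(missing_fonts "$fonts"); do
        echo "font set '$name' not found in $FONT_DIR" >&2
    done
    exec pandoc "$@"
}

# A real entrypoint would end with:
# main "$@"
```

With the command above, `requested_fonts` would yield `emoji,cjk`, and `main` would pass only `protokoll.md -o outfile.pdf` on to pandoc.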

Yes, this is related to #29. In fact, emojis are just another non-Latin charset and could be treated like Chinese, Persian, Russian, etc.

If we were to support emojis, we would probably create a dedicated variant that would also support all the main international charsets... But again this requires a significant amount of work and testing...

Regarding your last question: I think we should keep the entrypoint as simple as possible and avoid coding too much logic inside it. In this case especially, downloading content on the fly is probably not a good idea.

However, here are two ideas to look into:

  • The tectonic LaTeX engine downloads the required LaTeX packages during the build process. Maybe it could download the emoji package for you. Tectonic is not available in the pandoc/extra variant yet, but I think we could have an open discussion about it.

https://tectonic-typesetting.github.io/en-US/
https://ctan.org/pkg/emoji

  • With the EPUB format, pandoc has an option --epub-embed-font where you can declare a font-face with a URL. Obviously this will not work for PDF generation, but maybe it would be possible to add a similar option for LaTeX (--latex-embed-font) in pandoc?

https://pandoc.org/MANUAL.html#option--epub-embed-font

Closing; please use a custom image to include additional fonts. The README contains an example.
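
As a rough illustration of the custom-image route (this is not the README's example), the snippet below writes a minimal Dockerfile that layers one font on top of the stock image. It assumes a font file named `NotoEmoji-Regular.ttf` has already been downloaded next to the Dockerfile and that `fc-cache` from fontconfig is available in the base image; the file name and target directory are placeholders.

```shell
# Write a minimal Dockerfile that extends the stock image with one extra font.
cat > Dockerfile <<'EOF'
FROM pandoc/latex:2.6

# Assumption: NotoEmoji-Regular.ttf sits in the build context (placeholder name).
COPY NotoEmoji-Regular.ttf /usr/share/fonts/truetype/noto/

# Assumption: fontconfig is installed in the base image; refresh its font cache.
RUN fc-cache -f
EOF
```

The image could then be built with `docker build -t pandoc-emoji .` (the tag is arbitrary) and used in place of `pandoc/latex:2.6` together with `--pdf-engine=xelatex`. The document would still have to select the font explicitly, since the warning above shows that XeLaTeX does not fall back to another font for missing glyphs.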