daqana/tikzDevice

tikzDevice produces a non-UTF-8 output file on Windows

julianre opened this issue · 2 comments

On Windows, the TeX/TikZ files generated by tikzDevice are encoded in Windows-1252 rather than UTF-8 (see the session info below). This causes a problem when the plot is included in a main LaTeX document that is UTF-8 encoded. In my case (see the MWE below), I get the following error message:

pandoc.exe: Cannot decode byte '\xd6': Data.Text.Internal.Encoding.streamDecodeUtf8With: Invalid UTF-8 stream
Error: pandoc document conversion failed with error 1
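
To illustrate why pandoc rejects this byte, here is a small sketch in base R. It assumes the offending byte comes from the "Ö" in the axis label, which is what 0xD6 stands for in Windows-1252; the variable names are only for illustration.

```r
# The byte from the pandoc error message, as a one-byte string
bad <- rawToChar(as.raw(0xd6))

# On its own, 0xD6 is not a valid UTF-8 sequence
validUTF8(bad)                      # FALSE

# Interpreted as Windows-1252, the byte is the letter "Ö" (U+00D6)
as_utf8 <- iconv(bad, from = "CP1252", to = "UTF-8")

# In UTF-8 the same letter takes two bytes
charToRaw(as_utf8)                  # c3 96
```

So a file containing a lone 0xD6 byte cannot be decoded as UTF-8, which matches the error above.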

Minimal working example (RMarkdown file):

---
title: "test"
author: "test"
date: "Today"
output: 
  pdf_document: 
header-includes:
   - \usepackage{tikz}
---
```{r setup, include=FALSE}
library(tikzDevice)
options(tikzDefaultEngine = "xetex")
```

```{r plot, dev="tikz", external=FALSE, echo=FALSE}
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, xlab = "ÖÄÜ", ylab = "öäü")
```

I tried the MWE under Ubuntu and the generated file is encoded in UTF-8 (no problem).

xfun::session_info()

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362), RStudio 1.2.1335

Locale:
  LC_COLLATE=German_Germany.1252 
  LC_CTYPE=German_Germany.1252   
  LC_MONETARY=German_Germany.1252
  LC_NUMERIC=C                   
  LC_TIME=German_Germany.1252    

Package version:
  compiler_3.6.1    evaluate_0.14    
  filehash_2.4-2    glue_1.3.1       
  graphics_3.6.1    grDevices_3.6.1  
  grid_3.6.1        highr_0.8        
  knitr_1.24        magrittr_1.5     
  markdown_1.1      methods_3.6.1    
  mime_0.7          png_0.1.7        
  rstudioapi_0.10   stats_3.6.1      
  stringi_1.4.3     stringr_1.4.0    
  tikzDevice_0.12.3 tools_3.6.1      
  utils_3.6.1       xfun_0.9         
  yaml_2.2.0

rstub commented

Thanks for the report. Besides the workarounds mentioned on Stack Overflow, you could also try telling LaTeX to read the file as Windows-1252 instead of the default UTF-8:

options(tikzLatexPackages = c(getOption("tikzLatexPackages"),
                              "\\usepackage[cp1252]{inputenc}"))

(Completely untested since I am not on Windows right now ...)
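
Another possible approach, also only a sketch, would be to re-encode the generated file to UTF-8 after the device is closed. The helper name `reencode_to_utf8` is hypothetical and not part of tikzDevice, and the demo below uses a hand-written temporary file in place of real tikzDevice output.

```r
# Hypothetical helper (not part of tikzDevice): rewrite a file as UTF-8,
# assuming its current content is Windows-1252 unless it is already valid UTF-8.
reencode_to_utf8 <- function(path, from = "CP1252") {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  text <- rawToChar(bytes)
  if (validUTF8(text)) {
    return(invisible(FALSE))        # nothing to do (pure ASCII also lands here)
  }
  converted <- iconv(text, from = from, to = "UTF-8")
  if (is.na(converted)) {
    stop("could not convert ", path, " from ", from)
  }
  writeBin(charToRaw(converted), path)
  invisible(TRUE)
}

# Demo on a stand-in file that contains "Ö" as the single byte 0xD6
demo_file <- tempfile(fileext = ".tex")
writeBin(c(charToRaw("\\node {"), as.raw(0xd6), charToRaw("};\n")), demo_file)

changed <- reencode_to_utf8(demo_file)
result <- rawToChar(readBin(demo_file, what = "raw",
                            n = file.info(demo_file)$size))
validUTF8(result)                   # TRUE
```

With a standalone `tikz()` call, such a helper could be run on the output file after `dev.off()`. Whether it can be hooked into a knitr chunk that uses `dev="tikz"` is not something checked here.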

rstub commented

Maybe related: Encoding problems on Linux at https://stackoverflow.com/q/58401825/8416610