rstudio/gt

Using `fmt_markdown()` on a factor prints the numeric factor level when the output format is HTML

rossellhayes opened this issue · 2 comments

Description

When `fmt_markdown()` is used on a column that contains factor data, it prints the factor's underlying integer codes rather than its text labels. This happens only with HTML output, not with other output formats such as LaTeX. I believe this is because the `md()` function, which is used for HTML output, replaces the class of the vector, so a factor loses its factor class and the new class is applied directly on top of its underlying integer type.
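The suspected mechanism can be reproduced in base R without gt. The sketch below uses `md_sketch()`, a hypothetical stand-in that only mimics the class replacement described above; the class name `"from_markdown"` is an assumption for illustration, and this is not gt's actual source.

```r
# Hypothetical stand-in for gt's md(), based only on the description above:
# it overwrites the class attribute of whatever it receives.
md_sketch <- function(text) {
  class(text) <- "from_markdown"  # assumed class name, for illustration only
  text
}

f <- factor("TEST")
tagged <- md_sketch(f)

# The factor class is gone, so nothing maps the integer code to its label.
lost_factor <- !inherits(tagged, "factor")
storage <- typeof(tagged)        # storage type is still the factor's integer
shown <- as.character(tagged)    # the integer code as text, not the label

print(shown)
```

Because the object is no longer a factor, any later conversion to text sees only the integer code, which would match the `1` shown in the HTML table.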

I believe this could be fixed by having `md()` first convert its input to character and only then assign the new class. I'd be happy to submit a PR if that would be helpful.
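A minimal sketch of that idea, again in base R: `md_fixed_sketch()` is a hypothetical function written for this illustration (the class name `"from_markdown"` is assumed), not the change that would actually go into gt.

```r
# Hypothetical variant of md(): convert to character before changing the class,
# so a factor contributes its labels rather than its integer codes.
md_fixed_sketch <- function(text) {
  text <- as.character(text)      # factor("TEST") becomes "TEST"
  class(text) <- "from_markdown"  # assumed class name, for illustration only
  text
}

fixed <- md_fixed_sketch(factor("TEST"))

storage <- typeof(fixed)   # character storage after the conversion
label <- unclass(fixed)    # the label survives the class change

print(label)
```

Character input passes through `as.character()` unchanged, so under these assumptions the conversion should only affect inputs such as factors.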

Reproducible example

library(gt)

gt(data.frame(x = factor("TEST")))
# Rendered HTML table:
#   x
#   TEST
gt(data.frame(x = factor("TEST"))) |> 
    fmt_markdown()
# Rendered HTML table:
#   x
#   1
gt(data.frame(x = factor("TEST"))) |> 
    fmt_markdown() |> 
    as_latex() |> 
    as.character() |> 
    cat()
#> \begingroup
#> \fontsize{12.0pt}{14.4pt}\selectfont
#> \begin{longtable}{c}
#> \toprule
#> x \\ 
#> \midrule\addlinespace[2.5pt]
#> TEST \\ 
#> \bottomrule
#> \end{longtable}
#> \endgroup

Created on 2024-09-17 with reprex v2.1.0

Expected result

The HTML output should include the character label for `x` ("TEST") rather than its integer code (1). The LaTeX output exhibits the expected behavior.
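Until the behavior changes, one possible workaround (an assumption on my part, not something confirmed in this thread) is to convert factor columns to character before the data reaches `gt()`. The data frame and the extra column `n` below are made up for illustration.

```r
# Illustrative data: one factor column and one integer column.
df <- data.frame(x = factor("TEST"), n = 1L)

# Convert only the factor columns to character, leaving other columns alone.
is_fct <- vapply(df, is.factor, logical(1))
df[is_fct] <- lapply(df[is_fct], as.character)

str(df)

# The table would then be built from character data, for example:
# gt::gt(df) |> gt::fmt_markdown()
```

With character input there is no integer code underneath the column, so `fmt_markdown()` should render the label in HTML just as it does in LaTeX.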

Session info

sessionInfo()
#> R version 4.4.0 (2024-04-24)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS Sonoma 14.6.1
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: America/New_York
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] gt_0.11.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.5       cli_3.6.3.9000    knitr_1.46        rlang_1.1.4.9000 
#>  [5] xfun_0.43         generics_0.1.3    glue_1.7.0        htmltools_0.5.8.1
#>  [9] fansi_1.0.6       rmarkdown_2.26    evaluate_0.23     tibble_3.2.1     
#> [13] fastmap_1.1.1     yaml_2.3.10       lifecycle_1.0.4   compiler_4.4.0   
#> [17] dplyr_1.1.4       fs_1.6.4          pkgconfig_2.0.3   rstudioapi_0.16.0
#> [21] digest_0.6.35     R6_2.5.1          reprex_2.1.0      tidyselect_1.2.1 
#> [25] utf8_1.2.4        pillar_1.9.0      magrittr_2.0.3    tools_4.4.0      
#> [29] withr_3.0.1       xml2_1.3.6

Hi! Thanks for the report! A PR would be welcome.

Adding

  x <- as.character(x)

before

process_text(md(x), context = "html")

seems reasonable

Thanks @olivroy. I've opened #1883.