cr_cn fails with some valid DOIs
bobmuscarella opened this issue · 4 comments
rcrossref is returning errors for some valid DOIs (a sample is below). The DOIs are valid, as confirmed on doi.org. Any ideas what is going on or how to fix it?
Please note that I am using the most recent dev version of rcrossref, and I have added my email to my R environment file (.Renviron) as per the instructions on the rcrossref GitHub page.
Thanks for any help!
Session Info
> library(rcrossref)
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rcrossref_1.1.0.99
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 plyr_1.8.6 compiler_4.0.3 pillar_1.6.1
[5] later_1.2.0 remotes_2.4.0 tools_4.0.3 digest_0.6.27
[9] jsonlite_1.7.2 lifecycle_1.0.0 tibble_3.1.2 pkgconfig_2.0.3
[13] rlang_0.4.11 shiny_1.6.0 DBI_1.1.1 crul_1.1.0
[17] curl_4.3.1 fastmap_1.1.0 xml2_1.3.2 stringr_1.4.0
[21] dplyr_1.0.6 generics_0.1.0 vctrs_0.3.8 htmlwidgets_1.5.3
[25] DT_0.18 tidyselect_1.1.1 glue_1.4.2 httpcode_0.3.0
[29] R6_2.5.1 fansi_0.5.0 purrr_0.3.4 magrittr_2.0.1
[33] promises_1.2.0.1 ellipsis_0.3.2 htmltools_0.5.1.1 assertthat_0.2.1
[37] mime_0.10 xtable_1.8-4 httpuv_1.6.1 utf8_1.2.1
[41] stringi_1.6.2 miniUI_0.1.1.1 crayon_1.4.1
> cr_cn("10.1111/ddi.13378", "text")
Error in nchar(hh) : invalid multibyte string, element 1
> cr_cn("10.1111/btp.12905", "text")
Error in nchar(hh) : invalid multibyte string, element 1
> cr_cn('10.1038/s41597-020-00788-5')
Error in nchar(hh) : invalid multibyte string, element 1
> cr_cn("10.1111/geb.13346")
Warning message:
v1/works/10.1111/geb.13346/transform w/ (500) -
Thank you for raising this issue, @bobmuscarella. It seems the Crossref API does not encode its responses as UTF-8. I will alert Crossref about it. The issue relates to #221.
Asked Crossref team about header encoding: https://gitlab.com/crossref/issues/-/issues/1574
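For readers unfamiliar with the error message, here is a minimal sketch of how a byte sequence that is not valid UTF-8 can produce it. It assumes, purely for illustration, that the offending value is Latin-1 encoded; the thread does not state the actual encoding, and `raw_value` is a made-up stand-in, not data returned by Crossref.

```r
# Hypothetical header fragment: "Chytrý" with ý stored as the single Latin-1 byte 0xFD.
# (Assumption for illustration; not taken from an actual Crossref response.)
raw_value <- "Chytr\xfd"

# The byte sequence is not valid UTF-8.
is_valid_before <- validUTF8(raw_value)

# In a UTF-8 locale, nchar() is expected to fail on such a string with
# "invalid multibyte string"; tryCatch() captures the message instead of stopping.
nchar_result <- tryCatch(nchar(raw_value), error = function(e) conditionMessage(e))

# Re-encoding from the assumed source encoding yields a valid UTF-8 string.
fixed_value <- iconv(raw_value, from = "latin1", to = "UTF-8")
is_valid_after <- validUTF8(fixed_value)
```

This only illustrates the failure mode; it is not a workaround for `cr_cn()` itself, because the error is raised while the response headers are being handled, before user code sees them.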
Hi @doomlab, Crossref has not fixed the issue yet, but crul, rcrossref's underlying HTTP client, has changed how it deals with header encodings. The good news: if you update to the most recent crul version on CRAN (1.2.0), at least the first three examples work; cr_cn("10.1111/geb.13346") still returns an internal server error.
library(rcrossref)
cr_cn("10.1111/ddi.13378", "text")
#> [1] "Pouteau, R., Biurrun, I., Brunel, C., Chytrý, M., Dawson, W., Essl, F., Fristoe, T., Haveman, R., Hobohm, C., Jansen, F., Kreft, H., Lenoir, J., Lenzner, B., Meyer, C., Moeslund, J. E., Pergl, J., Pyšek, P., Svenning, J., Thuiller, W., … van Kleunen, M. (2021). Potential alien ranges of European plants will shrink in the future, but less so for already naturalized than for not yet naturalized species. Diversity and Distributions, 27(11), 2063–2076. Portico. https://doi.org/10.1111/ddi.13378"
cr_cn("10.1111/btp.12905", "text")
#> [1] "Rech, A. R., Ollerton, J., Dalsgaard, B., Ré Jorge, L., Sandel, B., Svenning, J., Baronio, G. J., & Sazima, M. (2021). Population‐level plant pollination mode is influenced by Quaternary climate and pollinators. Biotropica, 53(2), 632–642. Portico. https://doi.org/10.1111/btp.12905"
cr_cn('10.1038/s41597-020-00788-5')
#> [1] "@article{Lundgren_2021,\n\tdoi = {10.1038/s41597-020-00788-5},\n\turl = {https://doi.org/10.1038%2Fs41597-020-00788-5},\n\tyear = 2021,\n\tmonth = {jan},\n\tpublisher = {Springer Science and Business Media {LLC}},\n\tvolume = {8},\n\tnumber = {1},\n\tauthor = {Erick J. Lundgren and Simon D. Schowanek and John Rowan and Owen Middleton and Rasmus {\\O}. Pedersen and Arian D. Wallach and Daniel Ramp and Matt Davis and Christopher J. Sandom and Jens-Christian Svenning},\n\ttitle = {Functional traits of the world's late Quaternary large-bodied avian and mammalian herbivores},\n\tjournal = {Scientific Data}\n}"
cr_cn('10.1111/geb.13346')
#> Warning: v1/works/10.1111/geb.13346/transform w/ (500) -
Created on 2022-02-20 by the reprex package (v2.0.0)
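To check whether a local installation already has the crul version mentioned above, something like the following sketch may help. The helper name `crul_needs_update` is hypothetical and not part of crul or rcrossref; the 1.2.0 threshold is taken from this thread.

```r
# Hypothetical helper: TRUE when `installed` is older than `minimum`.
crul_needs_update <- function(installed, minimum = "1.2.0") {
  package_version(installed) < package_version(minimum)
}

# Compare the locally installed crul (if any) against 1.2.0.
if (requireNamespace("crul", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("crul"))
  if (crul_needs_update(installed)) {
    message("crul ", installed, " is older than 1.2.0; consider install.packages(\"crul\")")
  }
}
```

With the versions reported in this thread, crul 1.1.0 (from the original report) would be flagged for an update and 1.2.0 would not.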
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.1.2 (2021-11-01)
#> os macOS Big Sur 11.4
#> system aarch64, darwin20
#> ui X11
#> language en
#> collate de_DE.UTF-8
#> ctype de_DE.UTF-8
#> tz Europe/Copenhagen
#> date 2022-02-20
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0)
#> backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0)
#> cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.1)
#> crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.1)
#> crul 1.2.0 2021-11-22 [1] CRAN (R 4.1.1)
#> curl 4.3.2 2021-06-23 [1] CRAN (R 4.1.0)
#> DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0)
#> digest 0.6.28 2021-09-23 [1] CRAN (R 4.1.1)
#> dplyr 1.0.7 2021-06-18 [1] CRAN (R 4.1.0)
#> DT 0.19 2021-09-02 [1] CRAN (R 4.1.1)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0)
#> fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0)
#> generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.1)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.1.0)
#> highr 0.9 2021-04-16 [1] CRAN (R 4.1.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.1)
#> htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.1.1)
#> httpcode 0.3.0 2020-04-10 [1] CRAN (R 4.1.0)
#> httpuv 1.6.3 2021-09-09 [1] CRAN (R 4.1.1)
#> jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0)
#> knitr 1.37 2021-12-16 [1] CRAN (R 4.1.1)
#> later 1.3.0 2021-08-18 [1] CRAN (R 4.1.1)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.1)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0)
#> mime 0.12 2021-09-28 [1] CRAN (R 4.1.1)
#> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.1.0)
#> pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0)
#> plyr 1.8.6 2020-03-03 [1] CRAN (R 4.1.0)
#> promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.1.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.1.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.1)
#> Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0)
#> rcrossref * 1.1.0.99 2021-10-16 [1] Github (ropensci/rcrossref@319f34c)
#> reprex 2.0.0 2021-04-02 [1] CRAN (R 4.1.0)
#> rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0)
#> rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.1)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0)
#> shiny 1.7.1 2021-10-02 [1] CRAN (R 4.1.1)
#> stringi 1.7.5 2021-10-04 [1] CRAN (R 4.1.1)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.0)
#> styler 1.5.1 2021-07-13 [1] CRAN (R 4.1.0)
#> tibble 3.1.5 2021-09-30 [1] CRAN (R 4.1.1)
#> tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0)
#> triebeard 0.3.0 2016-08-04 [1] CRAN (R 4.1.0)
#> urltools 1.7.3 2019-04-14 [1] CRAN (R 4.1.0)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0)
#> vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0)
#> withr 2.4.3 2021-11-30 [1] CRAN (R 4.1.1)
#> xfun 0.29 2021-12-14 [1] CRAN (R 4.1.1)
#> xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0)
#> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.1.0)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library