zip path is too long
brianmsm opened this issue · 5 comments
I am on Windows and my project sits in a deeply nested folder structure. I am trying to import a data file with readxl::read_excel(), but I get the following error:
Error in unz(zip_path, file_path, open = "rb") :
cannot open the connection
In addition: Warning message:
In unz(zip_path, file_path, open = "rb") : zip path is too long
I saved the same data in the same location in .sav and .dta format, and the haven package reads both without problems. I have also enabled long paths as suggested here (https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=powershell), but it still does not work.
haven::read_sav("1. Data/Valence Depresion Domaradzka.sav")
#> # A tibble: 1,632 × 39
#> Id sex age VD02 VD03 VD04 VD05 VD06 VD07 VD08
#> <dbl> <dbl+lbl> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l>
#> 1 2 1 [Femal… 32 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 2 4 1 [Femal… 34 2 [I d… 1 [I a… 1 [I a… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#> 3 10 1 [Femal… 30 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#> 4 11 1 [Femal… 23 1 [I a… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#> 5 15 1 [Femal… 53 2 [I d… 1 [I a… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#> 6 16 1 [Femal… 46 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 7 17 1 [Femal… 51 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d… 2 [I d… 2 [I d…
#> 8 19 1 [Femal… 62 1 [I a… 1 [I a… 2 [I d… 1 [I a… 1 [I a… 2 [I d… 1 [I a…
#> 9 22 1 [Femal… 34 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 10 24 1 [Femal… 43 2 [I d… 1 [I a… 2 [I d… 1 [I a… 1 [I a… 1 [I a… 1 [I a…
#> # … with 1,622 more rows, and 29 more variables: VD09 <dbl+lbl>,
#> # VD10 <dbl+lbl>, VD11 <dbl+lbl>, VD12 <dbl+lbl>, VD14 <dbl+lbl>,
#> # VD15 <dbl+lbl>, VD16 <dbl+lbl>, VD17 <dbl+lbl>, VD18 <dbl+lbl>,
#> # VD19 <dbl+lbl>, VD20 <dbl+lbl>, VD21 <dbl+lbl>, VD22 <dbl+lbl>,
#> # VD23 <dbl+lbl>, VD24 <dbl+lbl>, VD25 <dbl+lbl>, VD26 <dbl+lbl>,
#> # VD27 <dbl+lbl>, VD28 <dbl+lbl>, VD29 <dbl+lbl>, VD30 <dbl+lbl>,
#> # VD31 <dbl+lbl>, VD33 <dbl+lbl>, VD34 <dbl+lbl>, VD35 <dbl+lbl>, …
haven::read_dta("1. Data/Valence depresion Domaradzka.dta")
#> # A tibble: 1,632 × 39
#> Id sex age VD02 VD03 VD04 VD05 VD06 VD07 VD08
#> <dbl> <dbl+lbl> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l>
#> 1 1 2 [Male] 31 1 [I a… 2 [I d… 2 [I d… 1 [I a… 1 [I a… 1 [I a… 1 [I a…
#> 2 2 1 [Femal… 32 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 3 3 2 [Male] 40 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 4 4 1 [Femal… 34 2 [I d… 1 [I a… 1 [I a… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#> 5 5 2 [Male] 40 2 [I d… 2 [I d… 1 [I a… 2 [I d… 1 [I a… 2 [I d… 2 [I d…
#> 6 6 2 [Male] 24 2 [I d… 1 [I a… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 7 7 2 [Male] 29 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 8 8 2 [Male] 25 1 [I a… 1 [I a… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a…
#> 9 9 2 [Male] 25 1 [I a… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d… 1 [I a…
#> 10 10 1 [Femal… 30 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#> # … with 1,622 more rows, and 29 more variables: VD09 <dbl+lbl>,
#> # VD10 <dbl+lbl>, VD11 <dbl+lbl>, VD12 <dbl+lbl>, VD14 <dbl+lbl>,
#> # VD15 <dbl+lbl>, VD16 <dbl+lbl>, VD17 <dbl+lbl>, VD18 <dbl+lbl>,
#> # VD19 <dbl+lbl>, VD20 <dbl+lbl>, VD21 <dbl+lbl>, VD22 <dbl+lbl>,
#> # VD23 <dbl+lbl>, VD24 <dbl+lbl>, VD25 <dbl+lbl>, VD26 <dbl+lbl>,
#> # VD27 <dbl+lbl>, VD28 <dbl+lbl>, VD29 <dbl+lbl>, VD30 <dbl+lbl>,
#> # VD31 <dbl+lbl>, VD33 <dbl+lbl>, VD34 <dbl+lbl>, VD35 <dbl+lbl>, …
readxl::read_excel("1. Data/Valence depresion Domaradzka.xlsx")
#> Warning in unz(zip_path, file_path, open = "rb"): zip path is too long
#> Error in unz(zip_path, file_path, open = "rb"): cannot open the connection
fs::path_real("1. Data/Valence depresion Domaradzka.xlsx")
#> D:/Insync/brianmsm@gmail.com/Google Drive/Cursos de Brian Peña - Compartido/Mios/Cursos en la SPP/1. Curso Virtual. Análisis de datos con R para Psicólogos/Materiales/Cuarta Edición/Sesión 01/1. Data/Valence depresion Domaradzka.xlsx
Created on 2023-02-05 with reprex v2.0.2
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.2 (2022-10-31 ucrt)
#> os Windows 10 x64 (build 22621)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate Spanish_Peru.utf8
#> ctype Spanish_Peru.utf8
#> tz America/Bogota
#> date 2023-02-05
#> pandoc 3.0.1 @ C:/Users/brian/AppData/Local/Pandoc/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.2)
#> cli 3.6.0 2023-01-09 [1] CRAN (R 4.2.2)
#> crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.2)
#> digest 0.6.31 2022-12-11 [1] CRAN (R 4.2.2)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.2)
#> evaluate 0.20 2023-01-17 [1] CRAN (R 4.2.2)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.2)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.2)
#> forcats 1.0.0 2023-01-29 [1] CRAN (R 4.2.2)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.2)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.2)
#> haven 2.5.1 2022-08-22 [1] CRAN (R 4.2.2)
#> hms 1.1.2 2022-08-19 [1] CRAN (R 4.2.2)
#> htmltools 0.5.4 2022-12-07 [1] CRAN (R 4.2.2)
#> knitr 1.42 2023-01-25 [1] CRAN (R 4.2.2)
#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.2)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.2)
#> pillar 1.8.1 2022-08-19 [1] CRAN (R 4.2.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.2)
#> purrr 1.0.1 2023-01-10 [1] CRAN (R 4.2.2)
#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.2.2)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.2.2)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.2)
#> readr 2.1.3 2022-10-01 [1] CRAN (R 4.2.2)
#> readxl 1.4.1 2022-08-17 [1] CRAN (R 4.2.2)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.2.2)
#> rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.2)
#> rmarkdown 2.20 2023-01-19 [1] CRAN (R 4.2.2)
#> rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.2)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.2)
#> styler 1.9.0 2023-01-15 [1] CRAN (R 4.2.2)
#> tibble 3.1.8 2022-07-22 [1] CRAN (R 4.2.2)
#> tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.2)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.2)
#> vctrs 0.5.1 2022-11-16 [1] CRAN (R 4.2.2)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.2)
#> xfun 0.36 2022-12-21 [1] CRAN (R 4.2.2)
#> yaml 2.3.6 2022-10-18 [1] CRAN (R 4.2.2)
#>
#> [1] C:/Users/brian/AppData/Local/R/win-library/4.2
#> [2] C:/Program Files/R/R-4.2.2/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
readxl only uses base R facilities in the internal helper where this is coming from:
https://github.com/tidyverse/readxl/blob/main/R/xlsx-zip.R
So the answer for now is that this path truly is problematic for readxl, because there is no quick fix we can make in our code.
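In the meantime, one user-side workaround is to copy the workbook to a short temporary path and read it from there. The following is only a sketch: `copy_to_short_path()` is a hypothetical helper, not part of readxl, and it assumes that base R file operations can still reach the long path (haven could read files in the same folder, which suggests so, but this is not verified here).

```r
# Hypothetical helper (not part of readxl): copy a file to a short temporary
# path, so that the zip path handed to unz() stays short.
copy_to_short_path <- function(path) {
  stopifnot(file.exists(path))
  ext <- tools::file_ext(path)
  # Keep the original extension so format detection by extension still works.
  short <- tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
  ok <- file.copy(path, short, overwrite = TRUE)
  if (!ok) stop("could not copy '", path, "' to '", short, "'")
  short
}

# Usage (assumes readxl is installed and the copy succeeds):
# short <- copy_to_short_path("1. Data/Valence depresion Domaradzka.xlsx")
# dat <- readxl::read_excel(short)
# unlink(short)  # remove the temporary copy when done
```

The copy lives in the R session's temporary directory, so it disappears when the session ends even if `unlink()` is skipped.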
I know you say you have activated long paths, but here's someone reporting success with that method, pointing to exactly the same article:
https://stackoverflow.com/a/71621579
Have you definitely restarted your computer since making the change?
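To double-check that the setting actually took effect, the registry value described in the Microsoft article (`LongPathsEnabled` under `HKLM\SYSTEM\CurrentControlSet\Control\FileSystem`) can be read from within R. This is a minimal sketch: `long_paths_enabled()` is a hypothetical helper name, and it relies on `utils::readRegistry()`, which is only available on Windows.

```r
# Hypothetical helper: report whether the Windows LongPathsEnabled flag is set.
# Returns TRUE/FALSE on Windows, or NA if the value cannot be determined.
long_paths_enabled <- function() {
  if (.Platform$OS.type != "windows") return(NA)  # the registry is Windows-only
  key <- "SYSTEM\\CurrentControlSet\\Control\\FileSystem"
  vals <- tryCatch(
    utils::readRegistry(key, hive = "HLM"),
    error = function(e) NULL
  )
  if (is.null(vals) || is.null(vals$LongPathsEnabled)) return(NA)
  isTRUE(as.integer(vals$LongPathsEnabled) == 1L)
}

long_paths_enabled()
```

Note that, per the same article, the flag only helps applications that also declare themselves long-path aware, so a `TRUE` here does not guarantee that a given program handles long paths.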
It looks like openxlsx uses a 3rd party library to access the files inside the .zip archive (which is what .xlsx files actually are), so you may want to try using that package instead.
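A minimal sketch of that alternative, assuming openxlsx is installed; `read_with_openxlsx()` is a hypothetical wrapper around `openxlsx::read.xlsx()`, and whether it copes with this particular path has not been tested here.

```r
# Hypothetical wrapper: read one sheet of a workbook with openxlsx
# instead of readxl.
read_with_openxlsx <- function(path, sheet = 1) {
  if (!requireNamespace("openxlsx", quietly = TRUE)) {
    stop("package 'openxlsx' is required: install.packages(\"openxlsx\")")
  }
  openxlsx::read.xlsx(path, sheet = sheet)
}

# Usage:
# dat <- read_with_openxlsx("1. Data/Valence depresion Domaradzka.xlsx")
```

`openxlsx::read.xlsx()` returns a plain data.frame rather than a tibble, so downstream code written for `readxl::read_excel()` output may need small adjustments.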
And another lead re: something to check on your system:
https://community.rstudio.com/t/does-rstudio-use-windows-longpathsenabled-registry-setting/130033
I have by no means digested all of the content in this post, but it gives me hope that perhaps the problem is going to be fixed at the source, i.e. in R itself, in the not-too-distant future:
https://blog.r-project.org/2023/03/07/path-length-limit-on-windows/
I'm sorry, I had not seen the responses in this thread. I made the change in gpedit.msc and also restarted, but the problem persists.
It is possible that the next version of R will handle long paths better and solve this for us.