Can you add some parameters?

Question

Closed this issue 6 months ago · 3 comments

This package is very good; it recently solved my problem of reading large xlsx files. I hope the author can add more custom parameters, such as:

1. Specify the data type for each column.
2. Specify the NA value.
3. Do not introduce scientific notation when reading data. In particular, if a column contains both text and numbers, it is read as text by default, but the numbers are rendered in scientific notation.
4. There seems to be an encoding issue? (test file)

> mm = SheetReader::read_xlsx(path = f2, sheet = 1)
> head(mm$Profit)
[1] "&#26412;&#26399;&#21033;&#28070;" "&#27809;&#26377;&#21333;&#20301;"
[3] NA                                 "1.72925e+08"                     
[5] NA                                 NA

Answer 1 · 2023-06-15T18:04:59.000Z

Thank you,
I have pushed a fix for 4.; there was an issue with XML-escaped Unicode characters. If you have devtools, you can try installing it via install_github("fhenz/SheetReader-r"). I will probably only upload a new CRAN version once I have also addressed some of your other points.
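For anyone still on a version without the fix, a user-side workaround can be sketched in base R. This is not the change made in the package; `decode_numeric_refs` is a hypothetical helper name, and it assumes the strings contain only decimal references of the form `&#NNNNN;` (hexadecimal `&#x...;` references are left untouched).

```r
# Hypothetical helper: replace decimal XML character references such as
# "&#26412;" with the Unicode characters they stand for.
decode_numeric_refs <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    # Collect every "&#<digits>;" occurrence in this string.
    refs <- regmatches(s, gregexpr("&#[0-9]+;", s))[[1]]
    for (ref in unique(refs)) {
      code <- as.integer(gsub("[^0-9]", "", ref))  # keep only the digits
      s <- gsub(ref, intToUtf8(code), s, fixed = TRUE)
    }
    s
  }, character(1), USE.NAMES = FALSE)
}

decoded <- decode_numeric_refs(c("&#26412;&#26399;&#21033;&#28070;", NA, "1.72925e+08"))
print(decoded)
```

Applied to the two escaped values in the output above, this should yield "本期利润" and "没有单位"; NA and ordinary strings pass through unchanged.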

I think 1. and 2. are both good ideas; I will try to implement something similar to what readxl has.
3. is a bit tricky because Excel doesn't differentiate between integers and real numbers when storing values, but I should be able to solve this more elegantly if I implement 1. (it would then be solved by specifying string/text as the column data type; would that be sufficient?).
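Until column types can be specified, point 3 can be worked around after reading, roughly as sketched below in base R. `expand_scientific` is a hypothetical helper, not part of SheetReader. It only rewrites the notation: digits that were already dropped when the string was produced cannot be recovered.

```r
# Hypothetical helper: rewrite strings in scientific notation (e.g. "1.72925e+08")
# as plain decimal strings, leaving text and NA values untouched.
expand_scientific <- function(x) {
  is_sci <- !is.na(x) & grepl("^[+-]?[0-9]+(\\.[0-9]+)?[eE][+-]?[0-9]+$", x)
  # Format each value separately so that one value's decimals do not
  # influence how the others are printed.
  x[is_sci] <- vapply(as.numeric(x[is_sci]), format, character(1),
                      scientific = FALSE, trim = TRUE)
  x
}

profit <- c("Profit", NA, "1.72925e+08")
print(expand_scientific(profit))
```

Here the third element becomes "172925000", while the text value and the NA stay as they are.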

Answer 2 · 2023-06-17T14:45:05.000Z

Thank you for your reply. Indeed, if 1 is resolved, then 3 can theoretically be resolved as well.

Answer 3 · 2024-03-01T19:54:36.000Z

A new parameter, col_types, has been added that allows specifying the data types for columns via a named or unnamed character vector, e.g. read_xlsx([...], col_types=c("Profit"="text")).
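Combined with the call from the original question, usage might look like the sketch below. The file name `report.xlsx` is a placeholder, and the call is guarded so the snippet does nothing when the package or file is missing. It assumes an installed SheetReader version that already includes col_types.

```r
# Placeholder path; substitute the workbook you actually want to read.
path <- "report.xlsx"

# Named form: read the "Profit" column as text so that numbers in it
# are not converted to scientific notation.
col_types <- c("Profit" = "text")

if (requireNamespace("SheetReader", quietly = TRUE) && file.exists(path)) {
  mm <- SheetReader::read_xlsx(path = path, sheet = 1, col_types = col_types)
  print(head(mm$Profit))
}
```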