🧱 Helper to extract final spectra as matrix
As discussed in #27, now in a separate issue.
Thanks for the dedicated issue.
In terms of user interface, I think this option should essentially be a switch between a `list` (with full metadata, which is currently returned to the user) and a `matrix`.
The rationale is:
- a chemometrician wanting to pull spectra to either calibrate a model or generate predictions would call `read_opus` with `data_only = TRUE`, so a matrix is returned and can be pre-processed directly.
- a **more advanced user** might set `data_only` to `FALSE` for more advanced checks on the data and a more thorough look at the metadata, and could of course use an `lapply` call on that list to combine whatever piece of the data is interesting.
Finally, I'd suggest changing the argument name to a simpler and more common `matrix = TRUE|FALSE`.
Example call:
# Select all OPUS files from a range of folders
opus_fns <- list.files("some/project/folder/", pattern = glob2rx("my_project-*.0"), full.names = TRUE)
# Directly read and assemble MIR matrix from the selected files
mir_mat <- read_opus(
  opus_fns,
  matrix = TRUE,
  progress = TRUE
)
# Quick plot
matplot(t(mir_mat))
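For the list route described in the rationale, a minimal sketch of combining pieces with `lapply` could look like the following. It runs on a mock list, and the element names (`metadata`, `instrument`, `spec`) are assumptions made for illustration, not the layout `read_opus` actually returns.

```r
# Mock of a list-style return value, one element per file.
# The element names are hypothetical and only serve this example.
parsed <- list(
  "sample_a.0" = list(
    metadata = list(instrument = "alpha"),
    spec = c(0.10, 0.12, 0.11)
  ),
  "sample_b.0" = list(
    metadata = list(instrument = "beta"),
    spec = c(0.20, 0.22, 0.21)
  )
)

# lapply() returns one result per file, here the position of the largest value
peak_index <- lapply(parsed, function(x) which.max(x$spec))

# vapply() does the same but enforces the type and length of each result
instruments <- vapply(parsed, function(x) x$metadata$instrument, character(1))
```

Both results keep the file names as element names, so they can be matched back to the input files.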
Thanks for the nice summaries of use cases. I would not rename to `matrix`, because it hides the intent of what the function does. Yes, it does return a matrix, but it is not clear that this switch is for all parameters and data vs. final spectra only. @pierreroudier @ThomasKnecht I would suggest either `matrix_spectra` or `matrix_spec` in case we move away from `data_only`.
Because it is quite an important argument for controlling `read_opus()` behavior and user experience, these two steps seem important:
- make it very clear in the argument name what the output is (@pierreroudier above), and that only (final) spectra are returned vs. all data and parameters.
- document this argument really concisely.
Currently it is possible to parse only the data. The combination into a matrix should, in my opinion, be done in a separate function.
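A rough sketch of what such a separate function might look like, assuming each parsed element carries a `wavenumbers` vector and a `spec` vector. The function name `spectra_to_matrix` and these element names are hypothetical and not part of the package.

```r
# Hypothetical helper: bind parsed spectra into a matrix with one row per
# file and one column per wavenumber. Element names are assumptions.
spectra_to_matrix <- function(parsed, element = "spec") {
  stopifnot(is.list(parsed), length(parsed) > 0)
  axes <- lapply(parsed, function(x) x$wavenumbers)
  # Refuse to bind spectra that do not share the same x-axis
  same_axis <- vapply(
    axes,
    function(a) isTRUE(all.equal(a, axes[[1]])),
    logical(1)
  )
  if (!all(same_axis)) {
    stop("Spectra do not share a common wavenumber axis.")
  }
  mat <- do.call(rbind, lapply(parsed, function(x) x[[element]]))
  colnames(mat) <- as.character(axes[[1]])
  mat
}

# Mock input standing in for already parsed files
parsed <- list(
  "sample_a.0" = list(wavenumbers = c(4000, 3998, 3996), spec = c(0.10, 0.12, 0.11)),
  "sample_b.0" = list(wavenumbers = c(4000, 3998, 3996), spec = c(0.20, 0.22, 0.21))
)
mir_mat <- spectra_to_matrix(parsed)
```

Keeping the binding step outside the reader would leave the parsing untouched, and the axis check makes a mismatch between files fail loudly instead of silently producing a misaligned matrix.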