SomaLogic/SomaDataIO

Analyte metadata stripped with read_adat() due to warning messages returned as non-English statements

m-gall opened this issue · 3 comments

Hi Stu and co,

Opening this issue, not necessarily because it needs a solution, but in case others encounter a similar problem.

I have several customers in China that have experienced issues with the read_adat() function in library(SomaDataIO).

When importing an adat file, analyte metadata values are returned as NA. As an example:

[Screenshot: output_analyteinfo]

The code they would run to produce this output:

library(SomaDataIO)

f <- system.file("extdata", "example_data10.adat",
                 package = "SomaDataIO", mustWork = TRUE)
df <- read_adat(f)

AnalyteInfo <- getAnalyteInfo(df)
View(AnalyteInfo)

## an example of the output would be the screenshot above

I've done some debugging and I think I have found the cause of the issue.

In line 26 of the script convertColMeta.R:

  • The convert_lgl function inside this script depends on the warning message returned by as.numeric(.x)
  • It expects the warning message to be returned in English:
## line 26 in script convertColMeta.R

na_warn <- is_warn && identical(w$message, "NAs introduced by coercion")

On some machines, the warning message is returned in Chinese characters. Consequently, the message is not identical to the expected English string, and the code incorrectly predicts the column type.
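To check whether a given machine is affected, something like the following could be run. It is only a diagnostic sketch: it captures the message of the warning that as.numeric() raises on non-numeric input and compares it with the English string quoted above. The variable names are mine, not from the package.

```r
# Capture the text of the warning raised when a non-numeric string
# is coerced to numeric. The text depends on the session language.
msg <- tryCatch(
  as.numeric("abc"),
  warning = function(w) conditionMessage(w)
)

# TRUE in an English session; FALSE if R translated the warning
is_english <- identical(msg, "NAs introduced by coercion")

print(msg)
print(is_english)
```

If the last line prints FALSE, the comparison in convertColMeta.R would presumably fail in that session in the same way.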

Here is a screenshot of the warning message returned on a machine that is producing the incorrect output:

[Screenshot 2024-04-30 at 16 02 41: warning message returned in Chinese characters]

Solution:

Setting the session language to English solves the issue:

Sys.setenv(LANGUAGE = "en")
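For anyone who would rather not change the language for the whole session, the same workaround can be scoped to a single call. The helper below is a hypothetical sketch (with_english_messages is my own name, not a SomaDataIO function), and it assumes that changing LANGUAGE mid-session takes effect, as the workaround above implies.

```r
# Hypothetical helper, not part of SomaDataIO: evaluates `expr` with
# LANGUAGE set to "en", then restores the previous value on exit.
with_english_messages <- function(expr) {
  old <- Sys.getenv("LANGUAGE", unset = NA)
  on.exit({
    if (is.na(old)) Sys.unsetenv("LANGUAGE") else Sys.setenv(LANGUAGE = old)
  }, add = TRUE)
  Sys.setenv(LANGUAGE = "en")
  expr  # lazy argument is evaluated here, after LANGUAGE is set
}

# The variable is "en" only while the expression is being evaluated
with_english_messages(Sys.getenv("LANGUAGE"))  # "en"

# Intended usage (requires SomaDataIO and an ADAT path `f`):
# df <- with_english_messages(read_adat(f))
```

Because R evaluates function arguments lazily, the read_adat() call would run only after the language has been switched.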

Whilst this is 'a' solution, it would be nice to consider whether the function can be refactored so that it does not require the user to have their R environment set to English.
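As a rough idea of what such a refactor might look like, the check could compare missing values before and after the conversion instead of inspecting the warning text. This is a sketch under my own assumptions, not the package's implementation: coercion_adds_na is a hypothetical name, and edge cases (for example the literal string "NA") may not match the current warning-based behaviour.

```r
# Hypothetical helper, not part of SomaDataIO: TRUE when converting `x`
# to numeric creates NAs that were not already present in `x`.
# It never reads the warning text, so the session language is irrelevant.
coercion_adds_na <- function(x) {
  converted <- suppressWarnings(as.numeric(x))
  any(is.na(converted) & !is.na(x))
}

coercion_adds_na(c("1.5", "2", NA))  # FALSE: the only NA was already there
coercion_adds_na(c("1.5", "abc"))    # TRUE: "abc" cannot be parsed as a number
```

A result like this could stand in for the na_warn flag, though whether it fits the rest of convert_lgl is for the maintainers to judge.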

Thanks :)

Thank you, Mailie! This is an interesting point and definitely something to consider for international users of SomaScan. I'm not sure if there is a quick/easy solution to this problem, outside of the one you've already identified, but we will investigate internally and update this GitHub issue in the future if we can find a better method.

Hi @m-gall, I have a small update for this issue: at this time, English is the only fully supported language for SomaDataIO (which is also the case for SomaLogic's other open-source software tools). This is due to a variety of factors, including developer availability; we don't have the resources to support multiple languages at this time, unfortunately. However, if customers continue to report this bug to you, please let us know! We can re-evaluate if this becomes a frequently reoccurring problem. In the meantime, I hope SomaScan customers will find the workaround you suggested in this issue satisfactory.

Thanks for your consideration @amanda-hi ! Very understandable around resourcing. Will let you know if this becomes a recurrent issue.