InChIKeys give errors
meier-rene opened this issue · 6 comments
I tried to batch process a large number of InChIKeys and found some that give errors:
bad_key <-
c('QEVGZEDELICMKH-UHFFFAOYSA-N',
'SYLAFCZSYRXBJF-UHFFFAOYSA-N',
'BOPPPUCSDSHZEZ-UHFFFAOYSA-N')
> get_classification(bad_key[1])
✔ QEVGZEDELICMKH-UHFFFAOYSA-N
Error: Columns `source`, `source_id`, `annotations` must be 1d atomic vectors or lists
Call `rlang::last_error()` to see a backtrace.
> get_classification(bad_key[2])
✔ SYLAFCZSYRXBJF-UHFFFAOYSA-N
Error: Columns `source`, `source_id`, `annotations` must be 1d atomic vectors or lists
Call `rlang::last_error()` to see a backtrace.
> get_classification(bad_key[3])
✔ BOPPPUCSDSHZEZ-UHFFFAOYSA-N
Error: Columns `source`, `source_id`, `annotations` must be 1d atomic vectors or lists
Call `rlang::last_error()` to see a backtrace.
Using a web browser works fine:
http://classyfire.wishartlab.com/entities/QEVGZEDELICMKH-UHFFFAOYSA-N
http://classyfire.wishartlab.com/entities/SYLAFCZSYRXBJF-UHFFFAOYSA-N
http://classyfire.wishartlab.com/entities/BOPPPUCSDSHZEZ-UHFFFAOYSA-N
Could you please have a look?
and here comes the stacktrace as suggested by @sneumann
rlang::last_trace()
<error/rlang_error>
Columns `source`, `source_id`, `annotations` must be 1d atomic vectors or lists
Backtrace:
█
1. └─base::sapply(inchikeys, get_classification)
2. └─base::lapply(X = X, FUN = FUN, ...)
3. └─classyfireR:::FUN(X[[i]], ...)
4. └─classyfireR:::parse_external_desc(json_res)
5. └─tibble::tibble(...)
6. ├─tibble::as_tibble(lst_quos(xs, expand = TRUE))
7. └─tibble:::as_tibble.list(lst_quos(xs, expand = TRUE))
8. └─tibble:::list_to_tibble(x, validate)
9. └─tibble:::check_tibble(x)
10. └─tibble:::invalid_df(...)
11. └─tibble:::stopc(...)
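Since the backtrace shows the failure happening inside a plain sapply() call, a single bad key aborts the whole batch. As a stopgap, the call could be wrapped in tryCatch() so that failing keys are recorded rather than fatal. The sketch below is only an illustration: classify_safely() and fake_classifier() are hypothetical names, and fake_classifier() is a stand-in for get_classification() so that the example runs offline.

```r
# Hypothetical helper: apply `fn` to each key and return NULL for keys that error,
# instead of letting the first error stop the whole batch.
classify_safely <- function(keys, fn) {
  results <- lapply(keys, function(key) {
    tryCatch(
      fn(key),
      error = function(e) {
        message("Failed for ", key, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- keys
  results
}

# Stand-in for get_classification(); it fails for one key on purpose.
fake_classifier <- function(key) {
  if (key == "QEVGZEDELICMKH-UHFFFAOYSA-N") {
    stop("columns must be 1d atomic vectors or lists")
  }
  paste("classified", key)
}

keys <- c("QEVGZEDELICMKH-UHFFFAOYSA-N", "JIVPVXMEBJLZRO-UHFFFAOYSA-N")
res <- classify_safely(keys, fake_classifier)

# Keys whose classification failed
failed <- names(res)[vapply(res, is.null, logical(1))]
```

In real use, get_classification would presumably be passed as `fn`; the keys collected in `failed` can then be reported or retried separately.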
Working InChIKeys are, for example:
good_key <-
c('JIVPVXMEBJLZRO-UHFFFAOYSA-N',
'ZZUFCTLCJUWOSV-UHFFFAOYSA-N',
'QZTKDVCDBIDYMD-UHFFFAOYSA-N')
This is the pure JSON returned for one of them: QEVGZEDELICMKH-UHFFFAOYSA-N.json.txt
The error is thrown in
Line 74 in bd6a217
I am on a current snapshot of R-devel and get a slightly different error message, certainly from the same underlying issue. Checking the JSON, I see that there are no "source", "source_id", or "annotations" entries.
> get_classification(bad_key[1])
✔ QEVGZEDELICMKH-UHFFFAOYSA-N
Error: All columns in a tibble must be 1d or 2d objects:
* Column `source` is NULL
* Column `source_id` is NULL
* Column `annotations` is NULL
Call `rlang::last_error()` to see a backtrace
Doing things manually
response <- httr::GET("http://classyfire.wishartlab.com/entities/QEVGZEDELICMKH-UHFFFAOYSA-N.json")
text_content <- httr::content(response, 'text')
json_res <- jsonlite::fromJSON(text_content)
classification <- classyfireR:::parse_json_output(json_res)
I get
> classification
# A tibble: 4 x 3
Level Classification CHEMONT
<chr> <chr> <chr>
1 kingdom Organic compounds CHEMONTID:0000000
2 superclass Organic acids and derivatives CHEMONTID:0000264
3 class Carboxylic acids and derivatives CHEMONTID:0000265
4 subclass Dicarboxylic acids and derivatives CHEMONTID:0000346
Checking one of the working InChIKeys: http://classyfire.wishartlab.com/entities/JIVPVXMEBJLZRO-UHFFFAOYSA-N.json
it does have
...
"external_descriptors":[
{"source":"CHEBI",
"source_id":"CHEBI:3654",
"annotations":["sulfonamide","monochlorobenzenes","isoindoles"]
}
]
...
while the bad keys have "external_descriptors":[].
So, in
classyfireR/R/get_classification.R
Line 71 in bd6a217
we need a check for
length(json_res$external_descriptors)>0
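A minimal sketch of such a guard, assuming the JSON has been parsed with jsonlite::fromJSON() as in the manual steps above. The function name parse_external_desc_safe() is hypothetical and this is not the package's actual code; the inline JSON strings are trimmed-down stand-ins for the real responses.

```r
library(jsonlite)

# Hypothetical helper: build a data frame of external descriptors, returning
# a zero-row data frame when "external_descriptors" is an empty array.
parse_external_desc_safe <- function(json_res) {
  ext <- json_res$external_descriptors

  if (length(ext) == 0) {
    # Nothing to parse: return an empty result with the expected columns
    empty <- data.frame(
      source = character(0),
      source_id = character(0),
      stringsAsFactors = FALSE
    )
    empty$annotations <- list()
    return(empty)
  }

  out <- data.frame(
    source = ext$source,
    source_id = ext$source_id,
    stringsAsFactors = FALSE
  )
  # Keep the annotations as parsed (one entry per descriptor)
  out$annotations <- ext$annotations
  out
}

# Trimmed-down stand-ins for a "bad" and a "good" response
bad <- fromJSON('{"external_descriptors":[]}')
good <- fromJSON('{"external_descriptors":[{"source":"CHEBI","source_id":"CHEBI:3654","annotations":["sulfonamide","monochlorobenzenes","isoindoles"]}]}')

bad_desc <- parse_external_desc_safe(bad)
good_desc <- parse_external_desc_safe(good)
```

Returning an empty data frame with the same column names, rather than NULL, means downstream code can treat both cases uniformly.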
Yours, Steffen
Yeah, this happens when no external descriptors are present. I will add a length check and push a new version to the devel branch.
Tom
This is now fixed on the devel branch, which you can install using:
remotes::install_github('aberHRML/classyfireR', ref = 'devel')
I will add length checks for all the other components, in case further InChIKeys are missing elements of the JSON output. I should be able to get the fixed version onto CRAN by Monday.
Thanks
Tom
Thanks for the fast fix. It's working now.