Non-descriptive error
myhub = AnnotationHub()
snapshotDate(): 2021-05-18
getInfoOnIds(myhub, "AH72154")
myhub_id fetch_id title rdataclass status biocversion rdatadateadded rdatadateremoved
288111 AH72154 78900 org.Salmo_salar.eg.sqlite OrgDb Public 3.9 2019-05-02 NA
file_size
288111 161341440
myhub[["AH72154"]]
Error: Public
Hiya, the db is present as can be seen above, but I'm not sure what this error message means?
Sorry, yes, I need to improve the error messages for org packages. OrgDb packages are updated every release, so the OrgDb you are trying to access is likely too old for your version of R/Bioconductor. If we query for your species, there are indeed more recent versions with more accurate information:
> query(myhub, "org.Salmo")
AnnotationHub with 9 records
# snapshotDate(): 2021-09-23
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Salmo tshawytscha, Salmo trutta, Salmo salar, Salmo nerka, Salmo...
# $rdataclass: OrgDb
# additional mcols(): taxonomyid, genome, description,
# coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
# rdatapath, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH93861"]]'
title
AH93861 | org.Salmo_mykiss.eg.sqlite
AH93874 | org.Salmo_kisatch.eg.sqlite
AH93875 | org.Salmo_trutta.eg.sqlite
AH93881 | org.Salmo_salar.eg.sqlite
AH93888 | org.Salmo_tshawytscha.eg.sqlite
AH93896 | org.Salmo_namaycush.eg.sqlite
AH93905 | org.Salmo_alpinus.eg.sqlite
AH93910 | org.Salmo_nerka.eg.sqlite
AH93913 | org.Salmo_keta.eg.sqlite
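For anyone hitting the same error, here is a minimal sketch of looking up the record by title instead of hard-coding an old ID. It assumes network access to the hub and that the title org.Salmo_salar.eg.sqlite stays the same between releases; the variable names (hits, sasa) are only illustrative.

```r
library(AnnotationHub)

ah <- AnnotationHub()

# In the output above, the outdated ID (AH72154) is not listed by query(),
# so searching by title should surface the record for the running release.
hits <- query(ah, c("OrgDb", "org.Salmo_salar.eg.sqlite"))

# Show the matching IDs and the dates they were added before loading anything.
print(data.frame(id = names(hits), added = mcols(hits)$rdatadateadded))

# Assumption: exactly one record matches; stop rather than guess otherwise.
stopifnot(length(hits) == 1)
sasa <- hits[[1]]
```

Checking the length first avoids silently picking up a different record if the query ever matches more than one.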
Can I piggyback off of this issue?
I am currently working with Atlantic salmon, and I did some functional analysis last year in February based off of the OrgDb record that was available at the time.
I am repeating the analysis now with a different record (AH111638), and I am getting very different results in terms of number of GO terms picked up in Over Representation Analysis.
Am I able to see if this more recent record has replaced the old one? I do not remember the reference for the old one, nor did I write it down anywhere, since I used to create the object by doing sasa <- query(ah, c('OrgDb', 'Salmo salar'))[[1]].
We replace OrgDbs every release to have updated information. OrgDbs are closely associated with the Bioconductor release version and R version. You can tell the date a resource was added from the rdatadateadded field in the query information:
> query(ah, c('OrgDb', 'Salmo salar'))
AnnotationHub with 1 record
# snapshotDate(): 2023-10-05
# names(): AH111638
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Salmo salar
# $rdataclass: OrgDb
# $rdatadateadded: 2023-04-24
# $title: org.Salmo_salar.eg.sqlite
# $description: NCBI gene ID based annotations about Salmo salar
# $taxonomyid: 8030
# $genome: NCBI genomes
# $sourcetype: NCBI/UniProt
# $sourceurl: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, ftp://ftp.uniprot.org/p...
# $sourcesize: NA
# $tags: c("NCBI", "Gene", "Annotation")
# retrieve record with 'object[["AH111638"]]'
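Since [[1]] on a query result does not leave a trace of which record was loaded, one option is to save the identifying metadata next to the analysis. The following is only a sketch: the file name orgdb_provenance.csv and the column names are hypothetical, and it assumes network access to the hub.

```r
library(AnnotationHub)

ah <- AnnotationHub()
hits <- query(ah, c("OrgDb", "Salmo salar"))

# Collect what is needed to identify the record later: the AH ID,
# the date it was added, and the software versions in use.
provenance <- data.frame(
  id           = names(hits),
  title        = mcols(hits)$title,
  added        = mcols(hits)$rdatadateadded,
  snapshot     = as.character(snapshotDate(ah)),
  bioc_version = as.character(BiocManager::version()),
  r_version    = as.character(getRversion())
)

# Hypothetical output file; keep it with the analysis results.
write.csv(provenance, "orgdb_provenance.csv", row.names = FALSE)
```

With the ID and Bioconductor version written down, a later run can tell whether the record has been replaced.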
To replicate the analysis you would have to use the same versions of R and Bioconductor that were used at the time, likely Bioconductor 3.16:
> temp = ah[["AH107424"]]
Error: AH107424 is an OrgDb resource.
orgDb resources are generated for specific biocversions.
Requested resource works with biocversion: 3.16
To find a resource appropriate for your biocversion try the following query:
query(ah,'org.Salmo_salar.eg.sqlite')
As you can see, the error message for OrgDb resources has also been updated to be more descriptive, and it now states which Bioconductor version the requested resource works with, which helps when trying to replicate earlier findings.
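A rough sketch of what replicating under Bioconductor 3.16 could look like. It assumes a separate R 4.2 installation (the R version that Bioconductor 3.16 is built for) and network access; 3.16 is the maintainer's estimate above, not a confirmed value for the original analysis.

```r
# Run this inside an R 4.2 installation, not the current one.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Pin the Bioconductor release, then install AnnotationHub from it.
BiocManager::install(version = "3.16")
BiocManager::install("AnnotationHub")

library(AnnotationHub)
ah <- AnnotationHub()

# Query suggested by the error message; under 3.16 it should list
# the OrgDb record that belongs to that release.
hits <- query(ah, "org.Salmo_salar.eg.sqlite")
print(hits)
```

Keeping the older R and Bioconductor in a separate library or container avoids disturbing the current installation.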