fkeck/refdb

Tibble error when trying to download Chordate COI database from NCBI

Closed this issue · 2 comments

Hi François,

Thanks for the great code to download NCBI data into R. I have already managed to download a large amount of COI NCBI data for arthropods and other taxa. Unfortunately, I encounter an error when trying to download Chordate COI data.

Below is my code:

chordate_df_refdb<-refdb_import_NCBI("(cytochrome c oxidase subunit 1[Title] OR cytochrome c oxidase subunit I[Title] OR cytochrome oxidase subunit 1[Title] OR cytochrome oxidase subunit I[Title] OR COX1[Title] OR CO1[Title] OR COI[Title] AND mitochondrion[Filter] NOT environmental sample[Title] NOT environmental samples[Title] NOT environmental[Title] NOT uncultured[Title]) NOT unclassified[Title] NOT unidentified[Title] NOT unverified[Title]) AND (Chordata[organism])")

Here is the error message:
```
Downloading 420377 sequences from NCBI...

81000 (19.3%) sequences downloaded.
Something went wrong:
HTTP failure: 500
{"error":"error forwarding request","api-key":"2a04:6ec0:20d:c5a0:1cae:305c:821e:6390","type":"ip",
"status":"ok"}
Retrying in 0 s.
Error:
! Tibble columns must have compatible sizes.
• Size 200: Existing data.
• Size 204: Column id.
ℹ Only values of size one are recycled.
```

It seems that, for some reason, downloading a specific NCBI entry creates columns of a different length than the rest of the data (204 ids against 200 existing rows in the message above). Would you by any chance be able to fix this bug?
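For illustration, the same class of error can be reproduced outside of refdb. This is only a sketch, assuming the tibble package is installed; the column names and values are hypothetical and this is not refdb's internal code. It builds a tibble from one column of 200 values and an `id` column of 204 values, which tibble refuses because only length-one values are recycled.

```r
# Illustration only: NOT refdb's internal code.
# Column names and values are hypothetical.
library(tibble)

sequences <- rep("ACGT", 200)    # 200 parsed records
ids <- sprintf("id_%d", 1:204)   # 204 ids for the same batch

# tibble() requires all columns to have the same size (or size one),
# so this call fails; the error object is captured instead of stopping.
res <- tryCatch(
  tibble(sequence = sequences, id = ids),
  error = function(e) e
)

# Print the captured error message.
cat(conditionMessage(res), "\n")
```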

Many thanks!!

Mark

fkeck commented

Hi Mark,
I am looking into your issue, but problems of this kind are most likely on NCBI's side and are hard to investigate.
I will keep you informed of my progress here.
François

fkeck commented

Hi Mark,
I just tested with your command and it worked after the patch 8e7d896.
Please use the GitHub version of the package, as no CRAN release is planned soon.
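One common way to install the GitHub version is sketched below. It assumes the remotes package is used for the installation (other tools such as devtools or pak would also work) and installs the default branch of the repository.

```r
# Sketch: install the development version of refdb from GitHub.
# Assumes the 'remotes' package; it is installed first if missing.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Installs the current default branch of fkeck/refdb.
remotes::install_github("fkeck/refdb")
```

Restart the R session after installing so that the updated package is loaded.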
François