ft_get() and ft_abstract() don't seem to work with scopus, even with a key
DomRoche opened this issue · 6 comments
Hi,
Unsure if anyone is maintaining fulltext since Scott Chamberlain moved on... but just in case:
I'm trying to find all papers that mention the word "ebird" using fulltext.
I got an API key from Elsevier and added it as an environment variable:
Sys.setenv(ELSEVIER_SCOPUS_KEY = "keynumber")
Sys.setenv(ELSEVIER_TDM_KEY = "keynumber")
I get 2623 hits in scopus using ft_search (more than when using any other source, such as 'entrez'):
ll<-ft_search(query = 'ebird', from = 'scopus', limit=2650)
However, when I try to run ft_get(ll), I get the message "the following not supported and will be skipped: scopus".
I also run into problems when I try to retrieve the abstracts using the following:
DOIs <- ll$scopus$data[,12]
nn <- ft_abstract(x = DOIs, from = "scopus", scopusopts = list(key = "keynumber"))
It takes about 30s until I get the message "Error: The resource specified cannot be found."
I know that fulltext is no longer available on CRAN (got it from GitHub) and that it is no longer supported. However, I was under the impression, from all the reading I have done online, that it is possible to use scopus as a source with ft_abstract(). Has anything changed recently?
Thanks!
Dom
I don't know what's specifically going on here, but in general, turn on verbose curl output with crul::set_verbose() to see the curl headers. Check whether your API key is still valid at https://dev.elsevier.com/. They also check IP addresses, so you probably have to be on your VPN if you're not on campus (or the equivalent if you're at a company).
Hi Scott,
Many thanks for your speedy reply - much appreciated, especially given you are no longer supporting the package.
API key is valid and VPN was on. I turned on the verbose curl output (thanks for that suggestion!) and saw that the issue came from the vector of DOIs containing NAs. After removing those, ft_abstract() ran partly on the first try and completely on the second. Problem solved - thank you!
Of 2,623 papers found with ft_search(), 2,378 have DOIs. ft_abstract() returned abstracts for 2,305 DOIs.
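The NA clean-up described above can be sketched in base R as follows. The DOI values are made-up placeholders standing in for ll$scopus$data[,12], and the de-duplication step is an optional extra that is not mentioned in the thread.

```r
# Hypothetical stand-in for ll$scopus$data[, 12]; real values come from ft_search()
dois <- c("10.1000/example.1", NA, "10.1000/example.2", NA, "10.1000/example.1")

# Drop missing values first, then (optionally) drop repeated DOIs
clean_dois <- unique(dois[!is.na(dois)])

length(dois)        # number of search hits, including those without a DOI
length(clean_dois)  # number of DOIs that can be passed on to ft_abstract()
```

The cleaned vector, rather than the raw column, is what would then be passed as x to ft_abstract().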
Question (if you have time): how do I identify the 73 DOIs for which no abstract was returned? I see in the manual that it's possible to return errors with ft_get() [e.g. res$elife$errors] but that doesn't seem to be the case for ft_abstract().
I plan to get the abstracts for papers without DOIs using another identifier, as explained in section 7.2 of the manual.
RE ft_get(): I still get the same error message if I try to run ft_get(ll). However, it does run if I input the vector of DOIs without NAs. Unfortunately, it stalls relatively quickly and I get the error message:
Error in if (is.null(x) || is.na(x)) y else x :
missing value where TRUE/FALSE needed
I looked at the list of possible errors in the manual and there doesn't appear to be a mention of this one.
Wondering if you've seen this before and there is a quick fix?
Thanks a lot,
Dom
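One way to hit "missing value where TRUE/FALSE needed" with a check of the form shown in the error message is to pass it a zero-length value. Whether that is what happens inside ft_get() here is an assumption, not something confirmed in this thread. The function names below are hypothetical and only illustrate the pattern and a more defensive variant.

```r
# The check from the error message, wrapped in a hypothetical function name
unsafe_default <- function(x, y) if (is.null(x) || is.na(x)) y else x

# For a zero-length input, is.na(x) is logical(0), so the condition is not a
# clean TRUE/FALSE and if() fails; capture the error message instead of stopping
res <- tryCatch(
  unsafe_default(character(0), "fallback"),
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical safer variant: treat NULL, zero-length, and leading-NA values as missing
safe_default <- function(x, y) {
  if (is.null(x) || length(x) == 0 || is.na(x[[1]])) y else x
}

safe_default(character(0), "fallback")
safe_default("10.1000/example.1", "fallback")
```

If this is the cause, the record that triggers it is probably one where some field came back empty rather than NA, so running ft_get() on smaller batches of DOIs may help narrow down which one it is.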
Don't the DOIs that have no abstract still show up in the output of ft_abstract()? Like:
[[23]]
[[23]]$doi
[1] "10.1371/journal.pgen.1006173"
[[23]]$abstract
[1] ""
If so, then just filter the output of ft_abstract() to those list elements that have no abstract, something like:
Filter(function(x) !nzchar(x$abstract), out$scopus)
Good point! Yes, they do. Thanks
Weirdly, however, Filter(function(x) !nzchar(x$abstract), out$scopus) returns a list of length 0, whereas Filter(function(x) nzchar(x$abstract), out$scopus) returns a list of length 2305, as expected.
I haven't had to work with lists of lists (the output of ft_abstract) much... any idea why this option isn't working? I haven't found an alternative so far...
I'd inspect one of the list elements with no abstract. Perhaps where I get "" you are getting something else.
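A minimal sketch of that inspection, assuming the output is a list of elements with doi and abstract fields as in the example above. The data are made-up placeholders, and has_abstract() is a hypothetical helper, not part of fulltext. Note that nzchar(NA) is TRUE by default, so an NA abstract would be counted as present by the nzchar() filter. The last part covers the other possibility, that failed DOIs are missing from the output altogether.

```r
# Hypothetical stand-in for out$scopus; the real structure may differ
out_scopus <- list(
  list(doi = "10.1000/example.1", abstract = "Some abstract text"),
  list(doi = "10.1000/example.2", abstract = ""),
  list(doi = "10.1000/example.3", abstract = NA_character_),
  list(doi = "10.1000/example.4", abstract = NULL),
  list(doi = "10.1000/example.5", abstract = character(0))
)

# Look at the raw structure of one suspect element
str(out_scopus[[3]])

# TRUE only for a single, non-NA, non-blank character string
has_abstract <- function(x) {
  a <- x$abstract
  is.character(a) && length(a) == 1 && !is.na(a) && nzchar(trimws(a))
}

# Elements whose abstract is "", NA, NULL, or zero-length
no_abstract <- Filter(Negate(has_abstract), out_scopus)
missing_dois <- vapply(no_abstract, function(x) x$doi, character(1))

# If failed DOIs are absent from the output, compare against what was requested
requested <- c(paste0("10.1000/example.", 1:5), "10.1000/example.6")
returned <- vapply(out_scopus, function(x) x$doi, character(1))
not_returned <- setdiff(requested, returned)
```

Here missing_dois holds the DOIs that came back without a usable abstract, and not_returned holds the requested DOIs that never appeared in the output.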
This repository is about to be archived.
If you develop a related package, it might be in scope for https://ropensci.org/software-review/