get_publications 404 errors
gnk02 opened this issue · 1 comments
I have a long data frame of authors with their Google Scholar IDs. Calling get_publications on each ID in turn produces 404 warnings at random, and NA is returned for the affected author. For example:
get_publications("ri2FkCgAAAAJ")
[1] NA
Warning message:
In get_scholar_resp(url) :
Page 404. Please check whether the provided URL is correct.
It is strange that the error is random. For example, with a table of about 1000 authors, the first error may appear for the author in row 150, after all previous authors were processed without problems. If I run the code again some time later, the problem may appear for a different author.
I also tried introducing random wait times between each search, but the problem persists.
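For reference, combining random waits with a retry could look roughly like the sketch below. It relies only on the behaviour shown above (get_publications returns NA instead of a data frame when the 404 occurs). The name `fetch_with_retry` and the `max_tries`, `min_wait` and `max_wait` parameters are hypothetical, and nothing here is guaranteed to avoid the 404s.

```r
# Hypothetical helper: retry a failed lookup after a random pause.
# `fetch` defaults to scholar::get_publications; it is an argument only so
# that the function can be tried with a stub instead of a live request.
fetch_with_retry <- function(id,
                             fetch = scholar::get_publications,
                             max_tries = 3,
                             min_wait = 2,
                             max_wait = 10) {
  for (attempt in seq_len(max_tries)) {
    pubs <- fetch(id)
    # A successful call returns a data frame; the 404 case returns NA.
    if (is.data.frame(pubs)) {
      return(pubs)
    }
    # Wait a random number of seconds before the next attempt.
    if (attempt < max_tries) {
      Sys.sleep(runif(1, min_wait, max_wait))
    }
  }
  NA  # every attempt failed
}
```

Usage would be `fetch_with_retry("ri2FkCgAAAAJ")` in place of the direct get_publications call.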
Any ideas?
What I basically want to do is get, for each author, the number of publications in a given range of years and the number of citations to those publications.
I also attach a csv file with the list of authors.
author_table_scholar.csv
The code I use to get the number of publications and the number of citations is the following:
library(scholar)
library(dplyr)

# Takes the scholar id of an author and a range of years, and
# returns the total number of citations of all publications in that range
citations_in_years <- function(id, start_year, end_year) {
  pubs_in_range <- get_publications(id) %>%
    filter(!is.na(year) & year >= start_year & year <= end_year)
  sum(pubs_in_range$cites)
}

# Returns the number of papers in a given range of years
papers_in_years <- function(id, start_year, end_year) {
  pubs_in_range <- get_publications(id) %>%
    filter(!is.na(year) & year >= start_year & year <= end_year)
  nrow(pubs_in_range)
}
# Add them to author_table
author_table <- author_table %>%
  rowwise() %>%
  mutate(num_of_papers_2016_2020 = papers_in_years(scholar_id, 2016, 2020),
         cites_of_papers_2016_2020 = citations_in_years(scholar_id, 2016, 2020))
Unfortunately this calls get_publications twice per author, but I couldn't find another way to do it with mutate.
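One way to avoid the second call might be to fetch each author's publications once and compute both numbers from the same result. The sketch below is written in base R so it does not depend on mutate. The names `summarise_author` and `add_author_stats` and the output columns `num_papers` and `num_cites` are hypothetical, and the `fetch` argument exists only so the code can be tried with a stub. It assumes, as in the output above, that a failed lookup returns NA rather than a data frame.

```r
# Hypothetical helper: one lookup per author, two summaries.
summarise_author <- function(id, start_year, end_year,
                             fetch = scholar::get_publications) {
  pubs <- fetch(id)
  # A failed lookup (the 404 case) returns NA instead of a data frame,
  # so report missing values rather than failing on `pubs$year`.
  if (!is.data.frame(pubs)) {
    return(data.frame(num_papers = NA_integer_, num_cites = NA_real_))
  }
  keep <- !is.na(pubs$year) & pubs$year >= start_year & pubs$year <= end_year
  data.frame(num_papers = sum(keep),
             num_cites = sum(pubs$cites[keep], na.rm = TRUE))
}

# Hypothetical helper: apply the summary to every row of a table that has
# a `scholar_id` column and bind the two new columns to it.
add_author_stats <- function(author_table, start_year, end_year,
                             fetch = scholar::get_publications) {
  stats <- lapply(author_table$scholar_id, summarise_author,
                  start_year = start_year, end_year = end_year,
                  fetch = fetch)
  cbind(author_table, do.call(rbind, stats))
}
```

With the table from the attached file loaded as `author_table`, the call would be `author_table <- add_author_stats(author_table, 2016, 2020)`. Authors whose lookup failed get NA in both new columns, which makes it easy to rerun only those rows later.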