ropensci/rcrossref

cr_works() gives the wrong paper

LilianYou opened this issue · 5 comments

I would like to use cr_works() to retrieve the DOI of a paper titled "Groundwater discharge dynamics into a salt marsh tidal river".

First, I tried passing query.title directly to cr_works():

cr_works(query.title = "Groundwater discharge dynamics into a salt marsh tidal river")
$meta
  total_results search_terms start_index items_per_page
1     106697429           NA           0             20

It gave me a list of wrong papers, and search_terms is NA.

Next, I tried a general query instead, listing the 5 papers with the highest scores:

cr_works(q = trimws("Groundwater discharge dynamics into a salt marsh tidal river"), limit = 5, sort = "score")
$meta
  total_results                                                 search_terms start_index items_per_page
1       1138801 Groundwater discharge dynamics into a salt marsh tidal river           0              5

It gave me another list of wrong papers. The search terms seem to be "tidal" and "river" this time, so the DOI it gives me belongs to a totally different paper, "Tidal discharge asymmetry in a salt marsh drainage system1,2".

Finally, I went to the Crossref website and searched for it manually. The paper exists and has a DOI:

https://search.crossref.org/?q=Groundwater+discharge+dynamics+into+a+salt+marsh+tidal+river


It's obvious that cr_works() gives the wrong paper

Session Info
Put the output of sessionInfo() here

thanks @LilianYou for the report! I edited your question a bit to separate the question from the session info area. Please include the output of running sessionInfo() or sessioninfo::session_info() in the section above

you need to pass field queries as a named list to the flq parameter - there are a few examples in the docs, e.g., try

cr_works(flq = list(query.title = "Groundwater discharge dynamics into a salt marsh tidal river"))
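The call above can be wrapped so that only the columns needed to check the match are shown. This is a minimal sketch: `find_title_candidates` is a hypothetical helper, not part of rcrossref, and it assumes the returned data has `doi` and `title` columns (a `score` column is shown only if present).

```r
# Sketch: look up candidate records for a title via a field query.
# `find_title_candidates` is a hypothetical helper, not part of rcrossref.
find_title_candidates <- function(title, limit = 5) {
  # Field queries go in a named list passed to `flq`.
  res <- rcrossref::cr_works(flq = list(query.title = title), limit = limit)
  # Keep only columns useful for eyeballing the match; intersect() guards
  # against a column being absent from the returned data.
  cols <- intersect(c("doi", "title", "score"), names(res$data))
  res$data[, cols, drop = FALSE]
}

# Requires network access and the rcrossref package:
# find_title_candidates("Groundwater discharge dynamics into a salt marsh tidal river")
```

Checking the returned titles by eye is still worthwhile, since the top row is not guaranteed to be the paper you want.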

Hi @sckott,

I just tried the exact code you gave me. The correct paper shows up second in the data returned by cr_works(). The first result is actually the second one in the Crossref website search: https://search.crossref.org/?q=Groundwater+discharge+dynamics+into+a+salt+marsh+tidal+river

Is there a reason why the first and second papers switch positions between cr_works() and the Crossref metadata search?

I don't know, that's up to how Crossref implements its search.
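Since the ranking is decided on Crossref's side, one workaround is to not rely on the first row and instead pick the result whose title matches exactly. The sketch below does this on a mock data frame standing in for cr_works()$data. `normalize_title` and `pick_exact_title` are hypothetical helpers, and the DOIs in the mock data are made up.

```r
# Normalize a title for comparison: lower-case, replace punctuation
# with spaces, collapse runs of whitespace.
normalize_title <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]+", " ", x)
  trimws(gsub("\\s+", " ", x))
}

# Return the rows of `works` whose title equals `wanted` after normalization.
# `works` is assumed to have a character `title` column.
pick_exact_title <- function(works, wanted) {
  # which() drops NA comparisons, so rows with a missing title are skipped.
  keep <- which(normalize_title(works$title) == normalize_title(wanted))
  works[keep, , drop = FALSE]
}

# Mock results standing in for cr_works()$data; the DOIs are made up.
works <- data.frame(
  doi = c("10.0000/example.1", "10.0000/example.2"),
  title = c("Tidal discharge asymmetry in a salt marsh drainage system",
            "Groundwater discharge dynamics into a salt marsh tidal river"),
  stringsAsFactors = FALSE
)

hit <- pick_exact_title(
  works,
  "Groundwater Discharge Dynamics into a Salt Marsh Tidal River."
)
hit$doi
```

Exact matching after normalization is deliberately strict. If the recorded title differs from yours (a subtitle, footnote markers, a different spelling), this returns no rows and you fall back to inspecting the candidates by hand.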

You can try to refine your search by passing other field queries, e.g., query.container-title, query.author, query.affiliation, etc.
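A sketch of combining several field queries in one `flq` list follows. `build_flq` is a hypothetical helper and the author value is a placeholder to fill in. Note that a hyphenated name such as query.container-title is not a syntactically valid R name, so it has to be set with `[[ ]]` or quoted with backticks.

```r
# Sketch: build an `flq` list from whichever extra details are known.
# `build_flq` is a hypothetical helper, not part of rcrossref.
build_flq <- function(title, author = NULL, container_title = NULL) {
  flq <- list(query.title = title)
  if (!is.null(author)) flq[["query.author"]] <- author
  # The hyphen means this name must be set via [[ ]] or backticks.
  if (!is.null(container_title)) {
    flq[["query.container-title"]] <- container_title
  }
  flq
}

flq <- build_flq(
  "Groundwater discharge dynamics into a salt marsh tidal river",
  author = "<author surname>"  # placeholder, replace with the real value
)
names(flq)

# Requires network access and the rcrossref package:
# rcrossref::cr_works(flq = flq, limit = 5)
```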

I see. Thank you, Scott!