ropensci/aRxiv

Should aRxiv be using the OAI-PMH interface rather than the simpler API?

kbroman opened this issue · 0 comments

The arXiv API has some limitations; e.g., even with repeated requests for different slices, you may not be able to get all manuscripts matching a particular search. See, for example, this response on the arxiv-api Google group, which notes that the result of an initial search is cached and that subsequent calls just return subsets of that cached result.

They suggest splitting the search into slices of time, but they also say that the OAI-PMH interface would be better for larger downloads.
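To make the time-slice idea concrete, here is a rough sketch (in Python, just for illustration; the package itself is R) of splitting one big search into date windows. It relies on the `submittedDate:[YYYYMMDDHHMM TO YYYYMMDDHHMM]` range filter described in the arXiv API manual. The helper names `date_slices` and `slice_query_url` are made up for this example, and the code only builds the request URLs; it does not fetch anything.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

API_URL = "http://export.arxiv.org/api/query"  # arXiv API query endpoint


def date_slices(start, end, days):
    """Yield inclusive (first_day, last_day) pairs that cover start..end."""
    cur = start
    while cur <= end:
        last = min(cur + timedelta(days=days - 1), end)
        yield cur, last
        cur = last + timedelta(days=1)


def slice_query_url(search, first_day, last_day, max_results=100):
    """Build one API request restricted to a single submission-date window."""
    window = "submittedDate:[{} TO {}]".format(
        first_day.strftime("%Y%m%d0000"),  # window opens at 00:00 on the first day
        last_day.strftime("%Y%m%d2359"),   # and closes at 23:59 on the last day
    )
    params = {
        "search_query": "({}) AND {}".format(search, window),
        "start": 0,
        "max_results": max_results,
    }
    return API_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical search: one category, one month, in 7-day windows.
    for first, last in date_slices(date(2014, 1, 1), date(2014, 1, 31), 7):
        print(slice_query_url("cat:stat.AP", first, last))
```

Each window is then a separate, smaller search, so none of them should depend on a large cached result set. The window length (7 days here) is arbitrary and would need tuning so that no single window exceeds what the API will return.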

I haven't looked at the OAI-PMH interface yet. I suspect it's better but more complicated.
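For a sense of what "more complicated" might mean, here is a minimal sketch of the generic OAI-PMH harvesting loop: issue a `ListRecords` request, read the `resumptionToken` from the response, and repeat with that token until it comes back empty. The verb, the arguments, the XML namespace, and the mandatory `oai_dc` metadata format come from the OAI-PMH 2.0 specification. The base URL is an assumption and should be checked against arXiv's current documentation, and the helper names are made up for this example. Again it is Python for illustration only and does no network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_URL = "http://export.arxiv.org/oai2"  # ASSUMED base URL; verify before use
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # namespace from the OAI-PMH spec


def list_records_url(from_date=None, until_date=None, token=None,
                     metadata_prefix="oai_dc"):
    """Build a ListRecords request URL (first page, or a follow-up page)."""
    if token:
        # Per the protocol, resumptionToken is an exclusive argument:
        # follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date    # e.g. "2014-01-01"
        if until_date:
            params["until"] = until_date  # e.g. "2014-01-31"
    return OAI_URL + "?" + urlencode(params)


def next_token(xml_text):
    """Return the resumption token in a response, or None if the list is complete."""
    root = ET.fromstring(xml_text)
    node = next(root.iter(OAI_NS + "resumptionToken"), None)
    if node is None or not (node.text or "").strip():
        return None  # missing or empty token means this was the last page
    return node.text.strip()
```

A harvester would start with `list_records_url(from_date=..., until_date=...)`, then keep calling `list_records_url(token=next_token(response))` until `next_token` returns `None`, pausing between requests as the server asks.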