olivettigroup/article-downloader

Query Construction


This looks interesting, but I'm not sure how to construct queries properly. How should the queries in the CSV be constructed to retrieve DOIs from Elsevier, say?

The queries are just strings of keywords, the same strings you might type into a search box (e.g. "Li-ion battery" or "cancer treatment"). All you need to do is enter one query per line, with each line ending in a comma. Hopefully the example in the README is useful - let me know if you need further explanation.
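As a rough illustration of that layout, here is a minimal sketch that writes one query per line with a trailing comma. The helper name `write_queries` is made up for this example, and it writes to an in-memory buffer so it runs without touching disk.

```python
import io

# Example queries taken from the comment above; substitute your own.
queries = ["Li-ion battery", "cancer treatment"]

def write_queries(fileobj, queries):
    """Hypothetical helper: write one query per line, each ending in a comma."""
    for query in queries:
        fileobj.write(query + ",\n")

# io.StringIO stands in for open("my_queries.csv", "w").
buffer = io.StringIO()
write_queries(buffer, queries)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a real file, the same helper could be called inside `with open("my_queries.csv", "w") as f:`. Note that this simple sketch assumes the queries themselves contain no commas.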

Thank you. In that case, I'm still getting a traceback:

from articledownloader.articledownloader import ArticleDownloader
downloader = ArticleDownloader()
queries = downloader.load_queries_from_csv("my_queries.csv")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\anaconda\lib\site-packages\articledownloader\articledownloader.py", line 76, in load_queries_from_csv
    csvf.seek(0)
AttributeError: 'str' object has no attribute 'seek'

I'll post an image of "my_queries.csv" below:
[image: my_queries]

Thanks for the catch - there was a mistake in the README. That method expects an open file object, not the name of the file. So, your line should read something like:

queries = downloader.load_queries_from_csv(open("my_queries.csv", "r"))

The README has been updated to reflect this. Let me know if that works for you.
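To see why the file name fails while a file object works, here is a small sketch. The function `load_queries_sketch` is a hypothetical stand-in, not the library's code: it only mirrors the `csvf.seek(0)` call visible in the traceback and then reads the first column of each row.

```python
import csv
import io

def load_queries_sketch(csvf):
    """Hypothetical stand-in for illustration, NOT the library's implementation."""
    csvf.seek(0)  # works on file objects, fails on a plain str
    return [row[0] for row in csv.reader(csvf) if row]

# Passing a file name (a str) reproduces the AttributeError from the traceback.
try:
    load_queries_sketch("my_queries.csv")
except AttributeError as err:
    error_message = str(err)
    print(error_message)

# Passing a file-like object works. io.StringIO stands in for
# open("my_queries.csv", "r") so the sketch runs without a file on disk.
with io.StringIO("Li-ion battery,\ncancer treatment,\n") as csvf:
    queries = load_queries_sketch(csvf)
print(queries)
```

With a real file, opening it in a `with open("my_queries.csv", "r") as csvf:` block and passing `csvf` to `downloader.load_queries_from_csv` should have the same effect as the one-liner above, while also closing the file afterwards.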

Thank you. One quick follow-up: in the README, should dois = set(dois) read dois = set(piis)? I assume so, because we don't have a dois object. If that is the case, I'm told piis is not hashable when I call set().

It looks like piis is a list that contains a set; have you already deduped the piis?
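If that is what is happening, the error can be reproduced with a short sketch like the one below. The PII strings are placeholders, and the assumption that `piis` is a list holding a single set comes only from the guess above, not from the library's documentation.

```python
# Placeholder values for illustration only; real PIIs look different.
piis = [{"PII-A", "PII-B"}]  # a list that contains a set

# set() needs every element to be hashable, and a set is not.
try:
    set(piis)
except TypeError as err:
    error_message = str(err)
    print(error_message)

# One way to flatten: union all inner sets into a single deduplicated set.
unique_piis = set().union(*piis)
print(sorted(unique_piis))
```

If the inner set is already deduplicated, taking `piis[0]` directly would also work under the same assumption.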

Thanks again for that catch. Yeah, the dois = set(dois) line was superfluous. I've updated the README so that it should be correct now. Please let me know if you run into any other errors.