pdf-text-analysis

This tool is used for:

~~text processing and analysis of multiple folders of multiple PDF files~~ (currently not working)
Combining search results from multiple journal databases and annotating articles with journal rankings

Problem

Google Scholar too often returns "unscholarly" articles.
Using "scholarly" journal databases instead returns the same articles from several databases, and the multiple search results then have to be combined into a single spreadsheet by hand.

If a researcher had two research questions:
RQ1: How are literature reviews automated?
RQ2: What are the meta concepts of literature reviews?

Then search keyword sets would look something like:
SKS1: (literature AND review) AND (automated)
SKS2: (literature AND review) AND (meta)

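Searching for these SKSs across five journal databases would result in 10 result sets (2 keyword sets × 5 databases), which would then have to be combined, deduplicated, and annotated with journal rankings by hand.

With this tool, these 10 result sets still must be searched and downloaded, but the combining, deduplication, and ranking are now automated.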
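
The count of 10 comes from pairing every search keyword set with every database. The snippet below is a minimal sketch of that bookkeeping and is not part of this tool. Only Web of Science and Scopus are named in this README, so the other three database names are placeholders.

```python
from itertools import product

# The two search keyword sets from the example above.
search_keyword_sets = {
    "sks1": "(literature AND review) AND (automated)",
    "sks2": "(literature AND review) AND (meta)",
}

# Web of Science and Scopus appear in this README; the remaining names are
# placeholders standing in for three other journal databases.
databases = ["webofknowledge", "scopus", "database_3", "database_4", "database_5"]

# Every (keyword set, database) pair is one search to run and one export to download.
result_sets = list(product(search_keyword_sets, databases))

print(len(result_sets))
for sks_name, database in result_sets:
    print(f"{sks_name} on {database}: {search_keyword_sets[sks_name]}")
```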

Download this codebase
Go to webofknowledge.com and search for (literature AND review) AND (automated).
Click on "Export" and download the Excel file of the results.
Create a folder called sks1 inside the input folder
Move your downloaded file into sks1
Repeat the search, export, and move steps on scopus.com
Repeat the searches on both sites with (literature AND review) AND (meta), this time using a folder named sks2
Run the program
Open combined_searches.xlsx and see that your search keyword set results have been combined, duplicates have been removed, journal rankings have been assigned, and the data has been normalized.
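
To give a feel for what combining, deduplicating, and ranking involve, here is a minimal sketch using plain Python dictionaries. It is an illustration under assumptions, not the code in runner.py: the field names (`title`, `journal`), the title-based duplicate check, and the ranking table are all hypothetical.

```python
import re

def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def combine_result_sets(result_sets, journal_rankings):
    """Merge records from several exports, dropping duplicates by normalized title.

    result_sets maps a search keyword set name (e.g. "sks1") to a list of
    record dicts with "title" and "journal" keys. These field names are
    assumptions for illustration, not the tool's real column names.
    """
    combined = {}
    for sks_name, records in result_sets.items():
        for record in records:
            key = normalize_title(record["title"])
            if key not in combined:
                journal = record["journal"].strip()
                combined[key] = {
                    "title": record["title"].strip(),
                    "journal": journal,
                    "ranking": journal_rankings.get(journal.lower(), "unranked"),
                    "found_in": [sks_name],
                }
            elif sks_name not in combined[key]["found_in"]:
                # Duplicate article: only record which keyword set also found it.
                combined[key]["found_in"].append(sks_name)
    return list(combined.values())

# Hypothetical records standing in for rows of the downloaded exports.
example_sets = {
    "sks1": [
        {"title": "Automating Literature Reviews", "journal": "Journal A"},
        {"title": "automating literature reviews.", "journal": "Journal A"},
    ],
    "sks2": [
        {"title": "Automating literature reviews", "journal": "Journal A"},
        {"title": "Meta Concepts of Reviews", "journal": "Journal B"},
    ],
}
example_rankings = {"journal a": "A*"}  # hypothetical ranking table

rows = combine_result_sets(example_sets, example_rankings)
for row in rows:
    print(row)
```

Matching on a normalized title is only one possible duplicate check; a real export may also offer a DOI or similar identifier that would be more reliable.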

Check that python and git are properly installed. If not, google how to do that.
Check that pip is installed. If not, google how to do that.
Git clone this project
Run pip install -r requirements.txt
Run python3 runner.py