I have a lot of Google Scholar alerts for research articles I am interested in. They generate a lot of email each week that needs to be sifted, so I developed this simple tool to help. It reads emails from Google Scholar and www.researchgate.net and extracts the links to articles. You have to save the emails in .eml format somewhere on your disk. Point the script at that folder and it will read each email and find every link (the href attribute of each anchor tag). It deduplicates the links and lists them along with their descriptions. Optionally, it can open PDF links directly in your browser or open a CSV list of the links in your favorite spreadsheet.
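The read, extract, and deduplicate steps described above can be sketched with the Python standard library. This is a minimal illustration, not the tool's actual code: the names LinkCollector, links_from_html, links_from_eml, and collect_links are hypothetical, and it assumes each email carries an HTML body and that the link text is a good enough description.

```python
from email import policy
from email.parser import BytesParser
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from anchor tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []    # (url, description) pairs in document order
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            description = " ".join("".join(self._text).split())
            self.links.append((self._href, description))
            self._href = None


def links_from_html(html):
    """Return every (url, description) pair found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def links_from_eml(path):
    """Parse one .eml file and return the links in its HTML body."""
    with open(path, "rb") as handle:
        message = BytesParser(policy=policy.default).parse(handle)
    body = message.get_body(preferencelist=("html",))
    if body is None:  # assumption: emails without an HTML part are skipped
        return []
    return links_from_html(body.get_content())


def collect_links(folder):
    """Deduplicate links across all .eml files, keeping the first description."""
    unique = {}
    for eml in sorted(Path(folder).expanduser().glob("*.eml")):
        for url, description in links_from_eml(eml):
            unique.setdefault(url, description)
    return unique
```

Calling collect_links on the folder of saved emails returns a dictionary that maps each unique URL to the first description seen for it.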
This project uses Rye as its build and dependency manager, so you need Rye to get it up and running from the repository. There are instructions for installing Rye on many different systems. The instructions here cover Linux and Windows. See the installation guide for other operating systems.
You have to install Rye on your system first. Follow the installation guide for your operating system.
Why Rye? That is a good question. Python is a great language, but it is tough to create a reproducible environment. You have to have the correct version of Python installed or available. You have to have the correct tools configured. If you are on Linux/BSD, you have to make sure that your work doesn't mess up your system Python installation. This is fairly trivial if you are experienced, but annoying to do over and over again. If you are new, it can be extremely difficult.
Rye takes care of handling the different versions of Python and managing the tools you need for a reproducible environment, particularly if you are doing cross-platform work.
For Linux, you can use the following:
curl -sSf https://rye-up.com/get | bash
There are also good guides to configuring Rye for your shell. Here is what I had to do to get it working in ZSH on my system.
Edit .zshrc:
vi ~/.zshrc
Add the following:
source "$HOME/.rye/env"
Restart the terminal and type rye. To add shell completion, you can:
mkdir $ZSH_CUSTOM/plugins/rye
rye self completion -s zsh > $ZSH_CUSTOM/plugins/rye/_rye
For Windows, download the installer listed in the installation guide link. On either platform, you can update an existing Rye installation to the latest version with:
rye self update
Once you have Rye properly installed, you can run rye sync to build (or update) the virtual environment.
Create/Update Virtual Environment
rye sync
NOTE: This needs to be run from within the repository. If you add new dependencies or modify pyproject.toml, run rye sync again.
You can add the following alias to your .zshrc or .bashrc, or you can run the activate script directly:
# Python Virtual Environment Alias
alias activate="source .venv/bin/activate"
NOTE: On Windows, the virtual environment includes activate.ps1, a PowerShell script that you can run instead.
Execute the script:
extract "~/tmp/extract tbird email" --verbose --launch-pdf
OR
extract "~/tmp/extract tbird email" --verbose --launch-csv
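As a rough illustration of what the two launch options do with the deduplicated links, here is a minimal sketch. It is not the tool's actual implementation: is_pdf_link, write_csv, and launch_pdfs are hypothetical names, the links argument is assumed to be a dictionary of URL to description, and the PDF check is a simple suffix test that would miss links wrapped in redirects.

```python
import csv
import webbrowser
from urllib.parse import urlparse


def is_pdf_link(url):
    """Guess whether a URL points at a PDF by looking only at its path."""
    return urlparse(url).path.lower().endswith(".pdf")


def write_csv(links, csv_path):
    """Write a url -> description mapping to a two-column CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["url", "description"])
        writer.writerows(links.items())


def launch_pdfs(links, opener=webbrowser.open_new_tab):
    """Open each PDF-looking link with opener and return the URLs opened."""
    opened = [url for url in links if is_pdf_link(url)]
    for url in opened:
        opener(url)
    return opened
```

The opener parameter defaults to opening a browser tab; passing a different callable makes it possible to check which links would be opened without launching anything.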
Please refer to LICENSE.md.