To set up the scraper, follow the steps below.
First, a .env file needs to exist in the top-level directory. It should look like:
ACCOUNT_USERNAME=<TODO>
PASSWORD=<TODO>
MFA_TOKEN=<TODO>
# The contents of the keys.json file for the Google Sheets service, as a single-line string.
GOOGLE_SHEETS_CREDENTIALS=<TODO>
Note that MFA_TOKEN is needed for Mint accounts with 2FA enabled, and corresponds to the secret token that Google Authenticator uses. With this token, the app can generate one-time passwords (OTPs) on its own. If it is not provided, the app falls back to the text-message method, which requires user interaction.
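As an illustration of how an app can derive the one-time password from MFA_TOKEN, here is a minimal sketch of the standard TOTP algorithm (RFC 6238) using only the Python standard library. It assumes the token is the base32 secret shown when enrolling an authenticator app. The function name totp and its parameters are hypothetical and are not taken from the scraper's code, which may well use a library instead.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None, digits: int = 6, step: int = 30) -> str:
    """Return the RFC 6238 time-based one-time password for a base32 secret.

    Hypothetical helper for illustration; not the scraper's actual code.
    """
    # Authenticator secrets are base32; normalize spacing/case and restore padding.
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)

    # The moving factor is the number of `step`-second intervals since the epoch.
    now = time.time() if at is None else at
    counter = int(now // step)

    # HMAC-SHA1 over the 8-byte big-endian counter.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()

    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window,
    # and the top bit of that window is masked off.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF

    return str(code % 10 ** digits).zfill(digits)
```

Under these assumptions, calling totp(os.environ["MFA_TOKEN"]) would yield the same 6-digit code that the authenticator app currently displays, provided the machine's clock is accurate.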
Additionally, you need a keys.json file containing valid keys for accessing the Google Spreadsheet. This is just a JSON file that you should be able to download from the Google Cloud Console. You need to paste its contents into the GOOGLE_SHEETS_CREDENTIALS variable above as a single string.
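Since the downloaded keys.json is usually pretty-printed across many lines, one way to turn it into a single string is to re-serialize it compactly. The sketch below assumes nothing about the file beyond it being valid JSON; the helper name to_single_line is hypothetical.

```python
import json


def to_single_line(json_text: str) -> str:
    """Re-serialize JSON text on one line, without pretty-printing whitespace."""
    # Parsing first also validates that the file really is JSON.
    data = json.loads(json_text)
    # Compact separators drop the spaces after ',' and ':'; newlines inside
    # string values (such as a private key) stay escaped as \n.
    return json.dumps(data, separators=(",", ":"))
```

You could then read keys.json, pass its text through to_single_line, and paste the result after GOOGLE_SHEETS_CREDENTIALS= in .env. Whether the value needs surrounding quotes depends on the tool that loads .env, so treat that detail as something to verify.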
Note that we rely on pipenv to automatically load the variables from .env into your environment. If you are not using pipenv, you will need to load them in some other way.
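If you are not using pipenv, one possible way to load the variables is a small loader run before the scraper starts. The following is a minimal sketch that handles only plain KEY=VALUE lines, blank lines, and # comments; parse_env and load_env are hypothetical names, and a maintained package such as python-dotenv is likely a better choice for anything beyond this.

```python
import os
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comment lines."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env(path: str = ".env") -> None:
    """Copy variables from the file into os.environ without overriding existing ones."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env(handle.read()).items():
            os.environ.setdefault(key, value)
```

Calling load_env() from the project root before reading os.environ would make ACCOUNT_USERNAME, PASSWORD, and the other variables available in the same way pipenv does.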
As of the latest update, we recommend using pipenv and pyenv to keep the dependency setup hermetic. The other options are left here only for reference, as they are not maintained or tested often.
On Mac, make sure you have Homebrew installed. You can install pipenv and pyenv with:
brew install pipenv
brew install pyenv
After installing, navigate to the root of the project directory and run:
pipenv install
This will install all the required dependencies as well as the appropriate Python version (using pyenv). You can then run:
pipenv run python scraper.py --type='all'
This runs the script with these installs. Alternatively, you can drop into the installed Python environment at the shell level with:
pipenv shell
For development purposes, you will also want to install the dev dependencies by running pipenv install -d.
You should be able to type check by running:
pipenv run mypy .
You can run all unit tests with the command:
pipenv run pytest
You can format the code with the command:
pipenv run black .
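If you find yourself running the three checks above together, they can be chained in a small script. This is only a sketch: the run_checks helper and the CHECKS list are hypothetical and not part of the repository, and the script assumes pipenv is on your PATH.

```python
import subprocess
from typing import Callable, List, Sequence

# The same commands listed above, in the order they are described.
CHECKS: List[List[str]] = [
    ["pipenv", "run", "mypy", "."],
    ["pipenv", "run", "pytest"],
    ["pipenv", "run", "black", "."],
]


def run_checks(run: Callable[[Sequence[str]], int] = subprocess.call) -> int:
    """Run each check in order; return the first non-zero exit code, else 0."""
    for command in CHECKS:
        print("$ " + " ".join(command))
        code = run(command)
        if code != 0:
            # Stop at the first failure so its output is the last thing printed.
            return code
    return 0
```

Calling sys.exit(run_checks()) from a script would make the process exit code reflect the first failing check. The run parameter exists only so the function can be exercised without invoking the real tools.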
The first thing you want to do is install flyctl. On Mac, you can do this easily with brew:
brew install flyctl
We migrated away from our Heroku stack. We now use a custom-built Docker container that gets deployed to fly.io.
The container should be deployed automatically by GitHub Actions whenever you push to the repo, so prefer that path.
We recommend you build it remotely (you can install flyctl with brew install flyctl). This happens when running:
flyctl deploy --remote-only
You can build this container locally and run it:
docker build -t mint_scraper .
Once built, you can test locally by running the image. Note that it might fail due to binary incompatibilities between the driver versions.
docker run --env-file=.env mint_scraper:latest python scraper.py --type='all'
If you run into issues, you can debug by SSH'ing into the machine.
- Download and install WireGuard.
- Run flyctl wireguard create and use the output config for a new tunnel in WireGuard. Activate this tunnel.
- Run flyctl ssh issue --agent to populate a 24hr certificate in your local agent.
- Run flyctl ssh console --app mint-scraper-fly
For the last command, you can replace mint-scraper-fly with the name of your app. This will connect to a running instance of the app using a basic shell. You can now debug to your heart's content.