Deploying this project requires two repositories: this one, which is a web scraper, and a second one, which provides a REST API. You will also need a database. I've hosted mine on Cloud SQL with Postgres and added the default user's password as a secret in Secret Manager (it will be used by the REST API). For this repository, though, the only changes you need to make are inside the `config.py` file.
When deployed to Cloud Run, credentials are provided automatically. To run locally, you must place a valid GCP credentials file with sufficient permissions inside the Docker container so that the environment variable `GOOGLE_APPLICATION_CREDENTIALS` can reference it. You must also confirm that the environment in `config.py` is set to `dev`. The step-by-step guide is as follows:
- Make sure that the environment variable `GOOGLE_APPLICATION_CREDENTIALS` will be set correctly by placing the GCP Service Account credentials file `keyfile.json` in the root folder, together with the other files (the Dockerfile, `app.py`, etc.). It will be copied into the container and referenced in the code to access GCS, BigQuery, and Secret Manager. You must also make sure the service account has sufficient permissions for the resources it is going to access.
- `cd` to the repository's root directory and run `docker build -t <IMAGE_NAME> .`, such as `docker build -t scraper .`
- Execute `docker run -p 8000:8080 <IMAGE_NAME>`, such as `docker run -p 8000:8080 scraper`
- Access the Flask routes by going to `http://127.0.0.1:8000/<route_name>`
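
The local setup above can be sketched in Python. This is a minimal illustration, not the repository's actual code: the `ENVIRONMENT` name and the `configure_credentials` and `route_url` helpers are hypothetical, and the real `config.py` may be organized differently. It assumes `keyfile.json` sits next to the application files, as described in the first step.

```python
import os
from pathlib import Path

# Hypothetical setting; the real config.py may use a different name.
ENVIRONMENT = "dev"


def configure_credentials(environment, keyfile="keyfile.json", env=None):
    """Point GOOGLE_APPLICATION_CREDENTIALS at the key file when running locally.

    Outside "dev" nothing is changed, on the assumption that the platform
    supplies credentials. Returns the credentials path in use, or None if
    the variable is not set.
    """
    env = os.environ if env is None else env
    if environment == "dev":
        path = Path(keyfile).resolve()
        if not path.is_file():
            # Fail early instead of letting a Google client fail later.
            raise FileNotFoundError(f"credentials file not found: {path}")
        env["GOOGLE_APPLICATION_CREDENTIALS"] = str(path)
    return env.get("GOOGLE_APPLICATION_CREDENTIALS")


def route_url(route_name, host="127.0.0.1", port=8000):
    """Build the local URL for a Flask route, matching `docker run -p 8000:8080`."""
    return f"http://{host}:{port}/{route_name.lstrip('/')}"
```

Calling `configure_credentials(ENVIRONMENT)` at startup would surface a missing key file immediately, and `route_url` takes whatever route name `app.py` defines.
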
To deploy to the Cloud Run service, there's no need to set credentials, since they are provided automatically (see here).
- Clone this repository and `cd` into the root folder.
- Make sure you're logged into your GCloud project and run `gcloud builds submit --tag gcr.io/<PROJECT_ID>/<IMAGE_NAME> .`. This will build the container using Cloud Build.
- Deploy to Cloud Run using `gcloud beta run deploy <GCR_INSTANCE_NAME> --image gcr.io/<PROJECT_ID>/<IMAGE_NAME> --region southamerica-east1 --platform managed --allow-unauthenticated --quiet`. Be aware of the `--allow-unauthenticated` flag: anyone can access the endpoint, so it's best to secure it as a next step.
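
The two `gcloud` commands above can also be assembled from a script, which helps keep the placeholders consistent between the build and deploy steps. The following is a sketch only: the function names and the example values (`my-project`, `scraper`, `scraper-service`) are hypothetical, and the commands are built as argument lists and printed rather than executed.

```python
import shlex


def image_uri(project_id, image_name):
    """Container image path in the gcr.io/<PROJECT_ID>/<IMAGE_NAME> form."""
    return f"gcr.io/{project_id}/{image_name}"


def build_command(project_id, image_name):
    """Argument list for the Cloud Build step."""
    return ["gcloud", "builds", "submit",
            "--tag", image_uri(project_id, image_name), "."]


def deploy_command(project_id, image_name, service_name,
                   region="southamerica-east1", allow_unauthenticated=True):
    """Argument list for the Cloud Run deploy step.

    Passing allow_unauthenticated=False simply leaves the flag out.
    """
    cmd = ["gcloud", "beta", "run", "deploy", service_name,
           "--image", image_uri(project_id, image_name),
           "--region", region,
           "--platform", "managed"]
    if allow_unauthenticated:
        cmd.append("--allow-unauthenticated")
    cmd.append("--quiet")
    return cmd


if __name__ == "__main__":
    # Example values are placeholders, not real project settings.
    for cmd in (build_command("my-project", "scraper"),
                deploy_command("my-project", "scraper", "scraper-service")):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running anything. Each list could then be passed to `subprocess.run(cmd, check=True)` once the values are confirmed.
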