Source: https://dev.to/googlecloud/using-headless-chrome-with-cloud-run-3fdp
pip3 install -r requirements.txt
(You'll likely need to manually update the version of chromedriver-binary to the latest.)
(You may need to enable long paths if you're using WSL or Windows directly.)
(You'll probably need to install some other dependencies if using WSL [see the RUN entries in the Dockerfile].)
- Satisfy pre-reqs for running locally (immediately above)
- To start the server, see the CMD entry in the Dockerfile.
cd {PATH}:\{TO}\{DIR_WITH_DOCKERFILE}
docker build --tag scraper:1.0 .
docker run -e PORT=8080 --publish 8081:8080 --interactive --name scraper scraper:1.0
docker rm scraper
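The -e PORT=8080 flag in the docker run command above sets the port the server listens on inside the container, which mirrors how Cloud Run passes a PORT environment variable to the service. As a rough sketch of how a server entry point might pick that value up (the helper name resolve_port and the fallback of 8080 are assumptions for illustration, not taken from this repository):

```python
import os


def resolve_port(environ=os.environ, default=8080):
    """Return the port to listen on, read from PORT or a fallback default.

    `resolve_port` and the 8080 default are hypothetical; check the
    Dockerfile's CMD entry for how the real server is started.
    """
    raw = environ.get("PORT", "")
    if not raw.strip():
        # PORT is unset or empty: fall back to the default.
        return default
    port = int(raw)  # raises ValueError if PORT is not numeric
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

With the docker run command above, the server would listen on 8080 inside the container while the host reaches it on port 8081.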
NOTE: The sections below assume you've installed and configured gcloud.
gcloud builds submit --tag gcr.io/YOUR_PROJECT/web-scraping-service
gcloud run deploy web-scraping-service --image gcr.io/YOUR_PROJECT/web-scraping-service --region <REGION> --platform managed
- "We're using --no-sandbox to ensure compatibility with the Docker container, so only point such a service towards URLs you trust."
- "Be careful when exposing such a service to user input: For example, if the URL we were screenshotting was supplied by the user, they could potentially take a screenshot of any file on the filesystem as well!"
- "Be sure to create a new service account with no permission and use it as the identity of the service, for better security. See https://cloud.google.com/run/docs/securing/service-identity for an example."
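One way to act on the second caveat is to reject any URL that is not plain http or https before it reaches the browser, so that file:// URLs pointing at the container's filesystem are refused. The sketch below is only an illustration: the function name is_allowed_url and the optional host allowlist are assumptions, not part of this repository or the source article.

```python
from urllib.parse import urlparse

# Only web schemes are accepted; file://, chrome://, data:, etc. are refused.
ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url, allowed_hosts=None):
    """Return True if `url` is http(s) and, optionally, on an allowlisted host.

    `is_allowed_url` and `allowed_hosts` are hypothetical names used for
    illustration only.
    """
    try:
        parsed = urlparse(url)
        host = parsed.hostname
    except ValueError:
        # Malformed input (for example a broken IPv6 literal) is refused.
        return False
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    if not host:
        return False
    if allowed_hosts is not None and host not in allowed_hosts:
        return False
    return True
```

A check like this does not cover redirects or URLs that resolve to internal addresses, so it complements the advice above about trusted URLs and a service account with no permissions rather than replacing it.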