
Crawler proxy based on GCP.


GCP crawler proxy


This project is a crawler proxy built on GCP. It runs Squid, a forward proxy server, as a Docker image on a VM group, and uses a GCP TCP proxy load balancer as the access entry point.
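Since the load balancer is the single entry point in front of the Squid VMs, a plain TCP connection test can help confirm that the front end is reachable before any crawler is pointed at it. The sketch below is only an illustration: the helper name `is_reachable`, the example address, and port 3128 (Squid's default listening port) are assumptions, not values taken from this project.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (hypothetical address and port, replace with your own):
#   is_reachable("203.0.113.10", 3128)
```

A successful connection only shows that the load balancer accepts TCP traffic on that port; it does not prove that Squid behind it is healthy.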

Build and deploy

  1. Build the Squid forward proxy Docker image.
$ ./script/build_docker_push.sh
  2. Create the GCS bucket for the Terraform backend, then add backend.conf with the following content.
$ ./script/create_state_bucket.sh

bucket  = "<tf backend gcs>"
prefix  = "<terraform state prefix>"
  3. Initialize Terraform.
$ terraform init -backend-config=backend.conf
  4. Create terraform.tfvars with the values for your environment.
project_id = "<GCP PROJECT ID>"
service_account = "<SVC NAME>@<GCP PROJECT ID>.iam.gserviceaccount.com"
region = "us-central1"
target_size = 100
  5. Apply the Terraform configuration.
$ terraform apply
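The two hand-written files in the steps above (backend.conf and terraform.tfvars) are flat lists of `key = value` assignments, so they could also be generated by a small script. The following is a minimal sketch under that assumption; `render_assignments` is a hypothetical helper, and the bucket, prefix, and project values in the usage comment are placeholders, not values from this project.

```python
import json


def render_assignments(values: dict) -> str:
    """Render a dict as `key = value` lines, quoting strings and leaving numbers bare."""
    lines = []
    for key, value in values.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        else:
            # json.dumps adds double quotes and escapes special characters.
            rendered = json.dumps(str(value))
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines) + "\n"


# Example usage (placeholder values, replace with your own):
#   with open("backend.conf", "w") as f:
#       f.write(render_assignments({"bucket": "my-tf-state", "prefix": "crawler-proxy"}))
#   with open("terraform.tfvars", "w") as f:
#       f.write(render_assignments({
#           "project_id": "my-project",
#           "service_account": "proxy@my-project.iam.gserviceaccount.com",
#           "region": "us-central1",
#           "target_size": 100,
#       }))
```

Generating the files this way keeps the values in one place, which may help when the same deployment is repeated across several projects or regions.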


import requests

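proxies = {
  'https': '', # your gcp tcp proxy address
}

# Send the request through the proxy; ifconfig.me reports the caller's public IP.
res = requests.get('https://ifconfig.me/', proxies=proxies)
print(res.text)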
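The `proxies` mapping used above can be built from the load balancer address and port. The sketch below assumes that Squid accepts plain HTTP proxy connections (so the proxy URL uses the `http://` scheme even for HTTPS targets) and that it listens on Squid's default port 3128; neither is confirmed by this project. The helper names `build_proxies` and `exit_ip` are hypothetical.

```python
def build_proxies(host: str, port: int) -> dict:
    """Build a requests-style proxies mapping that routes HTTP and HTTPS via one proxy."""
    proxy_url = f"http://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def exit_ip(proxies: dict, timeout: float = 10.0) -> str:
    """Fetch https://ifconfig.me/ through the proxy and return the response body."""
    # Imported here so build_proxies stays usable without requests installed.
    import requests

    res = requests.get("https://ifconfig.me/", proxies=proxies, timeout=timeout)
    res.raise_for_status()
    return res.text.strip()


# Example usage (hypothetical address, replace with your load balancer IP):
#   proxies = build_proxies("203.0.113.10", 3128)
#   print(exit_ip(proxies))
```

If the proxy is working, the address reported by the service should belong to one of the proxy VMs rather than to the machine running the crawler.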


  1. https://harry-lin.blogspot.com/2019/05/docker-azuredockersquid-proxy.html
  2. https://github.com/sameersbn/docker-squid#configuration
  3. https://medium.com/google-cloud/squid-proxy-cluster-with-ssl-bump-on-google-cloud-7871ee257c27
  4. https://cloud.google.com/load-balancing/docs/tcp/setting-up-tcp#configuring_the_load_balancer
