The process for comparing VISTA data with AGRC address data in Google Cloud.

Geocode-job

Create a containerized Python geocoding script and run it as a Kubernetes job.

Steps to run

  1. Prepare data
    1. Run prep_addresses.py
      • Pulls data from VISTA
      • Partitions the data into multiple CSVs
  2. Create k8s job YAML specifications
    1. Run vista_job_template.py
  3. Apply the secret for the service account with Cloud Storage permissions to the k8s cluster
    1. Authorize kubectl with the geocoding API cluster
    2. Run kubectl apply -f .secrets/gcs-secret.yml
      • The service account key must first be base64 encoded into gcs-secret.yml
  4. Apply the job YAMLs to the cluster
    1. Run kubectl apply -f job.yaml
  5. Download geocoded CSVs from cloud storage
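
The partitioning in step 1 could look roughly like the sketch below. This is not the actual code in prep_addresses.py, which is not reproduced in this README; the function names, the partition_N.csv naming, and the rows_per_file parameter are all assumptions made for illustration. It splits one CSV into several smaller files and repeats the header row in each.

```python
import csv
from pathlib import Path


def partition_csv(source, out_dir, rows_per_file):
    """Split one CSV into numbered part files, repeating the header in each.

    Hypothetical helper: the real prep_addresses.py may partition differently.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    with open(source, newline="") as infile:
        reader = csv.reader(infile)
        header = next(reader, None)
        if header is None:
            # Empty input: nothing to partition.
            return written

        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == rows_per_file:
                written.append(_write_part(out_dir, len(written), header, chunk))
                chunk = []
        if chunk:
            # Flush the final, possibly shorter, partition.
            written.append(_write_part(out_dir, len(written), header, chunk))

    return written


def _write_part(out_dir, index, header, rows):
    """Write one partition file and return its path."""
    path = out_dir / f"partition_{index}.csv"
    with open(path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(header)
        writer.writerows(rows)
    return path
```

Because every part carries its own header, each file can be handed to a separate job without extra context.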

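The base64 encoding required in step 3 can be done in Python. The sketch below renders a minimal Kubernetes Secret manifest from the raw bytes of a service account key. The secret name (gcs-secret) and the data key (key.json) are assumptions; they must match whatever the job specifications actually reference.

```python
import base64


def render_gcs_secret(key_bytes, secret_name="gcs-secret", key_name="key.json"):
    """Return a Secret manifest with the key stored base64 encoded.

    secret_name and key_name are illustrative defaults, not values taken
    from this repository.
    """
    # Kubernetes expects values under `data` to be base64-encoded strings.
    encoded = base64.b64encode(key_bytes).decode("ascii")
    return (
        "apiVersion: v1\n"
        "kind: Secret\n"
        "metadata:\n"
        f"  name: {secret_name}\n"
        "type: Opaque\n"
        "data:\n"
        f"  {key_name}: {encoded}\n"
    )
```

The returned text could then be written to .secrets/gcs-secret.yml before running kubectl apply.
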
Steps to build

  1. Build the container from the Dockerfile
    1. docker build . -t {container name}
    2. docker tag {container name}:latest gcr.io/{project id}/webapi/{container name}:latest
  2. Push to the registry
    1. docker push gcr.io/{project id}/webapi/{container name}:latest
    2. The user needs project permissions that allow pushing to GCR
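
For repeated builds, the three docker commands above can be assembled from the project id and container name. This is only a convenience sketch: the helper names are hypothetical, and it assumes docker is installed, the working directory contains the Dockerfile, and the user is already authenticated to push to GCR.

```python
import subprocess


def image_reference(project_id, container_name, tag="latest"):
    """Build the remote image name used in the steps above."""
    return f"gcr.io/{project_id}/webapi/{container_name}:{tag}"


def build_and_push_commands(project_id, container_name):
    """Return the build, tag, and push commands as argument lists."""
    remote = image_reference(project_id, container_name)
    return [
        ["docker", "build", ".", "-t", container_name],
        ["docker", "tag", f"{container_name}:latest", remote],
        ["docker", "push", remote],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_all(build_and_push_commands(project_id, container_name)) runs the steps in order. Keeping command construction separate from execution makes it possible to print and inspect the commands first.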