/taipei-bi-etl

DEPRECATED

Primary LanguagePython

taipei-bi-etl

Docker image for taipei-bi team ETL tasks.

Getting Started

  1. Install docker
  2. cp settings.py.sample settings.py to customize your settings
  3. Build docker image docker build -t taipei-bi-etl-img . don't forget the ending dot.
  4. Run docker instance from the image docker run taipei-bi-etl-img

It will run in tempfs (in-memory storage), and won't left anything on your harddrive.

  • If you need to check/persist the intermediate output on your file system,
    you can run with bind mount.
    docker run -v {path on your filesystem}:/app/data taipei-bi-etl-img
  • If you only need a persist inetermediate output, use volumes.
  • To be able to authenticate gcloud when testing locally,
    you may want to mount your local gcloud config:
    docker run -v {path of your ~/.config folder}:/root/.config -v {path on your filesystem}:/app/data taipei-bi-etl-img
  • Note that the default timezone of the Docker container is UTC,
    use -e run option to adjust the container timezone:
    docker run -e="Asia/Taipei" taipei-bi-etl-img
  • For other available ETL options (e.g. specify date rage), run docker run taipei-bi-etl-img --help
  • A full example of the docker command would be like:
    docker build -t taipei-bi-etl-img . && docker run -v /Users/eddielin/taipei-bi-etl/data:/app/data -v /Users/eddielin/.config:/root/.config --name taipei-bi-etl taipei-bi-etl-img --task revenue --step e --source google_search
  • For more options of the etl tasks, run:
    docker build -t taipei-bi-etl-img . && docker run -v /Users/eddielin/taipei-bi-etl/data:/app/data -v /Users/eddielin/.config:/root/.config --name taipei-bi-etl taipei-bi-etl-img --help

Technical References