Dataflow Test Environment

The example below provisions the prerequisite infrastructure for the Google Cloud Dataflow "JDBC to BigQuery" template. Its main purpose is to test the case where the column schema contains special characters that BigQuery does not accept, and to resolve it using

  1. an alias in the JDBC query (illustrated in step 5 of Running the Example), or
  2. a custom template

The environment includes

  • VPC
  • Cloud SQL (MySQL)
  • BigQuery

In the MySQL dump demo.sql, the database is named test and the table used for testing is demo.
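
The Pulumi program itself is not reproduced in this README; the following is a minimal Python sketch, assuming pulumi_gcp, of the kind of resources it provisions. The resource names, region, and machine tier are illustrative placeholders, not necessarily the exact values the example uses.

```
import pulumi
import pulumi_gcp as gcp

config = pulumi.Config()
db_password = config.require_secret("dbPassword")  # set in step 2 of Running the Example

# VPC for the test environment
network = gcp.compute.Network("demo-network", auto_create_subnetworks=True)

# Cloud SQL (MySQL) instance plus the `test` database that demo.sql targets
instance = gcp.sql.DatabaseInstance(
    "demo-mysql",
    database_version="MYSQL_8_0",
    region="us-central1",
    settings=gcp.sql.DatabaseInstanceSettingsArgs(tier="db-f1-micro"),
    deletion_protection=False,
)
database = gcp.sql.Database("test-db", name="test", instance=instance.name)
user = gcp.sql.User("demo-user", name="demo", instance=instance.name, password=db_password)

# BigQuery dataset that the Dataflow job will write into
dataset = gcp.bigquery.Dataset("demo-dataset", dataset_id="demo_dataset")

pulumi.export("instance_connection_name", instance.connection_name)
pulumi.export("dataset_id", dataset.dataset_id)
```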

Prerequisites

Ensure you have Python 3 and the Pulumi CLI.

We will be deploying to Google Cloud Platform (GCP), so you will need an account. If you don't have an account, sign up for free here. In either case, follow the instructions here to connect Pulumi to your GCP account.

This example assumes that you have the gcloud CLI on your PATH; it is installed as part of the Google Cloud SDK.

Running the Example

After cloning this repo, cd into it and run these commands.

  1. Create a new stack, which is an isolated deployment target for this example:

    $ pulumi stack init dev
  2. Set the required configuration variables for this program:

    $ pulumi config set gcp:project [your-gcp-project-here]
    $ pulumi config set --secret dbPassword [your-database-password-here]

    This shows how stacks can be configured in useful ways; you can even change these values after provisioning and run pulumi up again to apply them. The sketch below shows how the program reads them at runtime.
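
A minimal sketch of how the program consumes these keys; only the key names match step 2, the rest is illustrative.

```
import pulumi

# `gcp:project` is read automatically by the pulumi_gcp provider, so the
# program only reads its own keys. Values set with --secret come back as
# encrypted Outputs and stay encrypted in the stack's state.
config = pulumi.Config()
db_password = config.require_secret("dbPassword")

# Plain (non-secret) keys would be read with config.get()/config.require(),
# e.g. a hypothetical region override with a default:
region = config.get("region") or "us-central1"
```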

  3. Deploy everything with the pulumi up command.

    $ pulumi up
  4. Import a SQL dump file to Cloud SQL for MySQL

Note: this command may fail with `ERROR: (gcloud.sql.import.sql) HTTPError 403: The service account does not have the required permissions for the bucket.` If it does, grant the Cloud SQL instance's service account read access to the bucket (see the sketch after the command) or run the import from the Cloud Console instead. For more detail about importing a SQL dump file into Cloud SQL for MySQL, see the Cloud SQL documentation.

```
$ gcloud sql import sql [your-instance-name] gs://[bucket-name]/[sql-file] --database=test
```
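
If you would rather fix the 403 than fall back to the Console, the import fails because the Cloud SQL instance's own service account cannot read the bucket holding demo.sql. A hedged Pulumi sketch of granting that access is below; the bucket name and instance name are placeholders, and `roles/storage.objectAdmin` follows the Cloud SQL import documentation's recommendation.

```
import pulumi
import pulumi_gcp as gcp

# Placeholder lookup: reference the instance the program created. The second
# argument is the Cloud SQL instance name in your project.
instance = gcp.sql.DatabaseInstance.get("existing-mysql", "demo-mysql")

# Give the instance's service account access to the bucket with the SQL dump.
gcp.storage.BucketIAMMember(
    "sql-import-access",
    bucket="my-import-bucket",  # placeholder: the bucket holding demo.sql
    role="roles/storage.objectAdmin",
    member=instance.service_account_email_address.apply(
        lambda sa: f"serviceAccount:{sa}"
    ),
)
```
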
  5. Create a Dataflow job using the JDBC to BigQuery template (a sketch follows below).
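
A hedged sketch of launching the Google-provided classic JDBC to BigQuery template with pulumi_gcp. The template path is the standard public one, but the staging bucket, connection URL, driver jar, output table, and the `bad-col` column are placeholders, and the parameter names should be double-checked against the template's documentation. The query shows the alias workaround from the introduction: the offending column is renamed before it reaches BigQuery.

```
import pulumi
import pulumi_gcp as gcp

config = pulumi.Config()
db_password = config.require_secret("dbPassword")

# Alias workaround: rename any column whose name BigQuery rejects.
# `bad-col` is a hypothetical column name containing a hyphen.
query = "SELECT `bad-col` AS bad_col FROM demo"

job = gcp.dataflow.Job(
    "jdbc-to-bq",
    template_gcs_path="gs://dataflow-templates/latest/Jdbc_to_BigQuery",
    temp_gcs_location="gs://my-staging-bucket/tmp",  # placeholder bucket
    region="us-central1",
    parameters={
        "driverClassName": "com.mysql.cj.jdbc.Driver",
        "driverJars": "gs://my-staging-bucket/mysql-connector-j.jar",  # placeholder jar
        "connectionURL": "jdbc:mysql://10.0.0.3:3306/test",  # placeholder Cloud SQL IP
        "username": "demo",
        "password": db_password,
        "query": query,
        "outputTable": "my-project:demo_dataset.demo",  # placeholder table spec
        "bigQueryLoadingTemporaryDirectory": "gs://my-staging-bucket/bq-tmp",
    },
)

pulumi.export("dataflow_job_state", job.state)
```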

  6. Once you are done, you can destroy all of the resources, and the stack:

    $ pulumi destroy
    $ pulumi stack rm