fluent-plugin-bigquery-storage-write

Fluentd output plugin to insert data into BigQuery through storage write api.

Overview

Google Cloud BigQuery output plugin for Fluentd. The main difference from fluent-plugin-bigquery is that this plugin uses BigQuery's newer Storage Write API.

Advantages of using the Storage Write API are described here.

Installation

RubyGems

gem install fluent-plugin-bigquery-storage-write

Bundler

Add the following line to your Gemfile:

gem "fluent-plugin-bigquery-storage-write"

And then execute:

bundle

Configuration

bigquery_storage_write_insert

| name | type | required? | default | description |
| --- | --- | --- | --- | --- |
| auth_method | enum | yes | application_default | private_key or json_key or compute_engine or application_default |
| email | string | yes (private_key) | nil | GCP Service Account Email |
| private_key_path | string | yes (private_key) | nil | GCP Private Key file path |
| private_key_passphrase | string | yes (private_key) | nil | GCP Private Key Passphrase |
| json_key | string | yes (json_key) | nil | GCP JSON Key file path or JSON Key string |
| project | string | yes | nil | |
| dataset | string | yes | nil | |
| table | string | yes | nil | |
| ignore_unknown_fields | bool | no | true | If false, raise errors for unknown fields. |
| proto_schema_rb_path | string | yes | nil | Path to the generated Protocol Buffers schema .rb file. |
| proto_message_class_name | string | no | nil | Class name of the Protocol Buffers message. If not specified, the table value converted to PascalCase is used. |
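
To illustrate the default for proto_message_class_name, here is a minimal Ruby sketch of a snake_case to PascalCase conversion. The helper name to_pascal_case is hypothetical and is not taken from the plugin source; the plugin's actual conversion may handle edge cases differently.

```ruby
# Hypothetical helper for illustration only; not the plugin's implementation.
# Splits a snake_case table name on underscores and capitalizes each part.
def to_pascal_case(name)
  name.split("_").map(&:capitalize).join
end

puts to_pascal_case("data")       # prints "Data"
puts to_pascal_case("test_data")  # prints "TestData"
```

Under this assumption, a table named data would look for a message class named Data unless proto_message_class_name is set explicitly.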

buffer section

| name | type | required? | default | description |
| --- | --- | --- | --- | --- |
| @type | string | no | memory | |
| chunk_limit_size | integer | no | 1MB | |
| total_limit_size | integer | no | 1GB | |
| chunk_records_limit | integer | no | 500 | |
| flush_mode | enum | no | interval | default, lazy, interval, immediate |
| flush_interval | float | no | 1.0 | |
| flush_thread_interval | float | no | 0.05 | |
| flush_thread_burst_interval | float | no | 0.05 | |

Other parameters defined by the base class are also available.

See https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/output.rb
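
As a sketch of how the buffer parameters above might be overridden, a buffer section can be nested inside the match block. The values below are arbitrary illustrations, not recommendations, and the remaining plugin settings are elided:

```
<match test>
  @type bigquery_storage_write_insert

  # ... auth, project, dataset, table and proto settings go here ...

  <buffer>
    @type memory
    flush_mode interval
    # Flush every 5 seconds instead of the default 1.0 (illustrative value)
    flush_interval 5.0
    flush_thread_interval 0.05
  </buffer>
</match>
```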

Examples

First, generate the Protocol Buffers code used to serialize the data. Write a .proto file and compile it with protoc. A sample .proto file with a matching BigQuery schema is located at proto/test_data.proto.

protoc -I proto --ruby_out=proto proto/test_data.proto

Next, specify the path of the generated Ruby code in the Fluentd configuration file.

<match test>
  @type bigquery_storage_write_insert

  auth_method application_default

  project sample-project
  dataset test
  table data

  proto_schema_rb_path /your/generated/code/path/here/test_data_pb.rb
  proto_message_class_name Data
</match>
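
For comparison, a variant of the same configuration that authenticates with a JSON key file might look like the sketch below. It uses only parameters listed in the configuration table; the key file path is a placeholder, not a real location.

```
<match test>
  @type bigquery_storage_write_insert

  auth_method json_key
  # Placeholder path; json_key also accepts the JSON key string itself
  json_key /path/to/your/service-account-key.json

  project sample-project
  dataset test
  table data

  proto_schema_rb_path /your/generated/code/path/here/test_data_pb.rb
  proto_message_class_name Data
</match>
```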

Tips

Copyright

  • Copyright(c) 2023 gumigumi4f
  • License
    • Apache License, Version 2.0
  • This plugin includes some code from fluent-plugin-bigquery for compatibility.