logstash-plugins/logstash-output-google_bigquery

Error creating table. {:exception=>java.lang.NullPointerException}

Opened this issue · 8 comments

Hi all,

I've been testing the plugin to stream JSON-based log content to BigQuery for a while. As far as I can tell, there is a minor glitch in creating tables dynamically. When the plugin tries to create a table, it throws a NullPointerException and retries again and again whenever the batch_size or flush_interval_secs threshold is reached. Fortunately the table does get created after a few tries and streaming begins. However, I think this situation leads to dropping some portion of the logs, especially when the event rate increases to ~1000 events per second.

Also, this problem recurs every hour, since we use the default hourly date_pattern ("%Y-%m-%dT%H:00").
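To illustrate why it recurs hourly: the date_pattern produces a new table suffix each hour, so the create-table path (and the NPE) is hit again at every hour boundary. The underscore sanitization below is my inference from the log lines further down, not something I've verified in the plugin source:

```ruby
# date_pattern "%Y-%m-%dT%H:00" yields a new suffix every hour; the
# plugin appears to replace "-" and ":" with "_" in the final table
# name (inferred from the "call_log_2019_01_14T10_00" log output).
t = Time.utc(2019, 1, 14, 10, 19)
suffix = t.strftime("%Y-%m-%dT%H:00").tr("-:", "_")
puts "call_log_" + suffix   # => call_log_2019_01_14T10_00

# One hour later, a different table name, so create-table runs again:
puts "call_log_" + Time.utc(2019, 1, 14, 11, 19).strftime("%Y-%m-%dT%H:00").tr("-:", "_")
```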

Although the ratio of dropped logs is relatively small (for example, my batch size is 250 and generally only the first chunks are dropped), it's unacceptable for our use case.

Here are the details:

  • Version:
    6.5.4

  • Operating System:
    Amazon Linux 2 (logstash runs in a docker container)

  • Config File:

pipelines.yml

- pipeline.id: whatever
  path.config: "/usr/share/logstash/pipeline/whatever.conf"
  pipeline.workers: 8
  pipeline.batch.size: 500
  pipeline.batch.delay: 10

/usr/share/logstash/pipeline/whatever.conf

input {
  file {
    path => "/logs/whatever*.log"
    discover_interval => 1
    mode => "tail"
    sincedb_write_interval => 1
    sincedb_path => "/usr/share/logstash/.sincedb_whatever"
    start_position => "end"
  }
}

filter {
  json {
    source => "message"
  }
  mutate {
    remove_field => ["message","@version","path","host","type", "@timestamp"]
  }
}

output {
   google_bigquery {
     project_id => "whatever"
     dataset => "...."
     json_schema => {
       fields => [{
         name => "..."
         type => "INTEGER"
         mode => "NULLABLE"
       }, {
         name => "...."
         type => "INTEGER"
         mode => "NULLABLE"
      }, {
      .....................
       }]
     }
     json_key_file => "whatever"
     error_directory => "/tmp/bigquery-errors"
     date_pattern => "%Y-%m-%dT%H:00"
     table_prefix => "whatever"
     batch_size => 250
     skip_invalid_rows => true
     ignore_unknown_values => true
     flush_interval_secs => 30
   }
}
  • Sample Data:
[2019-01-14T10:19:14,739][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:16,307][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:16,900][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:17,290][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:17,699][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:18,015][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:18,099][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:18,986][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:19,109][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:19,164][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:19,500][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:19,645][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:20,070][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:20,148][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:20,224][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:21,074][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:21,152][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:21,316][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:21,390][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:21,514][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:21,656][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:22,119][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:22,217][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:22,276][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:22,366][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:23,140][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:23,386][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:24,179][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:24,287][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:24,357][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:24,426][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:24,528][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:24,680][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:25,237][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:25,316][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:25,398][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:26,300][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:26,397][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:26,552][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:27,328][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:27,436][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:27,482][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:27,665][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:27,803][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:28,393][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:28,462][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:28,549][ERROR][logstash.outputs.googlebigquery] Error creating table. {:exception=>java.lang.NullPointerException}
[2019-01-14T10:19:29,423][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:30,487][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
[2019-01-14T10:19:30,530][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to search_log_2019_01_14T10_00
[2019-01-14T10:19:30,794][INFO ][logstash.outputs.googlebigquery] Publishing 250 messages to call_log_2019_01_14T10_00
  • Steps to Reproduce:
    Just start streaming to a blank BigQuery dataset.

I'm having the same issue in Logstash 6.3.

@cagriersen-omma have you figured it out?

@tomleib Unfortunately I didn't. But the good news is that this issue doesn't seem to be related to the dropped logs.

@cagriersen-omma what caused the dropped logs?

Any news, guys?

I recently hit this error as well. It actually fails when it goes to check whether the table already exists. I validated that the dataset and table passed to this call were valid, which makes me think this is actually an issue with the BigQuery client.
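For what it's worth, here's a minimal Ruby sketch of the failure mode I suspect. The class and method names are hypothetical stand-ins, not the plugin's actual code; the real plugin wraps the Java google-cloud-bigquery client, whose table lookup returns null for a missing table, so dereferencing the result without a null check in the exists-check path would produce exactly this NPE:

```ruby
# Hypothetical stand-in for the BigQuery client wrapper.
class FakeBigQueryClient
  def initialize(existing_tables)
    @tables = existing_tables
  end

  # Mirrors a lookup that returns nil/null when the table does not
  # exist, rather than raising an error.
  def get_table(name)
    @tables.include?(name) ? { name: name } : nil
  end
end

def table_exists_buggy?(client, name)
  # Dereferencing the result with no nil guard: blows up on a missing
  # table (the Ruby analogue of a Java-side NullPointerException).
  client.get_table(name).fetch(:name) == name
end

def table_exists_safe?(client, name)
  table = client.get_table(name)
  !table.nil? && table[:name] == name
end

client = FakeBigQueryClient.new(["call_log_2019_01_14T09_00"])
puts table_exists_safe?(client, "call_log_2019_01_14T10_00")  # => false
begin
  table_exists_buggy?(client, "call_log_2019_01_14T10_00")
rescue NoMethodError => e
  puts "buggy check blew up: #{e.class}"
end
```

If the client's exists check has this shape, every first write to a not-yet-created hourly table would fail once before the create path catches up, which matches the logs above.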

On a side note, the BigQuery library referenced is version 1.24.1, which is outdated. Perhaps updating to the latest 1.x version would fix the problem? My machine isn't really set up for plugin development at the moment; otherwise I'd try to validate this myself. If anyone else can, it would be much appreciated.

Hey, any updates just in case?

I got the same issue.

Any news about this issue?