logstash-plugins/logstash-filter-fingerprint

Fingerprint doesn't capture millisecond difference in timestamps

Closed this issue · 2 comments

Copied from https://discuss.elastic.co/t/fingerprint-unable-to-fingerprint-timestamp/29855

The following Logstash configuration produces identical fingerprints for messages whose @timestamp fields differ only in their milliseconds, for example:

2015-09-22T00:00:00.000Z

vs.

2015-09-22T00:00:00.001Z

filter {
  date {
    match => ['timestamp', 'ISO8601']
  }
  fingerprint {
    method => 'MD5'
    key => '00000000'
    target => 'fingerprint'
    source => [ '@timestamp' ] #problem
  }
}

output {
  stdout {
    codec => 'json'
  }
  elasticsearch {
    host => '127.0.0.1'
    cluster => 'logstash'
    document_id => '%{fingerprint}'
  }
}

Edit: this is on Logstash version 1.4.3.

The timestamp should be special-cased: the standard to_s output does not include milliseconds.
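To illustrate the point about to_s, here is a minimal plain-Ruby sketch. It assumes the timestamp is held as an ordinary Ruby Time object and uses a bare MD5 digest in place of the plugin's keyed hashing, so it is not the plugin's actual code. The variable names are made up for the example.

```ruby
require 'time'    # adds Time.iso8601 and Time#iso8601
require 'digest'

# The two timestamps from the report, 1 ms apart.
t0 = Time.iso8601('2015-09-22T00:00:00.000Z')
t1 = Time.iso8601('2015-09-22T00:00:00.001Z')

# Time#to_s has one-second resolution, so both render the same string.
coarse0 = t0.to_s
coarse1 = t1.to_s

# Time#iso8601(3) keeps three fractional digits (milliseconds).
fine0 = t0.iso8601(3)
fine1 = t1.iso8601(3)

# Hashing the to_s form collides; hashing the millisecond form does not.
coarse_collides = Digest::MD5.hexdigest(coarse0) == Digest::MD5.hexdigest(coarse1)
fine_differs    = Digest::MD5.hexdigest(fine0) != Digest::MD5.hexdigest(fine1)

puts "to_s:       #{coarse0} / #{coarse1} (same digest: #{coarse_collides})"
puts "iso8601(3): #{fine0} / #{fine1} (different digest: #{fine_differs})"
```

If the fingerprint is computed from a string like the to_s form, the collision described in the report follows directly. Serializing with millisecond precision before hashing avoids it.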

Pipeline:

input {
  generator {
    lines => [
      "2015-09-22T00:00:00.000Z",
      "2015-09-22T00:00:00.001Z"
    ]
    count => 1
  }
}

filter {
  date {
    match => ['message', 'ISO8601']
  }
  fingerprint {
    method => 'MD5'
    key => '00000000'
    target => 'fingerprint'
    source => [ '@timestamp' ]
  }
}

output {
  stdout {
    codec => 'rubydebug'
  }
}

Output:

{
       "sequence" => 0,
     "@timestamp" => 2015-09-22T00:00:00.001Z,
       "@version" => "1",
    "fingerprint" => "e4efb382c7e80c6a7e531af331c8dbac",
        "message" => "2015-09-22T00:00:00.001Z"
}
{
       "sequence" => 0,
     "@timestamp" => 2015-09-22T00:00:00.000Z,
       "@version" => "1",
    "fingerprint" => "92b1d22dfe3246ba9615a9260ba0d17f",
        "message" => "2015-09-22T00:00:00.000Z"
}

Cannot reproduce: the two events above, which differ only by one millisecond, get different fingerprints.