datetime2 format as tracking column with MSSQL server
jborovi opened this issue · 0 comments
The bug:
Using the datetime2 format as the tracking_column with the Logstash jdbc-input plugin and MSSQL Server results in duplicated data.
config:
input {
  jdbc {
    jdbc_connection_string => "jdbc:sqlserver://127.0.0.1\SQLExpress;database=TestDB;user=test;password=Heslo@111;portNumber=1433"
    jdbc_user => nil
    jdbc_validate_connection => true
    jdbc_driver_library => "/usr/share/logstash/logstash-core/lib/jars/mssql-jdbc-7.4.1.jre11.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    schedule => "* * * * *"
    statement => "SELECT id_column from changelog where t_datetime2 > :sql_last_value"
    tracking_column => "t_datetime2"
    record_last_run => true
    tracking_column_type => "timestamp"
  }
}
output {
  redis {
    host => "127.0.0.1"
    key => "logstash-test"
    data_type => "list"
  }
}
Screenshots of the database input and the Redis output show 398800 rows in the input and 401791 results in the output.
Screenshot of min and max t_datetime2 values
Screenshots of the actual queries the jdbc-input plugin used to get the data from the database
The SQL queries are losing precision, and this may result in duplicated data.
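To illustrate what I think is happening, here is a minimal, made-up sketch (the table definition and timestamps below are hypothetical, not the attached sample data):

-- The column stores the full datetime2(7) precision.
CREATE TABLE changelog (
    id_column INT IDENTITY PRIMARY KEY,
    t_datetime2 DATETIME2(7) NOT NULL
);

INSERT INTO changelog (t_datetime2) VALUES ('2019-09-25 12:00:00.1234567');

-- If :sql_last_value is substituted with only millisecond precision,
-- the already-processed row still satisfies the > comparison on the
-- next scheduled run and is selected again:
SELECT id_column
FROM changelog
WHERE t_datetime2 > '2019-09-25 12:00:00.123';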
I am using the following Logstash Docker image:
logstash/logstash:7.3.2
and the following MSSQL Server image:
microsoft/mssql-server-linux
I searched the forum for a similar issue and only found this comment describing the same problem:
#140 (comment)
Sample data:
- db definition: dbdefinition.txt
- data import: dataimport.txt
Steps to reproduce:
- docker-compose -f docker-compose-config.yml down
- docker-compose -f docker-compose-config.yml up --build
- docker exec -it rediscontainerid redis-cli llen logstash-test
- observe that more rows are inserted into the output than expected; this number varies between runs, but it is never less than the row count in the input database (see the query below for checking the input-side count)
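For comparison, the input-side count can be read directly from SQL Server (a hypothetical check, assuming the changelog table from the config above):

SELECT COUNT(*) FROM changelog;

This count should match the LLEN reported by redis-cli, but in my runs the Redis list is always larger.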
OS: Debian 9
Version: Logstash 7.3.2
Thank you for your response.