datetime2 format as tracking column with MSSQL server
jborovi opened this issue · 0 comments
The bug:
Using the datetime2 format as the tracking_column with the Logstash jdbc-input plugin and MSSQL Server results in duplicated data.
config:
input {
  jdbc {
    jdbc_connection_string => "jdbc:sqlserver://127.0.0.1\SQLExpress;database=TestDB;user=test;password=Heslo@111;portNumber=1433"
    jdbc_user => nil
    jdbc_validate_connection => true
    jdbc_driver_library => "/usr/share/logstash/logstash-core/lib/jars/mssql-jdbc-7.4.1.jre11.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    schedule => "* * * * *"
    statement => "SELECT id_column from changelog where t_datetime2 > :sql_last_value"
    tracking_column => "t_datetime2"
    record_last_run => true
    tracking_column_type => "timestamp"
  }
}
output {
  redis {
    host => "127.0.0.1"
    key => "logstash-test"
    data_type => "list"
  }
}
Screenshots of the database input and the Redis output show 398800 rows in the input and 401791 results in the output.
Screenshot of min and max t_datetime2 values
Screenshots of the actual queries the jdbc-input plugin used to get the data from the database
The SQL queries are losing precision, and this may result in duplicated data.
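To illustrate what I think is happening, here is a minimal, made-up sketch (the table definition and timestamps below are hypothetical, not the attached sample data):

-- The column stores the full datetime2(7) precision.
CREATE TABLE changelog (
    id_column INT IDENTITY PRIMARY KEY,
    t_datetime2 DATETIME2(7) NOT NULL
);

INSERT INTO changelog (t_datetime2) VALUES ('2019-09-25 12:00:00.1234567');

-- If :sql_last_value is substituted with only millisecond precision,
-- the already-processed row still satisfies the > comparison on the
-- next scheduled run and is selected again:
SELECT id_column
FROM changelog
WHERE t_datetime2 > '2019-09-25 12:00:00.123';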
I am using the following Logstash Docker image:
logstash/logstash:7.3.2
and the following MSSQL Server image:
microsoft/mssql-server-linux
I searched the forum for a similar issue and only found this comment describing the same problem:
#140 (comment)
Sample data:
- db definition: dbdefinition.txt
- data import: dataimport.txt
Steps to reproduce:
- docker-compose -f docker-compose-config.yml down
- docker-compose -f docker-compose-config.yml up --build
- docker exec -it rediscontainerid redis-cli llen logstash-test
- observe that more rows are inserted into the output than expected; this number varies between runs, but it is never less than the row count in the input database (see the query below for checking the input-side count)
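For comparison, the input-side count can be read directly from SQL Server (a hypothetical check, assuming the changelog table from the config above):

SELECT COUNT(*) FROM changelog;

This count should match the LLEN reported by redis-cli, but in my runs the Redis list is always larger.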
OS: Debian 9
Version: Logstash 7.3.2
Thank you for your response.