The file will be processed again after logstash restart
cliffordsun opened this issue · 0 comments
- Version: Logstash 6.4.2, file input plugin 4.1.5
- Operating System: Linux
- Config File (if you have sensitive info, please remove it):
    input {
      file {
        path => "/data/log/standard.log*"
        type => "standard-log"
        ignore_older => "8 hours"
        max_open_files => 4095
      }
    }
- Sample Data:
- Steps to Reproduce:
- Start Logstash and let it process the file /data/log/standard.log.
- Restart Logstash or reload logstash.conf.
- Rename /data/log/standard.log to /data/log/standard.log.20190513-100000.
- The file /data/log/standard.log.20190513-100000 is processed again from the beginning instead of from the last recorded position.
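To illustrate why the rename matters, here is a minimal Ruby sketch (not taken from the plugin) that checks that a file keeps its inode when it is renamed within the same directory on a POSIX filesystem. The temporary directory and the helper name `inode_across_rename` are assumptions made for the example. Only the file names mirror the ones in my setup.

```ruby
require "tmpdir"

# Returns [inode_before, inode_after] for a file renamed inside one directory.
# Hypothetical helper for illustration only.
def inode_across_rename
  Dir.mktmpdir do |dir|
    live    = File.join(dir, "standard.log")
    rotated = File.join(dir, "standard.log.20190513-100000")

    File.write(live, "line 1\nline 2\n")
    before = File.stat(live).ino

    # Same kind of rename as the log rotation in the steps above.
    File.rename(live, rotated)
    after = File.stat(rotated).ino

    [before, after]
  end
end

before, after = inode_across_rename
puts "inode before rename: #{before}, after rename: #{after}"
```

So the rotated file still matches a sincedb record by inode, and only its path differs from what was recorded.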
My issue is related to the issue "Resend the data from the beginning of a file with sincedb setting".
My log file is named /data/log/standard.log, and it is rotated when it grows large enough. On rotation it is renamed with the current time as a suffix.
I found that SincedbValue has a member variable named path_in_sincedb, which is designed for Read mode. In Tail mode it is never assigned, except when the sincedb file is reloaded. However, the associate function of SincedbCollection compares the path of a newly discovered watched file with path_in_sincedb. After the file is rotated, the two paths differ, so Logstash reads the file from the beginning.
    if sincedb_value.watched_file.nil?
      # not associated
      if sincedb_value.path_in_sincedb.nil?
        handle_association(sincedb_value, watched_file)
        logger.trace("associate: inode matched but no path in sincedb")
        return true
      end
      if sincedb_value.path_in_sincedb == watched_file.path # <----- HERE
        # the path on disk is the same as discovered path
        # and the inode is the same.
        handle_association(sincedb_value, watched_file)
        logger.trace("associate: inode and path matched")
        return true
      end
      # the path on disk is different from discovered unassociated path
      # but they have the same key (inode)
      # treat as a new file, a new value will be added when the file is opened
      sincedb_value.clear_watched_file
      delete(watched_file.sincedb_key)
      logger.trace("associate: matched but allocated to another")
      return true
    end
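The branch above can be modelled in a few lines to show the effect. This is a simplified sketch, not the plugin's real code: the `WatchedFile` and `SincedbValue` structs, the plain Hash used as the collection, the inode value 1234 and the position 2048 are all made up for illustration. It assumes the sincedb record is unassociated and carries a path (as after a reload) when the rotated file is discovered.

```ruby
# Hypothetical stand-ins for the plugin's classes, for illustration only.
WatchedFile  = Struct.new(:path, :sincedb_key)
SincedbValue = Struct.new(:position, :path_in_sincedb, :watched_file)

# Simplified model of the quoted branch; returns a symbol describing the outcome.
def associate(sincedb, watched_file)
  value = sincedb[watched_file.sincedb_key]
  return :no_record if value.nil?
  return :already_associated unless value.watched_file.nil?

  # Inode matched: resume if no path was recorded or the paths are equal.
  if value.path_in_sincedb.nil? || value.path_in_sincedb == watched_file.path
    value.watched_file = watched_file
    return :resumed
  end

  # Same inode, different path: the record is dropped and the file is
  # read again from the beginning.
  sincedb.delete(watched_file.sincedb_key)
  :treated_as_new
end

inode = 1234 # made-up inode, shared by the live and the rotated file

# State after a restart: the record was reloaded with the old path.
sincedb = { inode => SincedbValue.new(2048, "/data/log/standard.log", nil) }
rotated = WatchedFile.new("/data/log/standard.log.20190513-100000", inode)

result = associate(sincedb, rotated)
puts result # prints "treated_as_new"
```

In this model the stored position (2048) is lost even though the inode is unchanged, which matches what I observe.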
When a file is rotated, a new file with the same name appears and the old file must have been renamed, so clearing path_in_sincedb in the process_rotation_in_progress function seems like the right thing to do, and it would avoid my problem.
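A rough sketch of the idea, under stated assumptions: `SincedbRecord`, `resumable_for?` and `on_rotation` are hypothetical names that I made up for this example, and `on_rotation` only stands in for the clean-up I am suggesting for process_rotation_in_progress. It is not the plugin's actual implementation.

```ruby
# Hypothetical model of a sincedb record, for illustration only.
SincedbRecord = Struct.new(:position, :path_in_sincedb) do
  # True when a discovered path may resume from this record's position.
  def resumable_for?(path)
    path_in_sincedb.nil? || path_in_sincedb == path
  end
end

# Hypothetical stand-in for the proposed clean-up during rotation handling:
# forget the recorded path but keep the read position.
def on_rotation(record)
  record.path_in_sincedb = nil
  record
end

record       = SincedbRecord.new(2048, "/data/log/standard.log")
rotated_path = "/data/log/standard.log.20190513-100000"

before_fix = record.resumable_for?(rotated_path)
after_fix  = on_rotation(record).resumable_for?(rotated_path)

puts "resumable before clean-up: #{before_fix}, after: #{after_fix}"
```

With the path cleared, the inode-only match would let the renamed file continue from the saved position instead of being treated as new.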
I would appreciate it if you could check whether this is reasonable.