logstash-plugins/logstash-input-file

logstash-input-file sincedb skips some files on Windows

dscheg opened this issue · 1 comment

For some files, identifier_from_path fails to obtain the file ID (the value is unknown): CreateFileW fails with ERROR_FILE_NOT_FOUND, and the subsequent GetFileInformationByHandle call then fails with ERROR_INVALID_HANDLE.

  • Version: logstash-input-file-4.1.9 (logstash-6.6.1)
  • Operating System: Windows 10, Windows 2016
  • Config File:
input {
    file {
        path => ["E:/Directory/Logs/Log_*"]
        start_position => "beginning"
        sincedb_path => "E:/Logstash/state/sincedb"
    }
    ...
}
  • Debug logs:
[WARN ][filewatch.tailmode.processor] >>> Rotation In Progress - inode change detected and original content is not fully read, file is closed and path points to new content { ... @filename='Log_2019-02-27_2019-02-27' ... @sincedb_key='unknown 0 0'>"}
[WARN ][filewatch.tailmode.processor] >>> Rotation In Progress - inode change detected and original content is not fully read, file is closed and path points to new content { ... @filename='Log_2019-02-28_2019-02-28' ... @sincedb_key='unknown 0 0'>"}
[WARN ][filewatch.tailmode.processor] >>> Rotation In Progress - inode change detected and original content is not fully read, file is closed and path points to new content { ... @filename='Log_2019-02-29_2019-02-29' ... @sincedb_key='unknown 0 0'>"}

Several files are therefore mapped to the same identifier, unknown, and are skipped.
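
To illustrate why a shared identifier leads to skipped files, here is a minimal sketch, assuming a state store that holds one entry per sincedb key. The `track_files` helper and its Hash are hypothetical and only model the collision; they are not the plugin's actual implementation.

```ruby
# Hypothetical model (not the plugin's code): one tracked file per sincedb key.
def track_files(keys_by_filename)
  tracked = {}   # sincedb_key => first filename seen with that key
  skipped = []   # filenames whose key was already taken
  keys_by_filename.each do |filename, sincedb_key|
    if tracked.key?(sincedb_key)
      skipped << filename
    else
      tracked[sincedb_key] = filename
    end
  end
  [tracked, skipped]
end

# Two distinct files that both resolved to the fallback key from the logs above.
tracked, skipped = track_files({
  "Log_2019-02-27_2019-02-27" => "unknown 0 0",
  "Log_2019-02-28_2019-02-28" => "unknown 0 0"
})
puts "tracked: #{tracked.values.inspect}"
puts "skipped: #{skipped.inspect}"
```

In this model only the first file keeps its entry; every later file with the key 'unknown 0 0' collides with it.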

I’m not very familiar with Ruby FFI, but it seems the buffer passed here must be a zero-terminated UTF-16LE string:

def self.open_handle_from_path(path)
  CreateFileW(in_buffer(path), 0, 7, nil, 3, 128, nil)
end
def self.in_buffer(string)
  utf16le(string)
end

Concatenating with 0.chr fixes the problem.
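
A minimal sketch of one way to do this, assuming the path arrives as an ordinary UTF-8 Ruby string. The helper name `wide_cstring` is hypothetical and not part of the plugin; it appends a NUL character before encoding, so the resulting UTF-16LE buffer ends with the two zero bytes that wide-character Windows APIs such as CreateFileW expect.

```ruby
# Hypothetical helper (not the plugin's code): build a NUL-terminated
# UTF-16LE string from a UTF-8 path.
def wide_cstring(path)
  # 0.chr is a one-character "\x00" string; after encoding it becomes
  # the two-byte UTF-16 terminator 0x00 0x00.
  (path + 0.chr).encode("UTF-16LE")
end

buffer = wide_cstring("E:/Directory/Logs/Log_2019-02-27_2019-02-27")
puts buffer.encoding            # encoding of the resulting buffer
puts buffer.bytes.last(2).inspect  # the trailing terminator bytes
```

Without the terminator, whether the call succeeds presumably depends on whatever bytes happen to follow the buffer in memory.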

Similar issues that were closed without a solution:
https://discuss.elastic.co/t/logstash-does-not-start-process-files/164228
https://discuss.elastic.co/t/logstash-cant-read-some-files/143847
https://discuss.elastic.co/t/logstash-sincedb-unknown/155972

@dscheg
Nice debugging. Thanks.
However, this does not happen for all paths, only some. At this time I don't know what distinguishes the paths that work from the ones that don't.