Backslashes in input fields are duplicated before being sent to Logstash
Description
Backslashes in input fields are not handled properly. I believe that LFV (Logstash Filter Verifier) is adding an extra level of quoting, which causes every backslash to be doubled.
Motivation
Testing grok patterns for Windows paths on Logstash 7.12.
Exemplification
Using LFV version 1.6.3
logstash config
input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}
filter {
  if [log][file][path] {
    grok {
      pattern_definitions => {
        "FILE" => "[^/]*"
        "WINFILE" => "[^\\]*"
      }
      match => {
        "[log][file][path]" => [
          "^%{GREEDYDATA:[log][file][dir]}/%{FILE:[log][file][name]}",
          "^%{GREEDYDATA:[log][file][dir]}\\%{WINFILE:[log][file][name]}"
        ]
      }
      tag_on_failure => ["_path_parse_failure"]
    }
  }
}
I have this config deployed, and it correctly parses both Linux and Windows paths.
test case
---
fields:
  log:
    file:
      path: 'C:\logs\current\foo.log'
ignore:
  - "host"
  - "fields"
  - "@timestamp"
testcases:
  - input:
      - "2020-09-23 07:20:00.000000 | INFO | TEST"
    expected:
      - "log":
          "file":
            "path": 'C:\logs\current\foo.log'
            "dir": 'C:\logs\current'
            "file": 'foo.log'
        "message": "2020-09-23 07:20:00.000000 | INFO | TEST"
...
results
 {
   "log": {
     "file": {
-      "dir": "C:\\logs\\current",
-      "file": "foo.log",
-      "path": "C:\\logs\\current\\foo.log"
+      "dir": "C:\\\\logs\\\\current\\",
+      "name": "foo.log",
+      "path": "C:\\\\logs\\\\current\\\\foo.log"
     }
   },
conclusion
So you can see that even log.file.path, which is enclosed in YAML single quotes both in fields and in expected, is somehow getting an extra \ added to every backslash (I accept that the display output also doubles each \).
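To separate the display escaping from the real doubling, the two path lines from the diff above can be decoded as JSON string literals. This is only a rough sketch using Python's standard json module; it assumes the diff output is JSON-escaped, which is how it appears to be rendered.

```python
import json

# JSON string literals copied from the "-" (expected) and "+" (actual)
# path lines of the diff output.
expected_path = json.loads(r'"C:\\logs\\current\\foo.log"')
actual_path = json.loads(r'"C:\\\\logs\\\\current\\\\foo.log"')

print(expected_path)  # the value the test case asked for
print(actual_path)    # the value that came back from Logstash

# Count the backslashes that are really present in each decoded value.
expected_count = expected_path.count("\\")
actual_count = actual_path.count("\\")
print(expected_count, actual_count)
```

Once the JSON escaping is undone, the expected value has one backslash per separator and the actual value has two, so the doubling is present in the event itself and is not just a display effect.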
Look at log.file.dir: there is a \ at the end. If you think about the grok pattern, it is clear that this backslash is being added at the input, because the grok pattern consumes exactly one backslash as the separator between dir and name.
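The trailing backslash can be reproduced outside Logstash with a rough Python equivalent of the Windows grok pattern. This is a sketch only: it assumes GREEDYDATA behaves like a greedy `.*`, and the names WIN_PATH and split_win_path are made up for illustration.

```python
import re

# Approximation of the second grok pattern:
#   ^%{GREEDYDATA:dir}\\%{WINFILE:name}   with WINFILE = [^\\]*
# Assumption: GREEDYDATA is equivalent to a greedy ".*".
WIN_PATH = re.compile(r"^(?P<dir>.*)\\(?P<name>[^\\]*)")

def split_win_path(path):
    """Return (dir, name), or None if the pattern does not match."""
    m = WIN_PATH.match(path)
    return (m.group("dir"), m.group("name")) if m else None

# Path as written in the test case (single backslashes).
clean = split_win_path(r"C:\logs\current\foo.log")
# Path as it seems to arrive in Logstash (every backslash doubled).
doubled = split_win_path(r"C:\\logs\\current\\foo.log")

print(clean)
print(doubled)
```

With single backslashes the pattern yields the expected dir. With doubled backslashes the pattern consumes only the last one as the separator, so the other half of that pair stays at the end of dir, which matches the result shown above.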
Running on Logstash 7.12, this grok pattern works with Filebeat on Windows as the source.