Regex in transformer loses the body key and its value
What version of Loggie?
v1.4.1
Expected Behavior
I expect the log to be parsed according to the regex pattern, with the body key and its value preserved, and published correctly to Loki, as happens with the normalize regex.
Actual Behavior
I have an incoming log message from a container with nginx:
10.100.73.139 - - [16/Jan/2024:12:58:59 +0000] "GET / HTTP/1.1" 200 45196 "-" "curl/7.81.0" "127.0.0.1"
I parse it with the following regex pattern:
^(?<host>\S+) - (?<user>\S+) \[(?<time>.*)\] "(?<method>\S+) (?<request_url>\S+) (?<request_http_protocol>\S+)" (?<status>\S+) (?<bytes_out>\S+) "(?<http_referer>[^"]*)" "(?<user_agent>[^"]*)"? "(?<ip>[^"]*)"?
That is, my Interceptor CRD looks like this:
apiVersion: loggie.io/v1beta1
kind: Interceptor
metadata:
  name: nginx
spec:
  interceptors: |
    - type: transformer
      actions:
        - action: regex(message)
          pattern: ^(?<host>\S+) - (?<user>\S+) \[(?<time>.*)\] "(?<method>\S+) (?<request_url>\S+) (?<request_http_protocol>\S+)" (?<status>\S+) (?<bytes_out>\S+) "(?<http_referer>[^"]*)" "(?<user_agent>[^"]*)"? "(?<ip>[^"]*)"?
          ignoreError: false
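The pattern itself can be checked outside Loggie against the sample line above. The following is a minimal Go sketch that uses only the standard regexp package; parseLine is a hypothetical helper, not part of Loggie, and the rewrite of (?<name> to (?P<name> is an assumption made so the pattern also compiles on older Go releases that reject the shorter form.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pattern is the regex from the Interceptor above, unchanged.
const pattern = `^(?<host>\S+) - (?<user>\S+) \[(?<time>.*)\] "(?<method>\S+) (?<request_url>\S+) (?<request_http_protocol>\S+)" (?<status>\S+) (?<bytes_out>\S+) "(?<http_referer>[^"]*)" "(?<user_agent>[^"]*)"? "(?<ip>[^"]*)"?`

// parseLine (hypothetical helper) returns the named capture groups for line,
// or nil if the pattern does not match.
func parseLine(line string) map[string]string {
	// Rewrite (?<name> to (?P<name> so older Go releases accept the pattern.
	re := regexp.MustCompile(strings.ReplaceAll(pattern, "(?<", "(?P<"))
	m := re.FindStringSubmatch(line)
	if m == nil {
		return nil
	}
	fields := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" { // index 0 and unnamed groups have an empty name
			fields[name] = m[i]
		}
	}
	return fields
}

func main() {
	line := `10.100.73.139 - - [16/Jan/2024:12:58:59 +0000] "GET / HTTP/1.1" 200 45196 "-" "curl/7.81.0" "127.0.0.1"`
	fields := parseLine(line)
	if fields == nil {
		fmt.Println("pattern did not match")
		return
	}
	// Map iteration order is random, so the fields print in no fixed order.
	for k, v := range fields {
		fmt.Printf("%s=%s\n", k, v)
	}
}
```

If every named group is filled here, the pattern is not the cause, which agrees with the print() output in this report: the parsed fields are present and only body is missing.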
After switching Loggie to debug logging and adding a print() action, I see that the log message is parsed correctly, but the body key and its value disappear.
{"level":"info","time":"2024-01-16 13:31:54","caller":"/pkg/pipeline/pipeline.go:1139","message":"source ui-app-shell-8b56755d8-vmr5b/ui-app-shell/default interceptor chain: source->interceptor/maxbytes->interceptor/transformer->queue"}
{"level":"info","time":"2024-01-16 13:33:29","caller":"/pkg/interceptor/transformer/action/print.go:67","message":"event: {\"bytes_out\":\"45196\",\"user_agent\":\"curl/7.81.0\",\"host\":\"10.100.167.188\",\"time\":\"16/Jan/2024:13:33:29 +0000\",\"user\":\"-\",\"request_url\":\"/\",\"status\":\"200\",\"http_referer\":\"-\",\"ip\":\"127.0.0.1\",\"fields\":{\"namespace\":\"default\",\"nodeip\":\"10.10.12.22\",\"podid\":\"9e7a2e06-9b02-46a5-89b3-f4e20b3ba829\",\"podname\":\"ui-app-shell-8b56755d8-vmr5b\",\"logconfig\":\"ui-app-shell\",\"workloadname\":\"ui-app-shell\",\"cluster\":\"k8s-test\",\"workloadkind\":\"Deployment\",\"containername\":\"ui-app-shell\",\"nodename\":\"k8s-test-worker-a-2\"},\"request_http_protocol\":\"HTTP/1.1\",\"method\":\"GET\"}"}
As a result, nothing reaches Loki.
If I first copy the contents of body into message and then parse message, the log is parsed and published to Loki correctly.
I also noticed that the older normalize interceptor's regex has no such problem: the log is parsed according to the pattern and published to Loki.
{"level":"info","time":"2024-01-16 13:54:44","caller":"/pkg/pipeline/pipeline.go:1139","message":"source ui-app-shell-8b56755d8-vmr5b/ui-app-shell/default interceptor chain: source->interceptor/maxbytes->interceptor/normalize->interceptor/transformer->queue"}
{"level":"info","time":"2024-01-16 13:54:54","caller":"/pkg/interceptor/transformer/action/print.go:67","message":"event: {\"request_http_protocol\":\"HTTP/1.1\",\"user\":\"-\",\"time\":\"16/Jan/2024:13:54:53 +0000\",\"user_agent\":\"curl/7.81.0\",\"method\":\"GET\",\"status\":\"200\",\"fields\":{\"workloadname\":\"ui-app-shell\",\"nodeip\":\"10.10.12.22\",\"nodename\":\"k8s-test-worker-a-2\",\"logconfig\":\"ui-app-shell\",\"cluster\":\"k8s-test\",\"containername\":\"ui-app-shell\",\"podname\":\"ui-app-shell-8b56755d8-vmr5b\",\"workloadkind\":\"Deployment\",\"namespace\":\"default\",\"podid\":\"9e7a2e06-9b02-46a5-89b3-f4e20b3ba829\"},\"ip\":\"127.0.0.1\",\"request_url\":\"/\",\"body\":\"10.100.167.188 - - [16/Jan/2024:13:54:53 +0000] \\\"GET / HTTP/1.1\\\" 200 45196 \\\"-\\\" \\\"curl/7.81.0\\\" \\\"127.0.0.1\\\"\",\"bytes_out\":\"45196\",\"http_referer\":\"-\",\"host\":\"10.100.167.188\"}"}
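The difference between the two printed events is easy to miss by eye. As a rough sketch, the top-level keys can be compared programmatically; missingKeys is a hypothetical helper, and the two JSON strings are shortened stand-ins for the events above, not the full log output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// missingKeys (hypothetical helper) returns the top-level keys that are
// present in want but absent from got, in sorted order.
func missingKeys(got, want string) ([]string, error) {
	var g, w map[string]json.RawMessage
	if err := json.Unmarshal([]byte(got), &g); err != nil {
		return nil, err
	}
	if err := json.Unmarshal([]byte(want), &w); err != nil {
		return nil, err
	}
	var missing []string
	for k := range w {
		if _, ok := g[k]; !ok {
			missing = append(missing, k)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Shortened stand-ins for the transformer and normalize events.
	transformerEvent := `{"host":"10.100.167.188","status":"200","fields":{"namespace":"default"}}`
	normalizeEvent := `{"host":"10.100.167.188","status":"200","fields":{"namespace":"default"},"body":"10.100.167.188 - - ..."}`

	missing, err := missingKeys(transformerEvent, normalizeEvent)
	if err != nil {
		panic(err)
	}
	fmt.Println(missing) // [body]
}
```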
Steps to Reproduce the Problem
- Deploy Loggie v1.4.1 with the following configuration:
  loggie:
    http:
      enabled: true
      port: 9196
    monitor:
      listeners:
        filesource:
          period: 10s
        filewatcher:
          period: 5m
        pipeline:
          period: 10s
        queue:
          period: 10s
        reload:
          period: 10s
        sink:
          period: 10s
      logger:
        enabled: true
        period: 30s
    reload:
      enabled: true
      period: 10s
    discovery:
      enabled: true
      kubernetes:
        cluster: k8s-test
        containerRuntime: containerd
        dynamicContainerLog: false
        parseStdout: true
        rootFsCollectionEnabled: false
        podLogDirPrefix: /var/log/pods
        typePodFields:
          logconfig: "${_k8s.logconfig}"
          namespace: "${_k8s.pod.namespace}"
          nodename: "${_k8s.node.name}"
          nodeip: "${_k8s.node.ip}"
          podname: "${_k8s.pod.name}"
          podid: "${_k8s.pod.uid}"
          containername: "${_k8s.pod.container.name}"
          containerimage: "${_k8s.pod.container.image}"
          workloadkind: "${_k8s.workload.kind}"
          workloadname: "${_k8s.workload.name}"
          cluster: "k8s-test"
- Deploy an nginx image that serves some static content, using the following log format:
  log_format main
      '$remote_addr - $remote_user [$time_local] "$request" $status'
      ' $body_bytes_sent "$http_referer" "$http_user_agent"'
      ' "$http_x_forwarded_for"';
- Send requests to this container and collect its logs using the transformer interceptor.