@timestamp erroneously added to events when omitted
jgough opened this issue · 6 comments
If the @timestamp field is removed (for example with mutate { remove_field => [ "@timestamp" ] }), LFV adds it back in, making tests fail. Tested on Beta 2 in daemon mode.
Config:
input { stdin { id => "input" } }
filter { mutate { remove_field => [ "@timestamp" ] } }
output { stdout { } }
Testcase:
input_plugin: "input"
testcases:
- input:
- Test
expected:
- message: Test
Result:
☐ Comparing message 1 of 1 from test.yml:
--- /tmp/937366809/test.yml/1/expected 2021-10-12 15:30:42.730339850 +0000
+++ /tmp/937366809/test.yml/1/actual 2021-10-12 15:30:42.730339850 +0000
@@ -1,3 +1,4 @@
{
+ "@timestamp": "2021-10-12T15:30:42.641Z",
"message": "Test"
}
Summary: ☐ All tests: 0/1
☐ test.yml: 0/1
When run directly through Logstash, the above filter does not output a @timestamp field.
Dockerfile to reproduce (including displaying Logstash's output without the @timestamp field):
# syntax=docker/dockerfile:1.3-labs
FROM docker.elastic.co/logstash/logstash:7.15.0
ENV LOGSTASH_FILTER_VERIFIER_VERSION v2.0.0-beta.2
USER root
RUN yum clean expire-cache && yum update -y && yum install -y curl && yum clean all
ADD https://github.com/magnusbaeck/logstash-filter-verifier/releases/download/${LOGSTASH_FILTER_VERIFIER_VERSION}/logstash-filter-verifier_${LOGSTASH_FILTER_VERIFIER_VERSION}_linux_386.tar.gz /opt/
RUN tar xvzf /opt/logstash-filter-verifier_${LOGSTASH_FILTER_VERIFIER_VERSION}_linux_386.tar.gz -C /opt \
&& mv /opt/logstash-filter-verifier /usr/bin/
USER logstash
RUN <<EOF
mkdir tests
mkdir pipeline/pipeline1
mkdir pipeline/pipeline2
cat <<EOT > /usr/share/logstash/config/pipelines.yml
- pipeline.id: pipeline
path.config: "pipeline/pipeline.conf"
EOT
cat <<EOT > /usr/share/logstash/tests/test.yml
input_plugin: "input"
testcases:
- input:
- Test
expected:
- message: Test
EOT
cat <<EOT > /usr/share/logstash/pipeline/pipeline.conf
input { stdin { id => "input" } }
filter { mutate { remove_field => [ "@timestamp" ] } }
output { stdout { } }
EOT
cat <<EOT > /usr/share/logstash/run_tests.sh
echo "Running directly through Logstash"
echo "Test" | logstash -f /usr/share/logstash/pipeline/pipeline.conf
echo "Starting daemon..."
logstash-filter-verifier daemon start &
sleep 5
logstash-filter-verifier daemon run --pipeline /usr/share/logstash/config/pipelines.yml --pipeline-base /usr/share/logstash/ --testcase-dir /usr/share/logstash/tests/test.yml --add-missing-id
EOT
chmod a+x run_tests.sh
EOF
CMD ["/bin/bash", "/usr/share/logstash/run_tests.sh"]
Build and run the above Dockerfile with:
DOCKER_BUILDKIT=1 docker build --tag test .
docker run --rm test
This one is interesting. I searched the source of LFV and was not able to find any place where LFV operates on the @timestamp field.
Could it be that the @timestamp field gets added by Logstash on the pipeline input plugin? Since LFV 2 uses multiple pipelines, the event passes through another input plugin (the pipeline-to-pipeline input) after the filter pipeline under test, and that input plugin adds the @timestamp field again.
To me, it looks like Logstash is adding the timestamp in the pipeline-to-pipeline communication.
This can be shown with the following Dockerfile:
# syntax=docker/dockerfile:1.3-labs
FROM docker.elastic.co/logstash/logstash:7.15.0
ENV LOGSTASH_FILTER_VERIFIER_VERSION v2.0.0-beta.2
USER root
RUN yum clean expire-cache && yum update -y && yum install -y curl && yum clean all
ADD https://github.com/magnusbaeck/logstash-filter-verifier/releases/download/${LOGSTASH_FILTER_VERIFIER_VERSION}/logstash-filter-verifier_${LOGSTASH_FILTER_VERIFIER_VERSION}_linux_386.tar.gz /opt/
RUN tar xvzf /opt/logstash-filter-verifier_${LOGSTASH_FILTER_VERIFIER_VERSION}_linux_386.tar.gz -C /opt \
&& mv /opt/logstash-filter-verifier /usr/bin/
USER logstash
RUN <<EOF
mkdir tests
mkdir pipeline/pipeline1
mkdir pipeline/pipeline2
cat <<EOT > /usr/share/logstash/config/pipelines.yml
- pipeline.id: pipeline1
path.config: "pipeline/pipeline1/pipeline.conf"
- pipeline.id: pipeline2
path.config: "pipeline/pipeline2/pipeline.conf"
EOT
cat <<EOT > /usr/share/logstash/tests/test.yml
input_plugin: "input"
testcases:
- input:
- Test
expected:
- message: Test
EOT
cat <<EOT > /usr/share/logstash/pipeline/pipeline1/pipeline.conf
input { stdin { id => "input" } }
filter { mutate { remove_field => [ "@timestamp" ] } }
output { pipeline { send_to => [ pipeline2 ] } }
EOT
cat <<EOT > /usr/share/logstash/pipeline/pipeline2/pipeline.conf
input { pipeline { address => pipeline2 } }
output { stdout { } }
EOT
cat <<EOT > /usr/share/logstash/run_tests.sh
echo "Running directly through Logstash"
echo "Test" | logstash &
sleep 30
pkill logstash
echo "Starting daemon..."
logstash-filter-verifier daemon start &
sleep 5
logstash-filter-verifier daemon run --pipeline /usr/share/logstash/config/pipelines.yml --pipeline-base /usr/share/logstash/ --testcase-dir /usr/share/logstash/tests/test.yml --add-missing-id
EOT
chmod a+x run_tests.sh
EOF
CMD ["/bin/bash", "/usr/share/logstash/run_tests.sh"]
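The suspected mechanism can be sketched as a toy model (plain Python, not Logstash's actual implementation; the function name is illustrative):

```python
from datetime import datetime, timezone

def pipeline_input(fields):
    """Toy model of the receiving side of a pipeline-to-pipeline input:
    the event is re-created on arrival, and a missing @timestamp is
    filled in with the current time, just like for any new event."""
    event = dict(fields)
    event.setdefault("@timestamp", datetime.now(timezone.utc).isoformat())
    return event

# The filter pipeline removed @timestamp before handing the event over...
handed_over = {"message": "Test"}
# ...but the receiving pipeline's input adds it back.
received = pipeline_input(handed_over)
print(sorted(received))  # ['@timestamp', 'message']
```

This matches the observed behavior: the field is gone at the end of pipeline1 but present again once pipeline2's stdout output prints the event.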
I am not sure if this can be considered a bug in LFV. I tend to argue that this is a bug in Logstash itself.
AFAIK the @timestamp field is special in that it is an attribute of the Event class rather than part of the regular hashmap of fields (similar to how the @metadata fields live in a separate hashmap). Unless you have cooperation from the output plugin, I don't think you can omit the @timestamp field.
I created elastic/logstash#13333; let's see what the Logstash devs have to say about this issue.
Currently I see little chance that this gets fixed upstream in Logstash.
But I also don't see a way to fix this in LFV, because using pipelines to execute the test cases is a key part of the new architecture for daemon mode.
Personally, I would therefore close this as "won't fix"; maybe the best thing we can do here is mention this somewhere in the documentation.
@jgough, @magnusbaeck What do you think?
Annoying, but seems reasonable. Easy to work around by ignoring @timestamp.
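For reference, LFV test case files support an `ignore` list of field names that are excluded from the comparison. A sketch of the workaround, assuming the v1-style `ignore` key also applies to the daemon mode of the version you are running (check your version's documentation):

```yaml
input_plugin: "input"
# Fields listed here are dropped from both expected and actual events
# before comparison, so the re-added @timestamp no longer causes a diff.
ignore: ["@timestamp"]
testcases:
  - input:
      - Test
    expected:
      - message: Test
```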