Need detailed explanation of the possible issue using "*" in tail plugin with log rotate
yantang-msft opened this issue · 6 comments
Can you give a concrete example of why I shouldn't use * with log rotate? How will the log get duplicated?
https://github.com/fluent/fluentd-docs/blob/master/docs/v1.0/in_tail.txt#L73
Basically, I want to monitor all files under a directory, and the path is something like "/var/log/*.log".
My understanding is if the rotation just create a new file under this folder, data are written to the new file, and the tail plugin will just pick the log from the new file.
data are written to the new file, and the tail plugin will just pick the log from the new file.
yes and in_tail also starts to read from rotated file if rotated file name is matched with "/var/log/*.log".
For example,
- read foo.log
- rotate happens: rename foo.log to bar.log, create foo.log.
- in_tail reads foo.log and bar.log.
@repeatedly , thank you for the explanation. In your example, the "read_from_head" property needs to be true so the logs in bar.log will be duplicated. If read_from_head is false, there will be no log duplication when the file is rotated correct?
If I'm understanding correctly, it's only when I need to set read_from_head to true, then I'll have to separate the in_tail plugin?
@yantang-msft
No. I don't believe you can prevent record duplication just by disabling read_from_head
.
The fundamental problem is that Fluentd cannot tell if a given log file is "archived" one
or not. It will just follow any log file that matches the filename pattern, and just process
the content as like others (regardless if the file is created by log-rotation or not).
For example, suppose you have the following files in your log directory:
$ ll log/*.log
-rw-r--r-- 1 nginx nginx 0 Apr 23 15:30 log/app.1.log
-rw-r--r-- 1 nginx nginx 0 Apr 23 15:30 log/app.log
If you tell Fluentd to observe all the files in the directory (log/*.log
), Fluentd will
try to follow both app.log
AND app.1.log
. As a result, you are very likely to end up
receiving duplicated records.
Actual example
You can check the behaviour by executing the following commands:
# Add some log lines
$ echo log:1 >> app.log
$ echo log:2 >> app.log
# Rotate the log file
$ mv app.log app.1.log
# Add more log lines
$ echo log:3 >> app.log
$ echo log:4 >> app.log
This results in duplicated records as follows, even if read_from_head
is false:
2018-04-23 15:36:24 +0900 [info]: #0 following tail of /var/log/app.log
2018-04-23 15:36:24 +0900 [info]: #0 following tail of /var/log/app.1.log
2018-04-23 15:36:24 +0900 [info]: #0 fluentd worker is now running worker=0
2018-04-23 15:36:28.547307420 +0900 debug.log: {"message":"log=1"}
2018-04-23 15:36:28.547329329 +0900 debug.log: {"message":"log=2"}
2018-04-23 15:36:28 +0900 [info]: #0 detected rotation of /var/log/app.log; waiting 5 seconds
2018-04-23 15:36:28 +0900 [info]: #0 following tail of /var/log/app.log
2018-04-23 15:36:28.549788238 +0900 debug.log: {"message":"log=3"}
2018-04-23 15:36:28.549806195 +0900 debug.log: {"message":"log=4"}
2018-04-23 15:36:28 +0900 [info]: #0 detected rotation of /var/log/app.1.log; waiting 5 seconds
2018-04-23 15:36:28 +0900 [info]: #0 following tail of /var/log/app.1.log
2018-04-23 15:36:28.550586838 +0900 debug.log: {"message":"log=1"}
2018-04-23 15:36:28.550592989 +0900 debug.log: {"message":"log=2"}
Note
IMHO, if you really want to avoid record duplication, you should just move the rotated
log files somewhere out of the sight of Fluentd (rotating app.log
to app.log.1
,
instead of app.1.log
should be fine).
@fujimotos I can reproduce your example to have duplicate logs. However, this only happens when app.1.log
already exists in the position file of the file input. Otherwise there will be no duplication. The behavior is really mysterious to me.
And I also don't understand this sentence
The fundamental problem is that Fluentd cannot tell if a given log file is "archived" one or not.
What does "archived" mean? And what's the difference whether it's archived or not? I guess it's related to the position file and is the key to explain the above different behavior of having duplicate / no duplicate logs.
However, this only happens when app.1.log already exists in the position file of the file input.
Otherwise there will be no duplication. The behavior is really mysterious to me.
It is ... actually complicated.
-
If "app.1.log" does not exist in advance, Fluentd will treat it as a freshly created file and try
to process it according to theread_from_head
option. I presume you've set this option to
false, so the content of the file was not (re-)processed. -
On the other hand, in my original example, Fluentd thought "app.1.log" had been
overwritten by another file. For such a case, Fluentd has a special logic to handle it
and will process the file from the head (regardless the settings ofread_from_head
).
To test this behaviour, execute the following commands in your log directory:
# Create a new log file
$ echo 'log:xxx' > new.log
# Overwrite an existing log file
$ echo 'log:yyy' > new.log.tmp
$ mv new.log.tmp new.log
You'll see log:yyy
being emitted, but not log:xxx
.
The crux is that read_from_head
option is not guaranteed to prevent log duplication.
What does "archived" mean? And what's the difference whether it's archived or not?
I guess it's related to the position file and is the key to explain the above different
behavior of having duplicate / no duplicate logs.
I mean, 'rotated' files (like app.1.log
).
We human easily see, say, 'app.1.log' is an inactive (=archived) log file created
by some log rotation program and can treat it as such. But Fluentd is just not smart
enough to notice the fact, thus cannot handle these files intelligently.
For example, the log duplication issue is totally avoidable IF fluentd could think
like: "Well 'app.1.log' seems like a rotated log file, so I need to be careful not
to reprocess lines I had already emitted".
But Fluentd cannot think like that, so we'd better move these rotated log files out of its
sight in order to prevent Fluentd from duplicating the same log content.
@fujimotos Thank you for the detailed explanation! Yes, it's quite complicated.
Basically, if app.log is rotated as app.1.log, and new records are written to app.log, it's expected to read from head of app.log.
While in our scenario, we rotate the file once per day, thus the files get rotated are always 'new' file, and there is no last_inode for that file, thus duplication shouldn't happen.
It's good that you pointed this out in the doc so I learned more about this plugin. I'll close the issue. Let me know if there are other corner cases that can cause duplication.