fluent/fluentd-docs

Need detailed explanation of the possible issue using "*" in tail plugin with log rotate

yantang-msft opened this issue · 6 comments

Can you give a concrete example of why I shouldn't use * with log rotate? How will the log get duplicated?
https://github.com/fluent/fluentd-docs/blob/master/docs/v1.0/in_tail.txt#L73

Basically, I want to monitor all files under a directory, and the path is something like "/var/log/*.log".
My understanding is if the rotation just create a new file under this folder, data are written to the new file, and the tail plugin will just pick the log from the new file.

data are written to the new file, and the tail plugin will just pick the log from the new file.

yes and in_tail also starts to read from rotated file if rotated file name is matched with "/var/log/*.log".
For example,

  1. read foo.log
  2. rotate happens: rename foo.log to bar.log, create foo.log.
  3. in_tail reads foo.log and bar.log.

@repeatedly , thank you for the explanation. In your example, the "read_from_head" property needs to be true so the logs in bar.log will be duplicated. If read_from_head is false, there will be no log duplication when the file is rotated correct?
If I'm understanding correctly, it's only when I need to set read_from_head to true, then I'll have to separate the in_tail plugin?

@yantang-msft
No. I don't believe you can prevent record duplication just by disabling read_from_head.

The fundamental problem is that Fluentd cannot tell if a given log file is "archived" one
or not. It will just follow any log file that matches the filename pattern, and just process
the content as like others (regardless if the file is created by log-rotation or not).

For example, suppose you have the following files in your log directory:

$ ll log/*.log
-rw-r--r-- 1 nginx nginx 0 Apr 23 15:30 log/app.1.log
-rw-r--r-- 1 nginx nginx 0 Apr 23 15:30 log/app.log

If you tell Fluentd to observe all the files in the directory (log/*.log), Fluentd will
try to follow both app.log AND app.1.log. As a result, you are very likely to end up
receiving duplicated records.

Actual example

You can check the behaviour by executing the following commands:

# Add some log lines
$ echo log:1 >> app.log
$ echo log:2 >> app.log

# Rotate the log file
$ mv app.log app.1.log

# Add more log lines
$ echo log:3 >> app.log
$ echo log:4 >> app.log

This results in duplicated records as follows, even if read_from_head is false:

2018-04-23 15:36:24 +0900 [info]: #0 following tail of /var/log/app.log
2018-04-23 15:36:24 +0900 [info]: #0 following tail of /var/log/app.1.log
2018-04-23 15:36:24 +0900 [info]: #0 fluentd worker is now running worker=0
2018-04-23 15:36:28.547307420 +0900 debug.log: {"message":"log=1"}
2018-04-23 15:36:28.547329329 +0900 debug.log: {"message":"log=2"}
2018-04-23 15:36:28 +0900 [info]: #0 detected rotation of /var/log/app.log; waiting 5 seconds
2018-04-23 15:36:28 +0900 [info]: #0 following tail of /var/log/app.log
2018-04-23 15:36:28.549788238 +0900 debug.log: {"message":"log=3"}
2018-04-23 15:36:28.549806195 +0900 debug.log: {"message":"log=4"}
2018-04-23 15:36:28 +0900 [info]: #0 detected rotation of /var/log/app.1.log; waiting 5 seconds
2018-04-23 15:36:28 +0900 [info]: #0 following tail of /var/log/app.1.log
2018-04-23 15:36:28.550586838 +0900 debug.log: {"message":"log=1"}
2018-04-23 15:36:28.550592989 +0900 debug.log: {"message":"log=2"}

Note

IMHO, if you really want to avoid record duplication, you should just move the rotated
log files somewhere out of the sight of Fluentd (rotating app.log to app.log.1,
instead of app.1.log should be fine).

@fujimotos I can reproduce your example to have duplicate logs. However, this only happens when app.1.log already exists in the position file of the file input. Otherwise there will be no duplication. The behavior is really mysterious to me.

And I also don't understand this sentence

The fundamental problem is that Fluentd cannot tell if a given log file is "archived" one or not.

What does "archived" mean? And what's the difference whether it's archived or not? I guess it's related to the position file and is the key to explain the above different behavior of having duplicate / no duplicate logs.

@yantang-msft

However, this only happens when app.1.log already exists in the position file of the file input.
Otherwise there will be no duplication. The behavior is really mysterious to me.

It is ... actually complicated.

  1. If "app.1.log" does not exist in advance, Fluentd will treat it as a freshly created file and try
    to process it according to the read_from_head option. I presume you've set this option to
    false, so the content of the file was not (re-)processed.

  2. On the other hand, in my original example, Fluentd thought "app.1.log" had been
    overwritten by another file. For such a case, Fluentd has a special logic to handle it
    and will process the file from the head (regardless the settings of read_from_head).

To test this behaviour, execute the following commands in your log directory:

# Create a new log file
$ echo 'log:xxx' > new.log

# Overwrite an existing log file
$ echo 'log:yyy' > new.log.tmp
$ mv new.log.tmp new.log

You'll see log:yyy being emitted, but not log:xxx.

The crux is that read_from_head option is not guaranteed to prevent log duplication.

What does "archived" mean? And what's the difference whether it's archived or not?
I guess it's related to the position file and is the key to explain the above different
behavior of having duplicate / no duplicate logs.

I mean, 'rotated' files (like app.1.log).

We human easily see, say, 'app.1.log' is an inactive (=archived) log file created
by some log rotation program and can treat it as such. But Fluentd is just not smart
enough to notice the fact, thus cannot handle these files intelligently.

For example, the log duplication issue is totally avoidable IF fluentd could think
like: "Well 'app.1.log' seems like a rotated log file, so I need to be careful not
to reprocess lines I had already emitted".

But Fluentd cannot think like that, so we'd better move these rotated log files out of its
sight in order to prevent Fluentd from duplicating the same log content.

@fujimotos Thank you for the detailed explanation! Yes, it's quite complicated.
Basically, if app.log is rotated as app.1.log, and new records are written to app.log, it's expected to read from head of app.log.
While in our scenario, we rotate the file once per day, thus the files get rotated are always 'new' file, and there is no last_inode for that file, thus duplication shouldn't happen.
It's good that you pointed this out in the doc so I learned more about this plugin. I'll close the issue. Let me know if there are other corner cases that can cause duplication.