Log collection does not behave as expected
ysj2018 opened this issue · 2 comments
ysj2018 commented
Relevant config.toml
I configured the following in logs.toml:
[logs]
## just a placeholder
api_key = "ef4ahfbwzwwtlwfpbertgq1i6mq0ab1q"
## enable log collect or not
enable = true
## the server that receives logs, http/tcp/kafka; only kafka brokers may be multiple ip:ports joined with the concatenation character ","
#send_to = "127.0.0.1:17878"
send_to = "127.0.0.1:9092"
## send logs with protocol: http/tcp/kafka
send_type = "kafka"
topic = "flashcatlog"
## send logs with compression or not
use_compress = false
## use ssl or not
send_with_tls = false
## send logs in batches
batch_wait = 5
## save offset in this path
run_path = "/var/categraf/run"
## max number of files that can be open
open_files_limit = 100
## rescan config files every scan_period seconds
scan_period = 10
## read buffer of udp
frame_size = 9000
## channel size, default 100
## read buffer for log lines (number of lines)
chan_size = 1000
## pipeline num, default 4
## number of workers processing logs
pipeline=4
## configuration for kafka
## specify the kafka version
kafka_version="5.4.0"
# default 0 means serial sending; keep the default if log order matters
batch_max_concurrence = 0
# maximum batch size, default 100
batch_max_size=100
# maximum content size per send, default 1000000
batch_max_contentsize=1000000
# client timeout in seconds
producer_timeout= 10
# whether to enable SASL
sasl_enable = false
sasl_user = "admin"
sasl_password = "admin"
# PLAIN
sasl_mechanism= "PLAIN"
# v1
sasl_version=1
# set true
sasl_handshake = true
# optional
# sasl_auth_identity=""
#
##
# added in v0.3.39 and later: whether to enable pod log collection
enable_collect_container=false
# whether to collect stdout/stderr from all pods
collect_container_all = true
## log processing rules
# [[logs.Processing_rules]]
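## for example, a rule to drop matching lines might look like the sketch below;
## this is illustrative only: the field names follow the Datadog-style rules
## this logs agent derives from, so verify them against your categraf version
## before enabling
# [[logs.Processing_rules]]
# type = "exclude_at_match"
# name = "drop_debug_lines"
# pattern = "DEBUG"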
## single log configure
[[logs.items]]
## file/journald/tcp/udp
type = "file"
## type=file, path is required; type=journald/tcp/udp, port is required
path = "/var/categraf/categraf-v0.3.73-linux-amd64/conf/test.log"
source = "sys"
service = "my_service_log"
Run ./categraf --start
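For reference, the collection offsets are persisted under run_path; a quick way to inspect them (assuming the registry.json file name mentioned in the reply below) is:

cat /var/categraf/run/registry.json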
Logs from categraf
With this configuration, I expect each update to the log file to collect only the newly appended records, not the whole file. In practice, every update re-collects all the records in the file, which is not what I expect. What could be causing this? My log collection configuration is shown above.
System info
categraf 0.3.73 , Linux tvvmdc0065 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Docker
No response
Steps to reproduce
1. I configured logs.toml as shown above.
2. Started categraf.
3. Created the log file test.log and appended a few records.
4. Checked the data in Kafka and found that all log records had been collected.
5. Appended more log records.
6. Found that Kafka again contained all the records, not just the newly added ones.
7. Checked registry.json and saw that the offset was being updated.
8. Appended log records once more, and again all records were collected (see the inode check sketched after this list).
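A quick way to tell whether each write replaces the file (which would explain the full re-collection) is to compare inodes across writes; a minimal check, using the test.log path from the config above:

ls -i /var/categraf/categraf-v0.3.73-linux-amd64/conf/test.log
echo 'one more record' >> /var/categraf/categraf-v0.3.73-linux-amd64/conf/test.log
ls -i /var/categraf/categraf-v0.3.73-linux-amd64/conf/test.log   # a true append keeps the inode unchanged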
Expected behavior
My understanding is that since log collection records an offset, only the newly appended records should be collected each time.
Actual behavior
In practice, every time the log file is updated, all of its records are collected again.
Additional info
No response
kongfei605 commented
- /var/categraf/run/registry.json records the collection offsets
- It depends on how you append content to the log; run ls -i to check whether the file's inode changes with every write, which would make categraf treat it as a new file each time
- It also depends on how you read the Kafka messages; don't use the --from-beginning flag
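On the last point, a consumer invocation that reads only new messages might look like this (a sketch using the stock kafka-console-consumer.sh, with the broker address and topic taken from the config above):

kafka-console-consumer.sh --bootstrap-server 127.0.0.1:9092 --topic flashcatlog
# adding --from-beginning would replay every retained message, which can look like full re-collection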
ysj2018 commented
OK, thanks for the answer, the problem is solved. After testing, it was indeed the inode changing. Thank you very much.
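For anyone hitting the same symptom, whether the inode survives depends on how the file is written; a minimal illustration (sed -i stands in here for any rewrite-and-rename tool):

# a true append keeps the inode, so categraf resumes from the saved offset
echo 'new record' >> test.log
# rewrite-and-rename writes (sed -i, some editor save modes) create a new inode,
# so categraf treats the file as brand new and reads it from the beginning
sed -i 's/foo/bar/' test.log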