Plog is short for "Parse Log". It is designed for handling and analyzing log files generated by Apache, nginx, and similar servers. For now, the project implements only the common parts of log processing, such as processing-interval control, thread control, and so on.
Inspired by FlumeNG, I divided the project into three parts: source, channel, and sink. The following is a demo configuration file.
[xluren@test Plog]$ cat plog.conf
[source]
#the type of input; file, exec and TCP will be supported
#the file type config
source_type=file
source_module=tail_log
source_path=/data0/logs/scribe/servicehttpd.log
########the exec config#########
#source_type=exec
#source_module=run_exec
#source_cmd=tail -f /var/log/httpd/access_log
########the tcp config###########
#source_type=TCP
#source_module=read_server
#source_port=8899
[channel]
channel_module=filter_log
channel_filter_regex=([\w\d.]{0,})\s([0-9.]+)\s(\d+)us\s(.*)\s\[([^\[\]]+)\s\+\d+\]\s"((?:[^"]|\")+)"\s(\d{3})\s(\d+|-)\s"((?:[^"]|\")+|-)"\s(.+|-)\s"((?:[^"]|\")+)"\s"(.+|-\s.+|-)"(.*)
#keys for the values captured by the regex, in order
channel_dict_key=domain_name,ip,response_time,item1,date_time,request_url,response_code,size,ref,item3,agent,item2,item4
[sink]
#interval between two outputs
interval=60
#the datetime format in the input
datetime_format=%d/%b/%Y:%H:%M:%S
######sink_type
sink_type=file
sink_file=/tmp/hello
sink_module=sink_out
#####sink to zabbix
#sink_type=zabbix
#sink_file=/tmp/zabbix_send_info
#sink_module=send_zabbix
####sink to mysql
#sink_type=mysql
#sink_module=send_mysql
[log_config]
#this section configures Python logging, refer: https://docs.python.org/2/howto/logging.html
logging_format=%(asctime)s %(filename)s [funcname:%(funcName)s] [line:%(lineno)d] %(levelname)s %(message)s
#####################
#Level Numeric value
#CRITICAL 50
#ERROR 40
#WARNING 30
#INFO 20
#DEBUG 10
#NOTSET 0
logging_level=20
logging_filename=/tmp/hello_Plog
I use Python's ConfigParser module to parse the configuration file.
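As a rough illustration of that parsing step, here is a minimal sketch that turns a plog.conf-style string into one option dict per section. It is not Plog's actual loader: the `load_options` helper and the abbreviated `SAMPLE` text are assumptions made for this example. It uses the Python 3 module name `configparser` (the module is called `ConfigParser` in Python 2).

```python
import configparser  # this module is named ConfigParser in Python 2

# Abbreviated copy of the demo configuration above (illustration only).
SAMPLE = """\
[source]
source_type=file
source_module=tail_log
source_path=/data0/logs/scribe/servicehttpd.log
[sink]
interval=60
datetime_format=%d/%b/%Y:%H:%M:%S
[log_config]
logging_format=%(asctime)s %(filename)s [funcname:%(funcName)s] [line:%(lineno)d] %(levelname)s %(message)s
logging_level=20
"""


def load_options(text):
    """Hypothetical helper: return {section: {option: value}} for the given text."""
    # interpolation=None keeps '%' literal; the datetime and logging formats
    # would otherwise be treated as interpolation references and fail on access.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


options = load_options(SAMPLE)
source_option_dict = options["source"]  # all values are returned as strings
```

Every value comes back as a string, so numeric options such as `interval` or `logging_level` would need an explicit `int(...)` before use.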
#### The Part of Source
This part deals with the input. The input may be a file, the output of a command (exec), a TCP socket, and so on, but Plog does not care about the type of your input. You just need to implement a function like the following:
def yield_line(source_option_dict):
The function takes a single argument, a dict that contains all the options you need. It is a generator: its output is an iterator produced with yield.
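A minimal sketch of such a source function for the file type might look like the following. It assumes the file path arrives under the `source_path` key, as in the demo configuration, and that one item is yielded per line. Unlike a real tail-style module, it reads the existing lines once and then stops, so treat it as an illustration of the interface rather than Plog's `tail_log` implementation.

```python
import os
import tempfile


def yield_line(source_option_dict):
    """Yield each line of the file named by the 'source_path' option."""
    with open(source_option_dict["source_path"]) as log_file:
        for line in log_file:
            # Strip the trailing newline so later stages see the bare text.
            yield line.rstrip("\n")


# Small demonstration with a temporary file standing in for a real log.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")

lines = list(yield_line({"source_path": tmp.name}))
os.remove(tmp.name)
```

Because the function is a generator, nothing is read until a consumer starts iterating over it.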
#### The Part of Channel
In this part, you should implement a function like the following:
def parse_str(source_iter,channel_option_dict,dict_queue):
It takes three arguments:
- source_iter: the iterator generated by the source part;
- channel_option_dict: a dict that contains all the options you need; you must also declare those options in plog.conf;
- dict_queue: a queue that holds the output; each item in the queue is a dict.
parse_str iterates over source_iter and parses each string. The parsed values are then combined with the keys from channel_dict_key into a dict, which is put into dict_queue.
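The channel step described above can be sketched as follows. The option names `channel_filter_regex` and `channel_dict_key` come from the demo configuration; skipping lines that do not match is an assumption of this sketch, and the short regex in the demonstration is a simplified stand-in for the full pattern in plog.conf.

```python
import queue  # this module is named Queue in Python 2
import re


def parse_str(source_iter, channel_option_dict, dict_queue):
    """Parse each line with the configured regex and queue the result as a dict."""
    pattern = re.compile(channel_option_dict["channel_filter_regex"])
    keys = channel_option_dict["channel_dict_key"].split(",")
    for line in source_iter:
        match = pattern.match(line)
        if match is None:
            continue  # assumption: lines that do not match are skipped
        # Pair each key with the capture group at the same position.
        dict_queue.put(dict(zip(keys, match.groups())))


# Demonstration with a simplified regex and three sample lines.
demo_options = {
    "channel_filter_regex": r"([0-9.]+)\s(\d{3})\s(\d+)",
    "channel_dict_key": "ip,response_code,size",
}
demo_queue = queue.Queue()
parse_str(iter(["10.0.0.1 200 512", "garbage", "10.0.0.2 404 0"]),
          demo_options, demo_queue)

first = demo_queue.get()
second = demo_queue.get()
```

The number of keys in `channel_dict_key` should equal the number of capture groups in the regex; `zip` silently drops any extras on either side.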
#### The Part of Sink
In this part, you should implement a sink function in the same way. I will complete this section soon.
In China, we call November 11 "Singles' Day", but on November 11, 2014, I fixed the bugs in Plog and also restructured it. So today I am very happy, even happier than if I had a girlfriend :-)