Primary language: Python. License: MIT.

Plog

Plog is short for "Parse Log". It is designed for handling and analyzing log files generated by Apache, nginx, and similar servers. So far the project only implements the common parts of log processing, such as control of the processing interval and thread control.

Inspired by FlumeNG, I divided the project into three parts: source, channel, and sink. The following is a demo configuration file.

[xluren@test Plog]$ cat plog.conf
[source]
#the type of input; we will support file, exec, TCP
#the file type config
source_type=file
source_module=tail_log
source_path=/data0/logs/scribe/servicehttpd.log
########the exec config#########
#source_type=exec
#source_module=run_exec
#source_cmd=tail -f /var/log/httpd/access_log
########the tcp config###########
#source_type=TCP
#source_module=read_server
#source_port=8899

[channel]
channel_module=filter_log
channel_filter_regex=([\w\d.]{0,})\s([0-9.]+)\s(\d+)us\s(.*)\s\[([^\[\]]+)\s\+\d+\]\s"((?:[^"]|\")+)"\s(\d{3})\s(\d+|-)\s"((?:[^"]|\")+|-)"\s(.+|-)\s"((?:[^"]|\")+)"\s"(.+|-\s.+|-)"(.*)
#key of the filtered value
channel_dict_key=domain_name,ip,response_time,item1,date_time,request_url,response_code,size,ref,item3,agent,item2,item4

[sink]
#interval between two outputs
interval=60
#the datetime format in the input
datetime_format=%d/%b/%Y:%H:%M:%S

######sink_type
sink_type=file
sink_file=/tmp/hello
sink_module=sink_out
#####sink to zabbix
#sink_type=zabbix
#sink_file=/tmp/zabbix_send_info
#sink_module=send_zabbix


####sink to mysql
#sink_type=mysql
#sink_module=send_mysql

[log_config]
#this section uses the Python logging config, see: https://docs.python.org/2/howto/logging.html
logging_format=%(asctime)s %(filename)s [funcname:%(funcName)s] [line:%(lineno)d] %(levelname)s %(message)s
#####################
#Level      Numeric value
#CRITICAL   50
#ERROR      40
#WARNING    30
#INFO       20
#DEBUG      10
#NOTSET     0
logging_level=20
logging_filename=/tmp/hello_Plog

I use ConfigParser to parse the configuration file.
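As a rough illustration of how the sections above could be turned into option dicts, here is a minimal sketch. It assumes Python 3, where the module is named `configparser`, and the helper name `load_options` is hypothetical, not part of Plog. `RawConfigParser` is used because values such as `datetime_format` and `logging_format` contain `%`, which an interpolating parser would try to expand.

```python
import configparser

# A shortened copy of the demo configuration, inlined so the sketch is self-contained.
SAMPLE = """
[source]
source_type=file
source_module=tail_log
source_path=/data0/logs/scribe/servicehttpd.log

[sink]
interval=60
datetime_format=%d/%b/%Y:%H:%M:%S
"""


def load_options(text):
    """Return {section: {option: value}} for every section in the config text.

    Hypothetical helper for illustration only.
    """
    # RawConfigParser performs no '%' interpolation, so format strings
    # are returned exactly as written in the file.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


options = load_options(SAMPLE)
source_option_dict = options["source"]
print(source_option_dict["source_path"])
print(int(options["sink"]["interval"]))
```

All values come back as strings, so numeric options such as `interval` need an explicit conversion.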

####The Part of Source

In this part, we deal with the input. Its type may be a file, exec, a TCP socket, and so on, but Plog does not care about the type of your input; you just need to implement a function like the following:

def yield_line(source_option_dict):

The function takes a single argument, a dict that contains all the options you need. It returns a generator that yields the input.
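A source function for the file type might look roughly like the sketch below. This is not the project's `tail_log` module; the options `source_follow` and `source_poll_seconds` are hypothetical and exist only to keep the example finite and testable. Partial lines written while following are not handled.

```python
import os
import tempfile
import time


def yield_line(source_option_dict):
    """Yield lines from the file named by source_option_dict['source_path'].

    'source_follow' and 'source_poll_seconds' are hypothetical options
    used only in this sketch; they do not appear in the demo plog.conf.
    """
    path = source_option_dict["source_path"]
    follow = source_option_dict.get("source_follow", "no") == "yes"
    poll = float(source_option_dict.get("source_poll_seconds", 1.0))
    with open(path) as handle:
        while True:
            line = handle.readline()
            if line:
                yield line.rstrip("\n")
            elif follow:
                time.sleep(poll)  # wait for the file to grow, like tail -f
            else:
                return  # end of file reached and not following


# Demo: write a small temporary log file and read it back once.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("first\nsecond\n")
lines = list(yield_line({"source_path": tmp.name}))
os.unlink(tmp.name)
print(lines)
```

Because the function is a generator, the channel part can consume it lazily without knowing whether the lines come from a file, a command, or a socket.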

####The Part of Channel

In this part, you should implement a function like the following:

def parse_str(source_iter,channel_option_dict,dict_queue):

It takes three arguments:

  • source_iter, which is generated by the source part;
  • channel_option_dict, which contains all the options you need; you should also configure the options you need in plog.conf;
  • dict_queue, a queue that holds the output; each item in the queue is a dict.

The parse_str function iterates over the source iterator and parses each string; the parsed values are then combined with the keys in channel_dict_key into a dict, which is put into dict_queue.
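The behaviour described above can be sketched as follows. This is an illustration, not the project's `filter_log` module: it assumes Python 3 (`queue` rather than Python 2's `Queue`), it assumes that lines which do not match the regex are simply skipped, and the demo uses a deliberately simplified regex and key list instead of the ones in plog.conf.

```python
import queue
import re


def parse_str(source_iter, channel_option_dict, dict_queue):
    """Match each line against channel_filter_regex and queue one dict per match."""
    pattern = re.compile(channel_option_dict["channel_filter_regex"])
    keys = channel_option_dict["channel_dict_key"].split(",")
    for line in source_iter:
        match = pattern.match(line)
        if match is None:
            continue  # assumption: lines that do not match are skipped
        # Pair each key with the captured group in the same position.
        dict_queue.put(dict(zip(keys, match.groups())))


# Demo with a simplified regex and key list (not the ones from plog.conf).
channel_option_dict = {
    "channel_filter_regex": r"(\S+)\s(\d{3})\s(\d+)",
    "channel_dict_key": "ip,response_code,size",
}
dict_queue = queue.Queue()
parse_str(iter(["10.0.0.1 200 512", "garbage"]), channel_option_dict, dict_queue)
first = dict_queue.get_nowait()
print(first)
```

The keys in `channel_dict_key` are matched to capture groups by position, so their order has to follow the order of the groups in `channel_filter_regex`.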

####The Part of Sink

In this part, you should implement a function like the following. I will complete it soon.


In China, we call Nov 11 "Singles' Day", but on Nov 11, 2014, I fixed the bugs in Plog and also restructured it. So today I am very happy, even happier than if I had a girlfriend :-)