Logparser

A Python package of log parsers with benchmarks for log template/event extraction.

Paper

If you use these parsers, please cite our paper using the following reference:

@Conference{He16DSN,
Title = {An Evaluation Study on Log Parsing and Its Use in Log Mining},
Author = {He, P. and Zhu, J. and He, S. and Li, J. and Lyu, R.},
Booktitle = {DSN'16: Proc. of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks},
Year = {2016}
}

Parsers

If you are not familiar with log parsers, please check the Principles of Parsers.
The code is here.

Data

The data directory contains 5 datasets for you to experiment with. Each dataset contains several text files:

  • rawlog.log: The raw log messages, each prefixed with an ID, in the format "ID\tword1 word2 word3".
  • template[0-9]+: The log messages that belong to a given template.
  • templates: The text of the templates.
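The rawlog.log format above can be read with a few lines of Python. This is a minimal sketch, not code from this package: the function name read_raw_log and the sample messages are made up for illustration, and it assumes one message per line with a single tab between the ID and the message text.

```python
import io


def read_raw_log(stream):
    """Yield (log_id, tokens) pairs from lines shaped like 'ID<TAB>word1 word2 word3'."""
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split once on the first tab: the left part is the ID, the right part is the message.
        log_id, _, content = line.partition("\t")
        yield log_id, content.split()


# Hypothetical sample data standing in for a real rawlog.log file.
sample = io.StringIO("1\tConnection from host A\n2\tConnection from host B\n")
records = list(read_raw_log(sample))
```

To read a real dataset, pass an open file object instead of the in-memory sample, for example open("rawlog.log") with the path adjusted to your copy of the data.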

Quick Start

Input: A raw log file. Each line of the file follows the format "ID\tword1 word2 word3".
Output: Two parts. The first is the log messages split into different text files (each file contains only log IDs). The second is the templates file, which contains all templates.
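To make the output shape concrete, here is a rough sketch of writing such a result to disk. It is not the parsers' own code: the helper write_parse_result is hypothetical, the file names template1, template2, ... and templates are an assumption borrowed from the Data section, and the example groups and templates are invented.

```python
import os
import tempfile


def write_parse_result(groups, templates, out_dir):
    """Write one ID file per template plus a single templates file.

    groups[i] holds the log IDs assigned to templates[i].
    """
    for index, log_ids in enumerate(groups, start=1):
        # Assumed naming scheme: template1, template2, ...
        path = os.path.join(out_dir, "template%d" % index)
        with open(path, "w") as handle:
            handle.write("\n".join(log_ids) + "\n")
    with open(os.path.join(out_dir, "templates"), "w") as handle:
        handle.write("\n".join(templates) + "\n")


# Invented example: three log IDs grouped under two templates.
out_dir = tempfile.mkdtemp()
write_parse_result(
    [["1", "2"], ["3"]],
    ["Connection from host *", "Disk * is full"],
    out_dir,
)
written = sorted(os.listdir(out_dir))
```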

Examples: Before running the examples, please copy the parser source file into the same directory as the example.

  • Example1: This file is used to evaluate the performance of LogSig. It runs 10 iterations and records several important measurements (e.g., TP, FP, time). To use your own dataset, modify the path and file names in the code. You should also modify the path to the ground truth data in RI_precision.
  • Example2: This file is a simple example to demonstrate the usage of LogSig. The usage of other log parsers is similar.
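For readers unfamiliar with TP and FP in this setting, the sketch below shows one common way to count them over pairs of log messages when comparing a parser's grouping against ground truth. It is an illustration only and is not taken from RI_precision; the function name pairwise_counts and the toy labels are assumptions.

```python
from itertools import combinations


def pairwise_counts(truth, predicted):
    """Count pairwise TP, FP, FN and TN.

    truth and predicted each map a log ID to a group label.
    A pair is a true positive when both groupings put its two logs together.
    """
    tp = fp = fn = tn = 0
    for a, b in combinations(sorted(truth), 2):
        same_truth = truth[a] == truth[b]
        same_pred = predicted[a] == predicted[b]
        if same_pred and same_truth:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Toy example: the parser wrongly merges log 3 into the first group.
truth = {"1": "A", "2": "A", "3": "B", "4": "B"}
predicted = {"1": "x", "2": "x", "3": "x", "4": "y"}
tp, fp, fn, tn = pairwise_counts(truth, predicted)
precision = tp / (tp + fp) if tp + fp else 0.0
```

Counting over pairs avoids having to match predicted group labels to ground truth labels, which is why the toy labels above can differ in name.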


SLCT is based on the original C code, so its running example is here. This program is platform-dependent because the .so files work only on Linux.

Documentation

[To be continued]