logpai/logparser

Inconsistency with Templates and the EventTemplates in the Spark2k_corrected version file

WahomeKezia opened this issue · 4 comments

Hi there!

I wanted to ask for clarification about EventTemplates that contain URLs and file paths (using the Spark logs):

E6 | Connecting to driver: spark://<*>

E12 | Input split: hdfs://<*>

E25 | Saved output of task 'attempt_<*>' to hdfs://<*>

E23 | Remoting started; listening on addresses :[akka.tcp://<*>]

I have noticed that in the corrected version, the logs in the structured CSV have different templates from those in the EventTemplate CSV file.

E.g., here is a log line, its EventTemplate label, and the template:
Input split: hdfs://10.10.34.11:9000/pjhe/logs/2kSOSP.log:21876+7292 | E12 | Input split: <*>

Sorry, I did not understand your problem.

Hi @zhujiem,
Using EventTemplate 12 as an example:

Input split: hdfs://10.10.34.11:9000/pjhe/logs/2kSOSP.log:21876+7292 | E12 | Input split: <*>

Here it is EventTemplate 12, Input split: <*>, but in the file Spark_2k.log_templates.csv

EventTemplate 12 is slightly different.

I have noticed the same with EventTemplates 6 and 25:
Connecting to driver: spark://CoarseGrainedScheduler@10.10.34.11:48069 | E6 | Connecting to driver: <*>

Is the correct template Input split: <*> or Input split: hdfs://<*>?
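
To make the difference concrete, here is a minimal Python sketch showing that both candidate templates match the same log line and differ only in how much of the line the `<*>` wildcard absorbs. The helper `template_to_regex` is a hypothetical name used for illustration; it is not part of logparser.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Convert a template with <*> wildcards into an anchored regex.

    Hypothetical helper for illustration only (not a logparser function).
    """
    # Escape the literal pieces, then join them with a capture group per wildcard.
    parts = [re.escape(piece) for piece in template.split("<*>")]
    return re.compile("^" + "(.+?)".join(parts) + "$")


line = "Input split: hdfs://10.10.34.11:9000/pjhe/logs/2kSOSP.log:21876+7292"

for template in ("Input split: <*>", "Input split: hdfs://<*>"):
    match = template_to_regex(template).match(line)
    # Print which part of the line each template treats as the variable parameter.
    print(template, "->", match.groups() if match else None)
```

With the first template the whole URL, including the `hdfs://` scheme, is treated as the parameter; with the second only the part after `hdfs://` is.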

In the loghub_2k_corrected, you should refer to *_structured_corrected.csv and *_templates_corrected.csv, which are the corrected versions.
So, E12 should be:

E12	Input split: <*>
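
A quick way to spot this kind of mismatch is to compare the template stored for each EventId in a structured file against the one in a templates file. The sketch below is only an illustration: it assumes both files have `EventId` and `EventTemplate` columns, uses small inline stand-ins built from the rows quoted in this thread instead of the real files, and `find_template_mismatches` is a hypothetical helper name.

```python
import csv
import io


def find_template_mismatches(structured_rows, template_rows):
    """Return {EventId: (template_in_structured, template_in_templates_file)}.

    Hypothetical helper; assumes each row is a dict with the keys
    "EventId" and "EventTemplate".
    """
    expected = {row["EventId"]: row["EventTemplate"] for row in template_rows}
    mismatches = {}
    for row in structured_rows:
        event_id, template = row["EventId"], row["EventTemplate"]
        # Report only EventIds present in both files whose templates disagree.
        if event_id in expected and expected[event_id] != template:
            mismatches[event_id] = (template, expected[event_id])
    return mismatches


# Inline stand-ins for the two CSV files (reduced column set, assumed layout).
structured_csv = io.StringIO(
    "Content,EventId,EventTemplate\n"
    "Input split: hdfs://10.10.34.11:9000/pjhe/logs/2kSOSP.log:21876+7292,E12,Input split: <*>\n"
    "Connecting to driver: spark://CoarseGrainedScheduler@10.10.34.11:48069,E6,Connecting to driver: <*>\n"
)
templates_csv = io.StringIO(
    "EventId,EventTemplate\n"
    "E6,Connecting to driver: spark://<*>\n"
    "E12,Input split: hdfs://<*>\n"
)

mismatches = find_template_mismatches(
    list(csv.DictReader(structured_csv)),
    list(csv.DictReader(templates_csv)),
)
for event_id, (in_structured, in_templates) in sorted(mismatches.items()):
    print(f"{event_id}: structured={in_structured!r} templates={in_templates!r}")
```

Running the same check on a matching pair of files, such as *_structured_corrected.csv with *_templates_corrected.csv, should report no mismatches if the two files are consistent.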

Oh, I see. Thank you, @zhujiem!