sematext/sematext-agent-docker

JSON Logs Are Not Recognized

trvsysadmin opened this issue · 10 comments

I have more or less the same issue as this: #12

My log source is a Go package called logrus, which emits single-line JSON that looks like this:

{"context":["STRING","STRING","STRING"],"level":"info","msg":"MESSAGESTRING","stacktrace":["BLAH1", "BLAH2", "BLAH3"],"time":"2018-03-06T12:24:48Z"}

I also can't seem to map "time" to "@timestamp" using patterns.yml. My patterns file looks like this:

patterns:
 - #  Web Logs
  sourceName: !!js/regexp .*gateway.*
  match:
    - regex: !!js/regexp ^\{.*
      type: absgateway_logs
      fields: [context,level,msg,stacktrace,ts]
      dateFormat: YYYY-MM-DDTHH:mm:ssZ

However, in my logsene dashboard, the only json that gets reported is like this:

{
  "@timestamp": "2018-03-06T12:24:48.412Z",
  "message": "{"context":["STRING"],"level":"info","msg":"MESSAGESTRING","stacktrace":["BLAH1", "BLAH2", "BLAH3"],"time":"2018-03-06T12:24:48Z"}",
  "severity": "info",
  "host": "ip-1-2-3-4",
  "ip": "1.2.3.4",
  "logSource": "ab.cde.io/abs/absgateway:staging_r-absStagingStack-absGatewayStaging-1-hash_hash",
  "container_id": "hash",
  "image_name": "ab.cde.io/abs/absgateway:staging",
  "container_name": "r-absStagingStack-absGatewayStaging-1-hash",
  "logType": "docker",
  "container_hostname": "hash",
  "@timestamp_received": "2018-03-06T12:25:25.055Z",
  "logsene_original_type": "docker"
}

Maybe it's relevant: I am using Rancher.

Note that it works fine in logagent when I do echo "{json: blah blah}" | logagent --yaml on the command line; it parses the JSON correctly. However, the patterns file still doesn't work: "time" just gets passed through as a string, AND a new @timestamp is added that is simply the time at which I ran the logagent command.

Lastly, there's also the example json block in the documentation, which is entirely confusing. Am I supposed to use the json block in tandem with the patterns block? Does the json block apply to ALL patterns, regardless of logSource?

Not sure what I am doing wrong. Any help would be appreciated.

UPDATE

I tried to plug this into /etc/logagent/patterns.yml in the Docker container. Nope, it doesn't do anything, although it works perfectly on my command line using plain logagent.

json:
  enabled: true
  autohashFields:
    time: true
  removeFields:
    - stacktrace
  transform: !!js/function >
    function (sourceName, parsedObject, config) {
      for (var i=0; i<config.removeFields.length; i++) {
        console.log('delete ' + config.removeFields[i])
        delete parsedObject[config.removeFields[i]]
      }
    }

Then I tried to target the message field that shows up in Logsene, like this:

json:
  enabled: true
  autohashFields:
    message: true

UPDATE 2

Vanilla nginx logs work perfectly fine. These are non-JSON, simply space-delimited lines. In Logsene, I can see them fully structured with method, user_agent, etc. I just used the default patterns.yml file (i.e. I don't pass any pattern file argument during docker run).

Nope: Logsene still shows the message field un-hashed.

UPDATE 3

The following regex application worked finally:

patterns:
 - #  Gateway Logrus Logs
  sourceName: !!js/regexp .*gateway.*
  match:
    - regex: !!js/regexp \{"context":(.*),"level":(.*),"msg":(.*),"stacktrace":(.*),"time":(.*)\}
      type: absgateway_logs
      fields: [context,level,msg,stacktrace,ts]
      dateFormat: YYYY-MM-DDTHH:mm:ssZ
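
To double-check which capture group feeds which field, I ran the same regex in plain Node.js against a sample line (the sample line is just an illustration; note that the captured values keep their quotes and brackets, which is part of why this feels hacky):

```javascript
// The regex from the pattern above, applied to a sample logrus line.
// Groups 1..5 feed the fields list [context, level, msg, stacktrace, ts].
var re = /\{"context":(.*),"level":(.*),"msg":(.*),"stacktrace":(.*),"time":(.*)\}/;
var line = '{"context":["STRING"],"level":"info","msg":"MESSAGESTRING",' +
           '"stacktrace":["BLAH1"],"time":"2018-03-06T12:24:48Z"}';
var m = line.match(re);
console.log(m[2]); // "info" (quotes included, since the regex captures them)
console.log(m[5]); // "2018-03-06T12:24:48Z" (becomes the ts field)
```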

So essentially we wrongly assumed that JSON-structured logs are parsed automatically, when instead they are treated like any other single line? That's not the impression we got from the documentation we have read so far.

Hi,

JSON logs should be recognized and parsed automatically. We just checked with a Kibana container (which produces JSON logs as well).

Could you provide an original log snippet captured to a file with docker logs containerName >& capture.log? We should check whether the JSON line really starts with "{" and that there are no special characters, such as terminal colour codes, in the output.

You can add transform functions to patterns.yml to manipulate the parsed JSON. The "globalTransform" function will process all logs (parsed by regex or by the JSON parser).
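
For reference, a top-level globalTransform entry in patterns.yml looks roughly like this (a minimal sketch; the log_source_tag field is only an illustration, not a required field):

```yaml
# Sketch: a top-level globalTransform in patterns.yml. It runs for every
# log event, whether it was parsed by a regex pattern or by the JSON parser.
globalTransform: !!js/function >
  function (sourceName, parsedObject) {
    // example: tag every event with its source name
    parsedObject.log_source_tag = sourceName
  }
```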

The following patterns.yml did work for me:

# post process all JSON input
json:
  enabled: true 
  debug: true
  removeFields:
    - stacktrace
    - msg
    - level
    - time
  mapFields: 
    msg: message
    level: severity
    time: '@timestamp'
  transform: !!js/function >
    function (sourceName, parsedObject, config) {
      // map fields
      Object.keys(config.mapFields).forEach(function (f) {
        if (parsedObject[f] !== undefined) {
          parsedObject[config.mapFields[f]] = parsedObject[f]
          if (config.debug) {
            console.log('map ' + f + ' to ' + config.mapFields[f] + ': ' + parsedObject[config.mapFields[f]])
          }
        }
      })
      // remove fields
      for (var i = 0; i < config.removeFields.length; i++) {
        if (config.debug) {
          console.log('delete ' + config.removeFields[i])
        }
        delete parsedObject[config.removeFields[i]]
      }
    }

Running a test with logagent:

cat test2.log | logagent -f patterns.yml -y 
2018-03-07T09:29:01.247Z pid[85789] init filter: inputFilter
2018-03-07T09:29:01.248Z pid[85789] init filter: outputFilter
2018-03-07T09:29:01.248Z pid[85789] init plugins
2018-03-07T09:29:01.249Z pid[85789] ../lib/plugins/input/stdin
2018-03-07T09:29:01.266Z pid[85789] ../lib/plugins/output/stdout
2018-03-07T09:29:01.342Z pid[85789] Logagent initialized
map msg to message: MESSAGESTRING
map level to severity: info
map time to @timestamp: 2018-03-06T12:24:48Z
delete stacktrace
delete msg
delete level
delete time
context: 
  - STRING
  - STRING
  - STRING
logSource:  unknown
@timestamp: 2018-03-06T12:24:48Z
message:    MESSAGESTRING
severity:   info

2018-03-07T09:29:03.346Z pid[85789] terminate reason: stdin closed

Do you pass patterns.yml as volume like docker run -v $(pwd)/patterns.yml:/etc/logagent/patterns.yml ...?
What is the log output of Sematext Docker Agent when it starts?
Does it log a line with the message Loading custom log parser definitions: /etc/logagent/patterns.yml?

@trvsysadmin please try version 1.31.50; the regex used to identify JSON log lines did not work in the Docker context. The new version uses the latest Logagent parser, which has a fix for the problem. We tested with a container simulating your logs, and everything worked fine with v1.31.49 and v1.31.50.
We published v1.31.50 as "latest" image (see build status):
https://hub.docker.com/r/sematext/sematext-agent-docker/builds/

Thank you for reporting the issue.

Thanks for the very fast and detailed turnaround. Sorry I took a full day to reply; I was testing it and didn't want to send partial results.

The latest docker image works! But not for every log.

The only difference I can find between the one that works and the one that doesn't is that one has "context" made of an array of strings and the other has "context" made of an array of objects.

At first I thought it was an issue of length, but later realized it's not.

Definition of "works": the log shows up in Logsene, either via the hacky regex (see Update 3 in the first post) or via the 1.31.50 image.

Does Not Work

{"context":[{"ctx":null,"err":{"code":14,"message":"ABCD Topology Address List is Entirely Empty."},"errmsg":"abc error: code = NotCompleted desc = ABCD Topology Address List is Entirely Empty."}],"level":"error","msg":"abc error: code = NotCompleted desc = ABCD Topology Address List is Entirely Empty.","stacktrace":null,"time":"2018-03-08T10:25:48Z"}

Works

{"context":["staging"],"level":"info","msg":"Gin Server Starting With This Runmode...","stacktrace":["main.runServer::150","main.main::70","runtime.main::198","runtime.goexit::2361"],"time":"2018-03-08T10:25:47Z"}

{"context":[{"ctx":null,"err":{"Code":400,"Message":"Validation failed at validation step. Please see detailed validation results attached.","ErrObject":{"Pawn":{"name":null,"foobar":null,"foobar":null,"foobar":null,"foobar":null,"foobar":{"foobar_exceeds":[{"value_given":"2","value_expected":"1","user_friendly_message":"foobar string"}]}},"foobar":{"name":null,"foobar":null,"foobar":null,"foobar":null,"foobar":null,"foobar":{"foobar_exceeds":[{"value_given":"2","value_expected":"1","user_friendly_message":"foobar failed."}]}}}},"errmsg":"foobar. Please see detailed validation results attached."}],"level":"error","msg":"foobar. Please see detailed validation results attached.","stacktrace":null,"time":"2018-03-08T10:29:39Z"}

This could be a problem with the mapping in Elasticsearch. The non-working message has an array of objects in the context property, while the working message has an array of strings there. Depending on which type Elasticsearch sees first, the automatic mapping might not fit a later message.
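
You can see the type difference directly; here is a quick sketch in Node.js (contextTypes is just an illustrative helper, not part of Logagent or the agent):

```javascript
// Quick check of why one log maps and the other does not: the entries of the
// "context" array have different JavaScript types, so Elasticsearch's dynamic
// mapping (created from whichever type arrives first) rejects the other shape.
function contextTypes(logLine) {
  var doc = JSON.parse(logLine);
  return (doc.context || []).map(function (item) { return typeof item; });
}

var worksLine = '{"context":["staging"],"level":"info"}';
var failsLine = '{"context":[{"ctx":null,"err":{"code":14}}],"level":"error"}';

console.log(contextTypes(worksLine)); // [ 'string' ]
console.log(contextTypes(failsLine)); // [ 'object' ]
```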

otisg commented

But a person could change this in Sematext Logs by using Field Editor, no?

Changing the schema with the field editor would not help. The problem is that the logs have a different structure in the same property "context" (Elasticsearch is not like MongoDB or CouchDB, which accept all JSON docs schema-free).
So the log structure should be the same, or the field "context" should be renamed in the error case (which has a different structure) to "error_context" - this is something that could be done with a transform function in the pattern definition :)
I just finished a test logging the 3 example messages. Some of them fail with mapping parser exceptions.

See my screenshots (327 logs correct, 319 with errors in mapping):
The failed message:
bildschirmfoto 2018-03-08 um 22 09 53

Mapping in field editor:
bildschirmfoto 2018-03-08 um 22 06 27

The accepted message:
bildschirmfoto 2018-03-08 um 22 20 11

Thanks @megastef and @otisg . So is it wrong to assume that Elasticsearch requires us to provide the schema in advance? Actually, our context key is supposed to have a dynamic schema of "anything", so we can't really do the context vs. error_context split; sometimes it might be an array of arrays of objects, for example.

How does it relate to this: https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic.html

Thanks again for all the help! Please feel free to close the issue if this is no longer specific to the docker agent, I will follow up with you separately!

@trvsysadmin Elasticsearch creates a dynamic mapping (schema) based on the type of the first document received. So if the first doc contains a "context" property with a string array, the schema is created from it, and objects with other structures fail later. We are aware of this limitation in Elasticsearch, which is why we run Apache Kafka in front of Elasticsearch: we can catch such situations by converting the original doc to a string and creating a "logsene_error" field with the original Elasticsearch error message (the mapping exception). That way you don't lose all the information, and you have a chance to fix the problem. Note that Elasticsearch itself would simply reject the indexing request and you would not see anything.

I suggest that you focus on a few common fields, or convert the context object to a JSON string (human readable). Here is the snippet for your patterns.yml file that converts the context field to a pretty-printed JSON string (stored in a field named "context_json_str") and removes the original "context" field:

# post process all JSON input
json:
  enabled: true 
  transform: !!js/function >
    function (sourceName, parsedObject, config) {
      // map the "time" field to '@timestamp'
      if (parsedObject.time) {
        parsedObject['@timestamp'] = parsedObject.time
        delete parsedObject.time
      }
      // avoid Elasticsearch mapping errors in the "context" field
      if (parsedObject.context !== undefined) {
        // convert dynamic Objects to a pretty JSON string
        parsedObject.context_json_str = JSON.stringify(parsedObject.context, null, ' ')
        delete parsedObject.context
      }
    }

resulting in a structure like this (YAML output in Logagent):

level:       error
msg:         foobar. Please see detailed validation results attached.
stacktrace:  null
logSource:   unknown
@timestamp:  Fri Mar 09 2018 10:05:08 GMT+0100 (CET)
context_json_str: 
  """
    [
     {
      "ctx": null,
      "err": {
       "Code": 400,
       "Message": "Validation failed at validation step. Please see detailed validation results attached.",
       "ErrObject": {
        "Pawn": {
         "name": null,
         "foobar": {
          "foobar_exceeds": [
           {
            "value_given": "2",
            "value_expected": "1",
            "user_friendly_message": "foobar string"
           }
          ]
         }
        },
        "foobar": {
         "name": null,
         "foobar": {
          "foobar_exceeds": [
           {
            "value_given": "2",
            "value_expected": "1",
            "user_friendly_message": "foobar failed."
           }
          ]
         }
        }
       }
      },
      "errmsg": "foobar. Please see detailed validation results attached."
     }
    ]
  """

The solution above at least allows proper indexing, with all common fields available.
When you search for errors, you can still find all the text, e.g. from context[0].errmsg.

If you'd like to create a visualisation, e.g. based on context[0].Code, then I suggest iterating over the context objects and extracting the codes, or any other relevant fields, into new properties.
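
A rough sketch of such an extraction (the err.Code / err.code field names are guessed from your example logs; adapt as needed):

```javascript
// Sketch only: hoist error codes out of the dynamic "context" entries into a
// flat field with a stable type, so it can be mapped and visualised safely.
// The err.Code / err.code field names are guessed from the example logs.
function extractErrorCodes(parsedObject) {
  if (!Array.isArray(parsedObject.context)) return parsedObject;
  var codes = [];
  parsedObject.context.forEach(function (entry) {
    if (entry && entry.err) {
      var code = entry.err.Code !== undefined ? entry.err.Code : entry.err.code;
      if (code !== undefined) codes.push(code);
    }
  });
  if (codes.length > 0) parsedObject.error_codes = codes;
  return parsedObject;
}

var doc = { context: [{ ctx: null, err: { Code: 400 } }], level: 'error' };
console.log(extractErrorCodes(doc).error_codes); // [ 400 ]
```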

I hope this helps!

@megastef Thanks for the awesome help and clarification!