marianogappa/chart

Line graphing requests with grouping

dustinblackman opened this issue · 2 comments

Overview

I'm attempting to use chart to graph a large number of outbound requests. The data my log tool outputs looks like this:

{"timestamp":"2017-08-07T23:59:21.434927285Z"}
{"timestamp":"2017-08-07T23:59:21.4703704Z"}
{"timestamp":"2017-08-07T23:59:21.797074466Z"}
{"timestamp":"2017-08-07T23:59:21.891435198Z"}
{"timestamp":"2017-08-07T23:59:21.993443695Z"}

Using chart, I was expecting something like the following to produce a line graph that groups the entries by timestamp and raises the Y value for each group. I'm obviously doing something wrong, as I get a flat line instead.

[Screenshot, 2017-08-09: chart output showing a flat line]

The command I had come up with to handle the format my logs output is the following:

cat test.json | jq -r .timestamp | awk '{print $0 "\t" "1"}' | chart line log --date-format 2006-01-02T15:04:05.999999999Z

Is grouping on the Y axis something that chart supports, or is it something that should be done beforehand? If anyone would like to play with something like this themselves, I've uploaded ~200,000 lines of logs in the above format: http://www38.zippyshare.com/v/xAMPdzeg/file.html

@dustinblackman thanks for considering chart!

We could add that, but it gets tricky and complex once you evaluate the options. What's the bucket you want to aggregate to? Seconds? Minutes? Is the aggregation strictly the "plus" operation? Instead, I think it's simpler to do this in bash as a preprocessing step (no need to learn tool parameters).

Let's say you want to aggregate per second:

Remove the subsecond content with sed, append a count of 1 to each line with awk, and then pipe the result through this groupby function:

cat test.json | jq -r .timestamp | sed 's/\(.*\)\..*/\1/g' | awk '{print $0 "\t" "1"}' | groupby

Where groupby is just another awk

groupby ()
{
    awk '{arr[$1]+=$2}END {for (key in arr) printf("%s\t%s\n", key, arr[key])}' | sort -nk1,1
}
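To see what the per-second pipeline above produces, here is a minimal sketch that feeds it raw timestamps directly, so jq is not needed. The `count_per_second` wrapper name is hypothetical, the two `:22` timestamps are made up for illustration, and the sort is done lexically with `LC_ALL=C`, which is enough for ISO-style timestamps.

```shell
# groupby: sum column 2 for each distinct value of column 1 (as in the thread),
# then sort the keys lexically; ISO timestamps sort chronologically that way.
groupby() {
    awk '{arr[$1]+=$2} END {for (key in arr) printf("%s\t%s\n", key, arr[key])}' |
        LC_ALL=C sort -k1,1
}

# count_per_second (hypothetical name): timestamps on stdin, one per line.
# Strips everything from the last "." onward, tags each line with a 1,
# and lets groupby add the 1s up per second.
count_per_second() {
    sed 's/\(.*\)\..*/\1/' | awk '{print $0 "\t" "1"}' | groupby
}

# The five sample timestamps from the issue, plus two made-up ones in second :22.
printf '%s\n' \
    '2017-08-07T23:59:21.434927285Z' \
    '2017-08-07T23:59:21.4703704Z' \
    '2017-08-07T23:59:21.797074466Z' \
    '2017-08-07T23:59:21.891435198Z' \
    '2017-08-07T23:59:21.993443695Z' \
    '2017-08-07T23:59:22.101Z' \
    '2017-08-07T23:59:22.202Z' |
    count_per_second
# Prints (tab-separated):
# 2017-08-07T23:59:21	5
# 2017-08-07T23:59:22	2
```

Each output row is one point on the line graph: the second on the X axis and the number of requests in that second on the Y axis.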

I recommend using the "Stamp" sugar for rsyslog-related date time formats with no subsecond

cat test.json | jq -r .timestamp | sed 's/\(.*\)\..*/\1/g' | awk '{print $0 "\t" "1"}' | groupby | chart line log --date-format Stamp
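If a coarser bucket is wanted, the same idea should carry over by truncating more of the timestamp. The sketch below aggregates per minute; `count_per_minute` is a hypothetical name, the sample timestamps are made up, and the `--date-format` layout mentioned in the comment is an assumption based on the Go-style layout used earlier in this thread, not something verified against chart.

```shell
# count_per_minute (hypothetical name): timestamps on stdin, one per line.
# Cutting at the last ":" drops seconds and subseconds, leaving e.g.
# 2017-08-07T23:59, which then serves as the grouping key.
count_per_minute() {
    sed 's/\(.*\):.*/\1/' |
        awk '{count[$1]++} END {for (k in count) printf("%s\t%d\n", k, count[k])}' |
        LC_ALL=C sort -k1,1
}

# Made-up timestamps spanning two minutes.
printf '%s\n' \
    '2017-08-07T23:59:21.434927285Z' \
    '2017-08-07T23:59:59.900Z' \
    '2017-08-08T00:00:01.000Z' |
    count_per_minute
# Prints (tab-separated):
# 2017-08-07T23:59	2
# 2017-08-08T00:00	1

# Assumption: the result could then be piped to something like
#   chart line log --date-format 2006-01-02T15:04
# if chart accepts that layout; this is untested here.
```

Counting with `count[$1]++` replaces the "append a 1, then sum" step; it only works because the aggregation here is a plain count.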

Does this work for you?

If suddenly many people end up needing this use case, we should definitely consider adding some parameter. Before that, I'd err on the side of simplicity.

Please let me know what you think. Cheers.

Oh, these helper functions are beautiful. This is a great way to accomplish what I was looking for with chart! I think I prefer the approach where data is "massaged" before it enters chart; it was wrong of me to think otherwise. Thanks for the help!