shamil/graphout

graphite metric values passed by graphout are different than provided by graphite

dutchiechris opened this issue · 4 comments

it seems that graphout logs a value that is different than as returned by graphite

From Graphite:
[{"target": "netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy", "datapoints": [[25.33446831696655, 1461842400], [25.7734740653523, 1461842520], [24.3570468278944, 1461842640], [23.711759390607803, 1461842760], [20.9118678938947, 1461842880], [19.24948126962095, 1461843000], [17.3139677711872, 1461843120], [16.9410851940636, 1461843240], [18.42542053910585, 1461843360], [17.364125610583102, 1461843480], [17.9149330001911, 1461843600], [20.77903744127045, 1461843720], [24.8942426859944, 1461843840], [24.571219714625748, 1461843960], [24.243233581547052, 1461844080], [25.46816505205415, 1461844200], [39.16704786828345, 1461844320], [39.40435667720285, 1461844440], [36.336761017686854, 1461844560], [22.27124974191695, 1461844680], [18.248599238215952, 1461844800], [16.91937757664115, 1461844920], [17.973630250759697, 1461845040], [18.342401167242002, 1461845160], [18.70834590831095, 1461845280], [18.0758846361824, 1461845400], [18.6187341950368, 1461845520], [31.877802993933052, 1461845640], [28.141148342675102, 1461845760], [32.63457091455205, 1461845880], [24.5021316754224, 1461846000], [17.8243685593147, 1461846120], [16.870662293467298, 1461846240], [16.56807217429585, 1461846360], [18.128912716322553, 1461846480], [17.182529652129297, 1461846600], [19.72172020432835, 1461846720], [18.298302881740298, 1461846840], [25.06567155973265, 1461846960], [37.972886343796546, 1461847080], [36.64555787314655, 1461847200], [23.7312807510069, 1461847320], [22.4088348458965, 1461847440], [23.09373883044265, 1461847560], [16.236113348037648, 1461847680], [16.26398928346125, 1461847800], [17.06447199255425, 1461847920], [16.961864672827247, 1461848040], [19.62282370327315, 1461848160], [20.3817087678584, 1461848280]]}]

From /tmp/logoutput.log:

2016-04-28 20:17:47 severity="info" result="25.5372" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:18:47 severity="info" result="25.5465" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:19:47 severity="info" result="25.5524" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:20:47 severity="info" result="25.5496" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:21:47 severity="info" result="25.5442" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:22:47 severity="info" result="25.5406" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:23:47 severity="info" result="25.5363" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:24:48 severity="info" result="25.5319" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:25:48 severity="info" result="25.5261" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:26:48 severity="info" result="25.5212" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:27:48 severity="info" result="25.519" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min"
 until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:28:48 severity="info" result="25.5176" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:29:48 severity="info" result="25.5154" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:30:48 severity="info" result="25.5133" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:31:48 severity="info" result="25.5123" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:32:48 severity="info" result="25.5123" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:33:48 severity="info" result="25.5112" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:34:48 severity="info" result="25.5098" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:35:48 severity="info" result="25.5085" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:36:48 severity="info" result="25.5146" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:37:48 severity="info" result="25.5271" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:38:43 severity="info" result="25.5378" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:43:47 severity="info" result="25.5603" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:45:53 severity="info" result="25.5607" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min
" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:46:18 severity="info" result="25.5659" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:47:18 severity="info" result="25.5654" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:48:18 severity="info" result="25.5656" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:49:18 severity="info" result="25.5641" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:50:06 severity="info" result="25.5592" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:51:06 severity="info" result="25.5538" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:52:06 severity="info" result="25.5468" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:53:06 severity="info" result="25.5413" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:54:32 severity="info" result="25.5352" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:55:42 severity="info" result="25.5266" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10mi
n" until="now" name="sdt-cdot1-01.avg_processor_busy"

my "/etc/graphout/graphout.json" file:

/**
 * example Graphout configuration file
 */
{
    // graphite-web options
    "graphite_url": "http://sdt-graphite.nltestlab.hq.netapp.com:81",

    // log file options
    "log_file": "/dev/stdout",
    "log_level": "debug",

    // query interval (in seconds)
    "interval": 60,

    // delay each query by consistent random of seconds
    // if enabled, delay between 1 second and the query interval
    "splay": false,

    // queries section
    "queries":
    {
        "sdt-cdot1-01.avg_processor_busy":
        {
            "query": "netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy",
            "from": "-1min",
            "until": "now"
        }
    },

    // outputs section
    "outputs":
    {
        "logfile":
        {
            "output": "./logoutput",
            "params": {
                "path": "/tmp/logoutput.log"
            }
        },
        "zabbix":
        {
                "output": "graphout-output-zabbix",
                "params":
                {
                        "host": "localhost",
                        "port": 10051,
                        "target": "monitor",
                        "namespace": "graphout"
                }
        }
    }
}

Am I doing something wrong here or is there a bug?

Graphout does an aggregation of all the datapoints that were returned from Graphite. By default, it's average calculation, you can also use max, min. Here is an example

// Alphanumeric unique output name
"logfile":
{
    // ouput module name, Graphaut will use "require" to load the module
    "output": "./logoutput",

    // filter can be used to process only matched queries (using regular expression)
    // default: all queries are processed by the outputs.
    "filter": ".*",

    // the calculation method of the values received from Graphite
    // available methods: "avg", "min", "max"
    // default: "avg"
    "calculation": "avg"

    // "params" properties are specific to the "output" module
    "params": {
        "path": "/tmp/logoutput.log"
    }
}

Also note that, Graphout uses the maxDataPoints API option, to return 60 consolidated data points at most (if you have at least Graphite 0.9.13)

So to sum it up the Graphout results you see are expected.

Hi @shamil
Thanks for your response!
Since I am passing the graphite render API from="-1min" until="now" shouldn't that only be one data point (my metric frequency in Graphite is also 1min) making the graphout avg only be of a single data point? I also thought it could be a timing issue with the -1min bucket not being filled yet and tried from="-10min", which should do the avg of 10 data points as you mention, and saw no change in behavior.

I also just did an average of all the raw graphite datapoints posted earlier and it is 22.57, while graphout was predictably 25.5xxx. I think Graphout is somehow manipulating the data incorrectly

Hi @dutchiechris

You are right, I had a bug in the Graphite query generation, the from parameter was not used properly.
I've committed a fix and pushed to NPM version 0.3.3.

Let me know if this fixed your issue.
Thanks for reporting...

I assume the latest version fixed your issue. Please re-open if otherwise.