syepes/VSphere2Metrics

No metrics in series (influxdb)

pez252 opened this issue · 4 comments

I am running into an issue getting metrics to appear in Grafana, and am fairly sure the issue is between VSphere2Metrics and InfluxDB.

In Grafana I can select a series that did not exist when the DB was newly created, so the traffic IS reaching InfluxDB and the series is being created. Here are some log snippets and other details from my troubleshooting.

19-05-2016 - 13:13:48.971 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Connected to vSphere (https://esx.DOMAIN.TLD/sdk) in 0.082 seconds
19-05-2016 - 13:13:49.014 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Getting ESXi Hosts from: ha-folder-root
19-05-2016 - 13:13:49.017 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Found 1 Host Systems
19-05-2016 - 13:13:49.066 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Getting VMs from: ha-folder-root
19-05-2016 - 13:13:49.069 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Found 11 Virtual Machines
19-05-2016 - 13:13:49.306 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Collected Host metrics in 0.237 seconds
19-05-2016 - 13:13:49.951 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Collected Guest metrics in 0.644 seconds
19-05-2016 - 13:13:49.957 [scripthost] 4585:[esx] ERROR com.allthingsmonitoring.vmware.VSphere2Metrics - getEvantsInfluxDB: VI SDK invoke exception:com.vmware.vim25.NotImplemented
19-05-2016 - 13:13:50.441 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Finished Building InfluxDB Metrics in 0.484 seconds
19-05-2016 - 13:13:50.536 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.utils.MetricClient - Ping InfluxDB: OK
19-05-2016 - 13:13:50.536 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.utils.MetricClient - Sending 10 InfluxDB Metrics
19-05-2016 - 13:13:50.939 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.utils.MetricClient - Finished sending 10 Metrics (mBuffer: 0) to InfluxDB in 0.407 seconds
19-05-2016 - 13:13:50.943 [scripthost] 4585:[esx] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Disconected from vSphere in 0.004 seconds
19-05-2016 - 13:13:50.946 [scripthost] 4585:[main] INFO  com.allthingsmonitoring.vmware.VSphere2Metrics - Finished Collecting and Sending vSphere Metrics in 2.062 seconds

The config:

vcs.urls                   = ['https://esx.DOMAIN.TLD/sdk']
vcs.user                   = 'readonly'
vcs.pwd                    = 'ENCRYPTED PASSWORD'                // Generated password
vcs.timezone               = 'GMT-5'
vcs.perfquery_timeout      = 60                // Metrics retrieval timeout in Seconds
vcs.perf_max_samples       = 15                // Last 1 Minute (3x20s = 60s)
                                               // Last 2 Minute (6x20s = 120s)
                                               // Last 5 Minute (15x20s = 300s)
                                               // Last 10 Minute (30x20s = 600s)

destination.type          = 'InfluxDB'         // 'Graphite', 'InfluxDB', 'Both'
destination.timezone      = 'America/NewYork'

// Graphite Settings
graphite.host             = 'influxdb.DOMAIN.TLD'
graphite.port             = 2003               // 2004 (pickle), 2003 (standard)
graphite.mode             = ''           // pickle, null
graphite.prefix           = 'ESXi'             // Graphite Metric prefix

// InfluxDB Settings
influxdb.host             = 'influxdb.DOMAIN.TLD'
influxdb.port             = 8086
influxdb.protocol         = 'http' // 'http', 'http-compression'
influxdb.auth             = 'root:PASSWORD'        // 'user:passwd'
influxdb.database         = 'ESXi'
influxdb.retentionPolicy  = null               // Specific retentionPolicy 'name' or null to use the default database one

I monitored the traffic using ngrep -d eth0 -W byline and I can see the data being sent:

net_packetsTx_summation-number,host=esx,instance=vmnic1,server=influx,type=Guest value=23.0 1463549940
net_packetsTx_summation-number,host=esx,instance=vmnic1,server=influx,type=Guest value=3114.0 1463549960
net_packetsTx_summation-number,host=esx,instance=vmnic1,server=influx,type=Guest value=24.0 1463549980

If I copy and paste some of those lines into the "Write Data" dialog box in the InfluxDB admin interface, it successfully writes the metrics. If I send a few lines of the data with curl -i -XPOST 'http://influxdb.DOMAIN.TLD:8086/write?db=ESXi' --data-binary, it also successfully writes the metrics.
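
The last field on each captured line above is an epoch timestamp in seconds, so one quick sanity check is to decode it and compare it with the wall-clock time in the log. A minimal Python sketch, assuming the simple line-protocol shape shown above (no escaped spaces or commas); `parse_point` is a hypothetical helper, not part of VSphere2Metrics or InfluxDB:

```python
from datetime import datetime, timezone


def parse_point(line):
    """Split one line-protocol line into (measurement, tags, value, time)."""
    # Assumes exactly three space-separated parts: key, one field, timestamp.
    key, field, ts = line.strip().split(" ")
    measurement, *tag_pairs = key.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    value = float(field.split("=", 1)[1])
    # The timestamp is treated as epoch seconds and rendered in UTC.
    when = datetime.fromtimestamp(int(ts), tz=timezone.utc)
    return measurement, tags, value, when


sample = ("net_packetsTx_summation-number,host=esx,instance=vmnic1,"
          "server=influx,type=Guest value=23.0 1463549940")
measurement, tags, value, when = parse_point(sample)
print(measurement, tags["instance"], value, when.isoformat())
```

If the decoded time is far from when the sample was actually collected, the point may exist in the database but fall outside the time range Grafana is querying.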

Any thoughts on where the VSphere2Metrics -> InfluxDB breakdown may be?

Thanks!

Hi,

I'm seeing the same thing, I think. I haven't looked at the traffic using ngrep, but I have both InfluxDB and Graphite hosts. Other metrics are being sent successfully to other InfluxDB databases on the same host as my vsphere2metrics DB, and other metrics are also being sent successfully to my Graphite DB.

However, if I point vsphere2metrics at either InfluxDB or Graphite, I see that the series/whisper files are created, but no stats are added, e.g. for Graphite:

python whisper-fetch.py --from=$Tneg15 --pretty /opt/graphite/storage/whisper/ESXi/esxi-01/Host/cpu/8/utilization_average-percent.wsp
Fri May 20 13:10:00 2016 None
Fri May 20 13:11:00 2016 None
Fri May 20 13:12:00 2016 None
Fri May 20 13:13:00 2016 None
Fri May 20 13:14:00 2016 None
Fri May 20 13:15:00 2016 None
Fri May 20 13:16:00 2016 None
Fri May 20 13:17:00 2016 None
Fri May 20 13:18:00 2016 None
Fri May 20 13:19:00 2016 None

I am also having the same issue as #7.

Pez, I now have metrics flooding in.

In config.groovy I had:
vcs.timezone = 'GMT-5'
destination.timezone = 'Europe/Paris'

(as my VC server is in EST, and the vsphere2metrics and influxDB in Europe)

Using ./VSphere2Metrics -dm and http://www.epochconverter.com/, I noticed that the Unix timestamps being written to InfluxDB were in the future. After changing vcs.timezone to GMT-0 and re-running ./VSphere2Metrics -dm, the timestamps were correct.
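
To illustrate why a wrong vcs.timezone can push points into the future: if a sample carries a wall-clock time that is really UTC but gets interpreted as GMT-5, the resulting epoch lands five hours ahead. A sketch in Python with a made-up wall-clock value; how VSphere2Metrics applies vcs.timezone internally is an assumption here, not something confirmed from its source:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical naive wall-clock time reported for a sample (no zone attached).
wall_clock = datetime(2016, 5, 19, 13, 13, 50)


def to_epoch(naive, utc_offset_hours):
    """Interpret a naive time at the given UTC offset; return epoch seconds."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return int(naive.replace(tzinfo=tz).timestamp())


as_gmt0 = to_epoch(wall_clock, 0)         # comparable to vcs.timezone = 'GMT-0'
as_gmt_minus5 = to_epoch(wall_clock, -5)  # comparable to vcs.timezone = 'GMT-5'

# The same wall-clock value read as GMT-5 is 5 hours later in absolute time.
print(as_gmt_minus5 - as_gmt0)  # 18000
```

A point stamped five hours ahead is still stored, which would match the symptom of series existing while the recent time range stays empty.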

They are also correct in Grafana in both the UK and the US, as it converts timestamps to the browser's local time.

I see you have vcs.timezone and destination.timezone in EST/EDT, but perhaps destination.timezone is not being correctly recognised. (My setup should really be London, which is an hour different from Paris!)

Hope this helps. Tim

(I still have #7 to deal with..)

You're correct! It's always nice when it ends up being something simple...

I have not seen #7 yet... Only running against a test host with local storage right now. We'll see what happens when I pull from production.

Thanks for the assistance Tim, and thanks for the project syepes.

Hello @pez252 and @timdicon,

Sorry for the delay, I have only just seen your issue; I have too many GitHub watches ;-)
The best way to actually debug/resolve this issue is to use the "-dm" option, as you found, and make sure that the output epoch timestamps match your current TZ by adjusting vcs.timezone and destination.timezone.

I created this option because we had some VCs that were configured with different TZs.
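
A rough way to automate that check is to compare each epoch timestamp with the current time and flag anything ahead of it. A minimal Python sketch; `future_timestamps` and the 60-second tolerance are illustrative choices, and the epoch values would need to be copied by hand from the "-dm" output, whose exact format is not shown in this thread:

```python
import time


def future_timestamps(epochs, now=None, tolerance=60):
    """Return epoch-second values more than `tolerance` seconds after `now`."""
    if now is None:
        now = time.time()
    return [ts for ts in epochs if ts > now + tolerance]


# Reference time fixed for reproducibility: 2016-05-18 05:40:00 UTC.
now = 1463550000
# Two plausible samples plus one shifted five hours ahead.
epochs = [1463549940, 1463549960, 1463549940 + 5 * 3600]
print(future_timestamps(epochs, now=now))  # [1463567940]
```

An empty list suggests the timezone settings line up; any flagged value points at vcs.timezone or destination.timezone as the first thing to revisit.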