ilovepancakes95/idrac_snmp-grafana

No data in grafana

juhler64 opened this issue · 20 comments

I have verified that idrac is set to snmp v1.

I have verified that I can see snmp data from command line.

I have verified that data is in the database.

I have verified that grafana connects to the database.

Using Grafana, Telegraf, and InfluxDB all on Ubuntu server 20.04

I have also done a query in Grafana on blank dashboard and can see data.

What am I missing?

You're going to have to be more specific. You say the issue is "No data in Grafana" but then you say you see data in Grafana when you run a query.

I am saying the dashboard (12106) that I installed is showing no data.

I can make a "blank" dashboard, query the database manually, and get data to show in a graph, for instance.

So if you go into one of the panels in the dashboard, take the exact query it shows, create a new panel on a blank dashboard and use the EXACT same query, it shows data on that query but not on original dashboard? If this is the case, either something was changed in the config of the dashboard or it's a problem specific to your grafana install.

I wasn't trying the EXACT query; I was following a troubleshooting guide: https://techexpert.tips/grafana/grafana-monitoring-snmp-devices/

I can try the EXACT query in a blank panel. There were no changes to the config (I assume you mean the Grafana config). I just chose Import via grafana.com and entered 12106. I have not tried the JSON file import.

sudo grafana-cli plugins install grafana-clock-panel
sudo grafana-cli plugins install flant-statusmap-panel

Would this have caused a problem? I couldn't find them in the plugin GUI.
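One way to check whether both panel plugins are actually installed is to compare the output of `grafana-cli plugins ls` against the two plugin ids above. The sketch below is a minimal helper for that comparison; the names `REQUIRED_PLUGINS` and `missing_plugins` are illustrative and not part of Grafana.

```shell
# Plugin ids installed in the two grafana-cli commands above.
REQUIRED_PLUGINS="grafana-clock-panel flant-statusmap-panel"

# Print every required plugin id that does not appear in the given listing text.
missing_plugins() {
  listing="$1"
  for id in $REQUIRED_PLUGINS; do
    case "$listing" in
      *"$id"*) ;;                      # found in the listing, nothing to report
      *) printf '%s\n' "$id" ;;        # not found, report it
    esac
  done
}

# Usage on the Grafana host (not executed here):
#   missing_plugins "$(grafana-cli plugins ls)"
```

An empty result means both ids were listed. Grafana needs a restart (for example `sudo systemctl restart grafana-server` on a package install) before newly installed plugins are loaded, which may be why they did not show up in the GUI right away.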

Possibly, can you send a screenshot of the whole dashboard?

Send your telegraf config please. Grafana isn't picking up on your idracs. This definitely seems like an issue with your telegraf setup as long as grafana is correctly connected to telegraf/influxDB. If the dashboard is loading up like that, it isn't a problem with the dashboard.

Please paste the telegraf config here.

# Telegraf Configuration
#
# Telegraf is entirely plugin driven. All metrics are gathered from the
# declared inputs, and sent to the declared outputs.
#
# Plugins must be declared in here to be active.
# To deactivate a plugin, comment out the name and any variables.
#
# Use 'telegraf -config telegraf.conf -test' to see what metrics a config
# file would generate.
#
# Environment variables can be used anywhere in this config file, simply surround
# them with ${}. For strings the variable must be within quotes (ie, "${STR_VAR}"),
# for numbers and booleans they should be plain (ie, ${INT_VAR}, ${BOOL_VAR})

# Global tags can be specified here in key="value" format.
[global_tags]
# dc = "us-east-1" # will tag all metrics with dc=us-east-1
# rack = "1a"
# Environment variables can be used as tags, and throughout the config file
# user = "$USER"

# Configuration for telegraf agent
[agent]
# Default data collection interval for all inputs
interval = "10s"
# Rounds collection interval to 'interval'
# ie, if interval="10s" then always collect on :00, :10, :20, etc.
round_interval = true

# Telegraf will send metrics to outputs in batches of at most
# metric_batch_size metrics.
# This controls the size of writes that Telegraf sends to output plugins.
metric_batch_size = 1000

# Maximum number of unwritten metrics per output. Increasing this value
# allows for longer periods of output downtime without dropping metrics at the
# cost of higher maximum memory usage.
metric_buffer_limit = 10000

# Collection jitter is used to jitter the collection by a random amount.
# Each plugin will sleep for a random time within jitter before collecting.
# This can be used to avoid many plugins querying things like sysfs at the
# same time, which can have a measurable effect on the system.
collection_jitter = "0s"

# Default flushing interval for all outputs. Maximum flush_interval will be
# flush_interval + flush_jitter
flush_interval = "10s"

# By default or when set to "0s", precision will be set to the same
# timestamp order as the collection interval, with the maximum being 1s.
#   ie, when interval = "10s", precision will be "1s"
#       when interval = "250ms", precision will be "1ms"
# Precision will NOT be used for service inputs. It is up to each individual
# service input to set the timestamp at the appropriate precision.
# Valid time units are "ns", "us" (or "µs"), "ms", "s".
precision = ""

# Override default hostname, if empty use os.Hostname()
hostname = ""
# If set to true, do not set the "host" tag in the telegraf agent.
omit_hostname = false

###############################################################################
# OUTPUT PLUGINS
###############################################################################

# Configuration for sending metrics to InfluxDB

[[outputs.influxdb]]
urls = ["http://127.0.0.1:8086"]
database = "monitoring"
username = "mon"
password = "supersecret"

[[processors.regex]]
[[processors.regex.fields]]
key = "log-dates"
pattern = "^(?P<YYYY>\\d{4})(?P<MM>\\d{2})(?P<DD>\\d{2})(?P<HH>\\d{2})(?P<mm>\\d{2})(?P<ss>\\d{2})\\.(?P\\d{6})(?P[-+]\\d{3,4})$"
replacement = "${YYYY}-${MM}-${DD} ${HH}:${mm}:${ss}"

[[inputs.snmp]]
agents = [ "10.0.0.95" ]
version = 1
community = "public"
name = "idrac-hosts"

[[inputs.snmp.field]]
name = "system-name"
oid = ".1.3.6.1.2.1.1.5.0"
is_tag = true

[[inputs.snmp.field]]
name = "system-osname"
oid = ".1.3.6.1.4.1.674.10892.5.1.3.6.0"

[[inputs.snmp.field]]
name = "system-osversion"
oid = ".1.3.6.1.4.1.674.10892.5.1.3.14.0"

[[inputs.snmp.field]]
name = "system-model"
oid = ".1.3.6.1.4.1.674.10892.5.1.3.12.0"

[[inputs.snmp.field]]
name = "idrac-url"
oid = ".1.3.6.1.4.1.674.10892.5.1.1.6.0"

[[inputs.snmp.field]]
name = "power-state"
oid = ".1.3.6.1.4.1.674.10892.5.2.4.0"

[[inputs.snmp.field]]
name = "system-uptime"
oid = ".1.3.6.1.4.1.674.10892.5.2.5.0"

[[inputs.snmp.field]]
name = "system-servicetag"
oid = ".1.3.6.1.4.1.674.10892.5.1.3.2.0"

[[inputs.snmp.field]]
name = "system-globalstatus"
oid = ".1.3.6.1.4.1.674.10892.5.2.1.0"

[[inputs.snmp.table]]
name = "idrac-hosts"
inherit_tags = [ "system-name" , "disks-name" ]

[[inputs.snmp.table.field]]
   name = "bios-version"
   oid = ".1.3.6.1.4.1.674.10892.5.4.300.50.1.8"

[[inputs.snmp.table.field]]
   name = "raid-batterystate"
   oid = ".1.3.6.1.4.1.674.10892.5.5.1.20.130.15.1.4"

[[inputs.snmp.table.field]]
   name = "intrusion-sensor"
   oid = ".1.3.6.1.4.1.674.10892.5.4.300.70.1.6"

[[inputs.snmp.table.field]]
   name = "disks-mediatype"
   oid = ".1.3.6.1.4.1.674.10892.5.5.1.20.130.4.1.35"

[[inputs.snmp.table.field]]
   name = "disks-state"
   oid = ".1.3.6.1.4.1.674.10892.5.5.1.20.130.4.1.4"

[[inputs.snmp.table.field]]
   name = "disks-predictivefail"
   oid = ".1.3.6.1.4.1.674.10892.5.5.1.20.130.4.1.31"

[[inputs.snmp.table.field]]
   name = "disks-capacity"
   oid = ".1.3.6.1.4.1.674.10892.5.5.1.20.130.4.1.11"

[[inputs.snmp.table.field]]
   name = "disks-name"
   oid = ".1.3.6.1.4.1.674.10892.5.5.1.20.130.4.1.2"
   is_tag = true

[[inputs.snmp.table.field]]
   name = "memory-status"
   oid = ".1.3.6.1.4.1.674.10892.5.4.200.10.1.27"

[[inputs.snmp.table.field]]
   name = "storage-status"
   oid = ".1.3.6.1.4.1.674.10892.5.2.3"

[[inputs.snmp.table.field]]
   name = "temp-status"
   oid = ".1.3.6.1.4.1.674.10892.5.4.200.10.1.63"

[[inputs.snmp.table.field]]
   name = "psu-status"
   oid = ".1.3.6.1.4.1.674.10892.5.4.200.10.1.9"

[[inputs.snmp.table.field]]
   name = "log-dates"
   oid = ".1.3.6.1.4.1.674.10892.5.4.300.40.1.8"

[[inputs.snmp.table.field]]
   name = "log-entry"
   oid = ".1.3.6.1.4.1.674.10892.5.4.300.40.1.5"

[[inputs.snmp.table.field]]
   name = "log-severity"
   oid = ".1.3.6.1.4.1.674.10892.5.4.300.40.1.7"

[[inputs.snmp.table.field]]
   name = "log-number"
   oid = ".1.3.6.1.4.1.674.10892.5.4.300.40.1.2"
   is_tag = true

[[inputs.snmp.table.field]]
   name = "nic-name"
   oid = ".1.3.6.1.4.1.674.10892.5.4.1100.90.1.30"
   is_tag = true

[[inputs.snmp.table.field]]
   name = "nic-vendor"
   oid = ".1.3.6.1.4.1.674.10892.5.4.1100.90.1.7"

[[inputs.snmp.table.field]]
   name = "nic-status"
   oid = ".1.3.6.1.4.1.674.10892.5.4.1100.90.1.4"

[[inputs.snmp.table.field]]
   name = "nic-current_mac"
   oid = ".1.3.6.1.4.1.674.10892.5.4.1100.90.1.15"
   conversion = "hwaddr"

[[inputs.snmp.field]]
name = "fan1-speed"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.12.1.6.1.1"

[[inputs.snmp.field]]
name = "fan2-speed"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.12.1.6.1.2"

[[inputs.snmp.field]]
name = "fan3-speed"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.12.1.6.1.3"

[[inputs.snmp.field]]
name = "fan4-speed"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.12.1.6.1.4"

[[inputs.snmp.field]]
name = "fan5-speed"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.12.1.6.1.5"

[[inputs.snmp.field]]
name = "fan6-speed"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.12.1.6.1.6"

[[inputs.snmp.field]]
name = "inlet-temp"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.20.1.6.1.1"

[[inputs.snmp.field]]
name = "exhaust-temp"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.20.1.6.1.2"

[[inputs.snmp.field]]
name = "cpu1-temp"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.20.1.6.1.3"

[[inputs.snmp.field]]
name = "cpu2-temp"
oid = ".1.3.6.1.4.1.674.10892.5.4.700.20.1.6.1.4"

[[inputs.snmp.field]]
name = "cmos-batterystate"
oid = ".1.3.6.1.4.1.674.10892.5.4.600.50.1.6.1.1"

[[inputs.snmp.field]]
name = "system-watts"
oid = ".1.3.6.1.4.1.674.10892.5.4.600.30.1.6.1.3"

It looks like you're missing the port number after the IP address for your idrac. Mine uses the default, 161, so add that (or whatever port you set) after the IP address in the config. Without the port, and with no data showing up in this dashboard, I'm not sure how you "have verified that data is in the database," because no port number would mean no data coming into telegraf/influx from idrac at all. Add the correct port number and retest.
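A quick way to test the port theory is to query the idrac directly with an explicit port, then have Telegraf run only its SNMP input once. The sketch below assumes net-snmp's `snmpget` is installed and that the config file is at `/etc/telegraf/telegraf.conf`; both are assumptions, and the function names are illustrative.

```shell
# Append a port (default 161, the standard SNMP port) when the address has none.
# Addresses that already contain a colon are returned unchanged.
agent_with_port() {
  case "$1" in
    *:*) printf '%s\n' "$1" ;;
    *)   printf '%s:%s\n' "$1" "${2:-161}" ;;
  esac
}

# Read sysName (.1.3.6.1.2.1.1.5.0, the "system-name" field in the config)
# over SNMP v1. $1 = agent address, $2 = community (defaults to "public").
check_snmp() {
  snmpget -v1 -c "${2:-public}" "$(agent_with_port "$1")" .1.3.6.1.2.1.1.5.0
}

# Gather once from the snmp input only and print the metrics to stdout
# instead of writing them to InfluxDB. $1 = config path (assumed default).
check_telegraf() {
  telegraf --config "${1:-/etc/telegraf/telegraf.conf}" --input-filter snmp --test
}

# Usage (not executed here):
#   check_snmp 10.0.0.95
#   check_telegraf
```

If `check_snmp` returns the system name but `check_telegraf` prints no `idrac-hosts` metrics, the problem is more likely in the Telegraf config than on the idrac side.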

I added the port. No go. Here is what I meant about getting data from database:
(Screenshots: grafanacmos2, grafanacmos)

This has gotta be something wrong with your grafana install or the importing of the dashboard because on mine, the query that shows for cmos battery is exactly the same, yet dashboard works fine for me. And it's weird that when you set the query yourself in a new dash it works. Something in grafana on your end (maybe the way the variables are set or imported) is messed up. What grafana version are you on?

I am sure it is something like that.

I am using v7.5.2

I'm on 7.4.3. Really not sure what else to try. Did you try re-importing the dashboard and starting from scratch? I'll leave this issue open for now in case anyone else has any ideas.

If you go to the idrac variable settings, do you see your idrac hosts at the bottom, in the preview section? See screenshot (apologies, but I’m on mobile).
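The same check can be made outside Grafana by asking InfluxDB which host tag values exist. The sketch below assumes the InfluxDB 1.x `influx` CLI and that the dashboard's host variable is built from the `system-name` tag of the `idrac-hosts` measurement; the thread does not show the variable's actual query, so treat that as an assumption. The function names are illustrative.

```shell
# Build the InfluxQL statement that lists the values of one tag key in one measurement.
# $1 = measurement, $2 = tag key
tag_values_query() {
  printf 'SHOW TAG VALUES FROM "%s" WITH KEY = "%s"\n' "$1" "$2"
}

# Run it with the InfluxDB 1.x CLI. Database and credentials are the ones
# from the [[outputs.influxdb]] section of the config pasted above.
list_idrac_hosts() {
  influx -database monitoring -username mon -password supersecret \
    -execute "$(tag_values_query idrac-hosts system-name)"
}

# Usage on the InfluxDB host (not executed here):
#   list_idrac_hosts
```

If this returns no rows, the variable preview in Grafana would be empty as well, and every panel that filters on that variable would show no data.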

Okay I found this.

Not getting any data. Going through it all now.

Got this working, but I had to remove the hosts wildcard and enter the actual host name in each metric.

Strange. Something is wrong I suppose with the way the variable is being pulled into the query then, but not sure what since according to your posts above, it appears the query is constructed the same way mine is. Either way, glad you at least got data to display.
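To narrow down whether the variable or the data is at fault, the same field can be queried from the InfluxDB CLI twice: once with a regex match on the host tag and once with a literal host name. This is only a sketch. The exact query the dashboard builds is not shown in the thread, so the regex form, the function names, and the host name `my-idrac-host` are assumptions.

```shell
# Build a query for the latest cmos-batterystate value, filtered by a condition
# on the "system-name" tag (field and tag names come from the Telegraf config above).
# $1 = condition, for example: =~ /.*/   or   = 'my-idrac-host'
cmos_query() {
  printf 'SELECT last("cmos-batterystate") FROM "idrac-hosts" WHERE "system-name" %s\n' "$1"
}

# Run a statement with the InfluxDB 1.x CLI, using the database and
# credentials from [[outputs.influxdb]] in the pasted config.
run_query() {
  influx -database monitoring -username mon -password supersecret -execute "$1"
}

# Usage (not executed here); "my-idrac-host" is a hypothetical host name:
#   run_query "$(cmos_query "=~ /.*/")"             # wildcard match on the tag
#   run_query "$(cmos_query "= 'my-idrac-host'")"   # literal host name
```

If the literal form returns a value and the regex form does not, the `system-name` tag is probably missing or empty on those points, which would also explain why a variable-based filter finds nothing.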