This repo is for monitoring scripts by telegraf and others to keep an eye on servers. This howto and scripts assume you have a running grafana and influxDB locally or remotely on a different server. Then, this setting will provide data to them.
Before adding Influx repository, run this so that apt will be able to read the repository.
sudo apt-get update && sudo apt-get install apt-transport-https
Add the InfluxData key
wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/os-release
test $VERSION_ID = "7" && echo "deb https://repos.influxdata.com/debian wheezy stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
test $VERSION_ID = "8" && echo "deb https://repos.influxdata.com/debian jessie stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
test $VERSION_ID = "9" && echo "deb https://repos.influxdata.com/debian stretch stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
test $VERSION_ID = "10" && echo "deb https://repos.influxdata.com/debian buster stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
According to your system setup, you have multiple choices to do this (but only one works at a time usually :))
sudo systemctl enable telegraf.service
sudo update-rc.d telegraf defaults
Edit the provided telegraf_client.conf
and telegraf_server.conf
file by setting up the necessary details (and credentials) to grafana and influxDB.
If you first want to see how many stuffs can be enabled, then check first the default telegraf.conf
file at /etc/telegraf/telegraf.conf
The files do not containt a heavy amount of comments like the sample files for telegraf but what you have to change for server
is the following:
dc = "i4" # your datacenter id, if no datacenter, use, for instance, your department code
rack = "levi-rack" # define a rack ID here. If you monitor everything within a rack, just pick a random name
...
hostname = "banana" # this is important! This is how you can identify the monitored data coming from the server itself!
urls = ["http://127.0.0.1:8086"] # accordingly, set influxDB access properly
...
[[inputs.net]]
interfaces = ["eth0.101", "eth0.102", "br0", "eth0"] #use your interface you want to monitor
Note, my server monitoring server is a BananaPI, so there is no HDD/SSD to monitor.
Now, let's see the config file for the clients:
dc = "i4" # your datacenter id, if no datacenter, use, for instance, your department code
rack = "levi-rack" # define a rack ID here. If you monitor everything within a rack, just pick a random name
...
hostname = "server1" # this is important! This is how you can identify the monitored data coming from the server itself!
urls = ["http://192.168.1.1:8086"] # set the IP of the server
...
Update the rest of the telegraf configs according to your specific needs
Then, copy the config file to /etc/telegraf/
directory.
$ sudo cp telegraf_server.conf /etc/telegraf/telegraf.conf
$ sudo cp telegraf_client.conf /etc/telegraf/telegraf.conf
Sometimes, telegraf
wants to read a config file from /etc/default/telegraf.conf
, so it is safe to create a symlink there pointing to our configuration:
$ sudo ln -s /etc/telegraf/telegraf.conf /etc/default/telegraf.conf
sudo systemctl start telegraf
sudo systemctl status telegraf
sudo service telegraf restart
or if you have telegraf at /etc/init.d
, you can do the good-old way:
sudo /etc/init.d/telegraf restrt
sudo service telegraf status
You have to install lm-sensors
.
To enable HDDtemp, we need to install hddtemp
and run it as daemon.
sudo apt-get install hddtemp
Edit the configuration file /etc/default/hddtemp
, and set
RUN_DAEMON="true"
Restart hddtemp
sudo /etc/init.d/hddtemp restart
No worries, it was not working for me either :D Drive is many time in sleeping mode, but even if awaken the report 30C is not reflected in grafana...it sometimes showed 3-5C :(
Let's use the script in this repo by using telegraf's inputs.exec
.
To do, let's see how the provided hddtemp.sh
works.
It does nothing special, but by using sudo
and hddtemp <DEV_ID>
, it can get the necessary data!
Then, it is parsed in a way (via echo
) that influxDB understands.
However, user telegraf
has to have sudo
permission to read hddtemp. For this, we allow user telegraf
in the sudoers file.
In particular, we allow user telegraf
to call hddtemp
without password, and this is the only command telegraf
user can actually use.
Add the following line to /etc/sudoers
telegraf ALL= NOPASSWD: /usr/sbin/hddtemp
Now, add/uncomment the following lines in telegraf.conf
assuming your hostname is nidoking
and the drive you want to monitor is /dev/sda
(set these variables according to your needs):
[[inputs.exec]]
commands = [
"sh /home/user/monitoring/hddtemp.sh /dev/sda nidoking"
]
timeout = "5s"
data_format = "influx"
If you have a DHT22 sensor and you can wire it to your monitoring device, e.g., to the GPIO ports of the Raspberry pi, then you
we can use lol_dht22
submodule in this repo to read those values.
In order checkout the submodule, issue the following command:
git submodule update --init --recursive
This submodule is based on the wiringPI C library, so first we have to install it. Doing it on Raspberry PI is straightforward and finding materials and instruction via google is simple. In case you want to use it in a BananaPi Pro/R1/etc. (like we do), then please refer to this site
Then, configure and compile the library as usual:
./configure
make
sudo make install
If everything works fine, then the following command should do the job
sudo ./loldht 2
Raspberry Pi wiringPi DHT22 reader
www.lolware.net
Humidity = 44.60 % Temperature = 36.00 *C
in case you also wired the data to the second GPIO port.
In case your DHT22 module is connected to the monitoring system itself (not the reporting servers), modify the telegraf_server.conf
by adding the input.exec
blocks as in case of the HDDtemp above.
# # Read metrics from one or more commands that can output to stdout
[[inputs.exec]]
commands = [
"sh /home/user/monitoring/racktemp.sh"
]
timeout = "5s"
data_format = "influx"