New to Netdata? Here is a live demo: http://my-netdata.io
Netdata is a system for distributed real-time performance and health monitoring.
It provides unparalleled real-time insights into everything happening on the systems it runs on (including containers and applications such as web and database servers), using modern interactive web dashboards.
Netdata is fast and efficient, designed to permanently run on all systems (physical & virtual servers, containers, IoT devices), without disrupting their core function.
Netdata currently runs on Linux, FreeBSD, and macOS.
Netdata is a monitoring agent you install on all your systems.
It is:
- a metrics collector - for system and application metrics (including web servers, databases, containers, etc)
- a time-series database - all stored in memory (does not touch the disks while it runs)
- a metrics visualizer - super fast, interactive, modern, optimized for anomaly detection
- an alarms notification engine - an advanced watchdog for detecting performance and availability issues
All packaged together in a very flexible, extremely modular, distributed application.
This is how netdata compares to other monitoring solutions:
netdata | others (open-source and commercial) |
---|---|
High resolution metrics (1s granularity) | Low resolution metrics (10s granularity at best) |
Monitors everything, thousands of metrics per node | Monitor just a few metrics |
UI is super fast, optimized for anomaly detection | UI is good for just an abstract view |
Meaningful presentation for all metrics (educational) | You have to know the metrics before you start |
Install and get results immediately | A long preparation is required to get any useful results |
Use it to troubleshoot performance problems | Use them to get statistics of past performance |
Kills the console for tracing performance issues | The console is required for troubleshooting |
Requires zero dedicated resources | Require dedicated resources |
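To make the granularity row of the table concrete, here is a minimal sketch of what averaging does to a short spike. The sample values and the `downsample` helper are made up for illustration; they are not netdata code or real measurements.

```python
# Hypothetical CPU utilization samples (percent), one per second.
per_second = [5, 5, 5, 100, 5, 5, 5, 5, 5, 5]


def downsample(samples, window):
    """Average consecutive groups of `window` samples into one point each."""
    points = []
    for start in range(0, len(samples), window):
        group = samples[start:start + window]
        points.append(sum(group) / len(group))
    return points


# At 1s granularity the spike is fully visible.
print(max(per_second))             # prints 100
# At 10s granularity the same spike is averaged away.
print(downsample(per_second, 10))  # prints [14.5]
```

A one-second burst to 100% shows up as a 14.5% point once ten samples are averaged, which is why low resolution metrics can hide exactly the events you are trying to troubleshoot.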
Netdata is free, super fast, very easy, completely open, flexible and integrate-able. It has been designed by SysAdmins, DevOps and Developers for troubleshooting performance problems, not just visualizing metrics.
WARNING:
People get addicted to netdata!
Once you install it and use it for a few minutes, there is no going back! You have been warned...
You can quickly install netdata on a Linux server with the following:
    # make sure you run `bash` for your shell
    bash
    # install netdata, directly from github source
    bash <(curl -Ss https://my-netdata.io/kickstart.sh)
More installation methods can be found at the installation page.
Since May 16th 2016 (the date the global public netdata registry was released):
Nov 6th, 2018 - netdata v1.11.0 released!
- New query engine, supporting statistical functions.
- Fixed security issues identified by Red4Sec.com and Synacktiv.
- New Data Collection Modules: `rethinkdbs`, `proxysql`, `litespeed`, `uwsgi`, `unbound`, `powerdns`, `dockerd`, `puppet`, `logind`, `adaptec_raid`, `megacli`, `spigotmc`, `boinc`, `w1sensor`, `monit`, `linux_power_supplies`.
- Improved Data Collection Modules: `statsd.plugin`, `apps.plugin`, `freeipmi.plugin`, `proc.plugin`, `diskspace.plugin`, `freebsd.plugin`, `python.d.plugin`, `web_log`, `nginx_plus`, `ipfs`, `fail2ban`, `ceph`, `elasticsearch`, `redis`, `beanstalk`, `mysql`, `varnish`, `couchdb`, `phpfpm`, `apache`, `icecast`, `mongodb`, `postgres`, `mdstat`, `openvpn_log`, `snmp`, `nut`.
- Added alarms for detecting abnormally high load average, `TCP` `SYN` and `TCP` accept queue overflows, network interfaces congestion and alarms for `bcache`, `mdstat`, `apcupsd`, `mysql`.
- System alarms are now enabled on FreeBSD.
- New notification methods: rocket.chat, Microsoft Teams, syslog, fleep.io, Amazon SNS.
- And dozens more improvements, enhancements, features and compatibility fixes.
Sep 18, 2018 - netdata has its own organization
Netdata used to be a firehol.org project, accessible as `firehol/netdata`.
Netdata now has its own github organization `netdata`, so all github URLs are now `netdata/netdata`. The old github URLs, repo clones, forks, etc. redirect automatically to the new repo.
Jun 16, 2018 - netdata in CNCF
Netdata is now at the Cloud Native Computing Foundation (CNCF) landscape.
Read the netdata presentation we gave at CNCF TOC on Sep 18, 2018.
This is a high level overview of the netdata feature set and architecture.
Click it to interact with it (it has direct links to documentation).
- Stunning interactive bootstrap dashboards - mouse and touch friendly, in 2 themes: dark, light
- Amazingly fast - responds to all queries in less than 0.5 ms per metric, even on low-end hardware
- Highly efficient - collects thousands of metrics per server per second, with just 1% CPU utilization of a single core, a few MB of RAM and no disk I/O at all
- Sophisticated alerting - hundreds of alarms, out of the box! Supports dynamic thresholds, hysteresis, alarm templates, multiple role-based notification methods (such as email, slack.com, flock.com, pushover.net, pushbullet.com, telegram.org, twilio.com, messagebird.com, kavenegar.com)
- Extensible - you can monitor anything you can get a metric for, using its Plugin API (anything can be a netdata plugin: BASH, python, perl, node.js, java, Go, ruby, etc.)
- Embeddable - it can run anywhere a Linux kernel runs (even IoT) and its charts can be embedded on your web pages too
- Customizable - custom dashboards can be built using simple HTML (no javascript necessary)
- Zero configuration - auto-detects everything, it can collect up to 5000 metrics per server out of the box
- Zero dependencies - it is even its own web server, for its static web files and its web API
- Zero maintenance - you just run it, it does the rest
- Scales to infinity - requiring minimal central resources
- Several operating modes - autonomous host monitoring, headless data collector, forwarding proxy, store and forward proxy, central multi-host monitoring, in all possible configurations. Each node may have a different metrics retention policy and run with or without health monitoring.
- Time-series back-ends supported - can archive its metrics on `graphite`, `opentsdb`, `prometheus`, json document DBs, in the same or lower detail (lower: to prevent it from congesting these servers due to the amount of data collected)
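The web API mentioned in the list above can be queried by any HTTP client. The sketch below only builds a query URL, so it runs without a netdata agent. It assumes the agent listens on port 19999 and exposes a `/api/v1/data` endpoint that accepts `chart`, `after`, `points` and `format` parameters; none of these details come from this README, and the helper name `build_data_url` is hypothetical, so check the API documentation of your netdata version before relying on it.

```python
from urllib.parse import urlencode


def build_data_url(host, chart, seconds=60, points=60, port=19999):
    """Return a URL asking for the last `seconds` seconds of `chart` as JSON.

    Assumption: port 19999 and the /api/v1/data endpoint match your agent.
    """
    query = urlencode({
        "chart": chart,
        "after": -seconds,   # negative values are assumed to mean "relative to now"
        "points": points,
        "format": "json",
    })
    return f"http://{host}:{port}/api/v1/data?{query}"


if __name__ == "__main__":
    print(build_data_url("localhost", "system.cpu"))
```

The resulting URL can be passed to `curl`, a browser, or `urllib.request.urlopen` on a machine where an agent is running.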
netdata collects several thousands of metrics per device.
All these metrics are collected and visualized in real-time.
Almost all metrics are auto-detected, without any configuration.
This is a list of what it currently monitors:
- CPU - usage, interrupts, softirqs, frequency, total and per core, CPU states
- Memory - RAM, swap and kernel memory usage, KSM (Kernel Samepage Merging), NUMA
- Disks - per disk: I/O, operations, backlog, utilization, space, software RAID (md)
- Network interfaces - per interface: bandwidth, packets, errors, drops
- IPv4 networking - bandwidth, packets, errors, fragments; tcp: connections, packets, errors, handshake; udp: packets, errors; broadcast: bandwidth, packets; multicast: bandwidth, packets
- IPv6 networking - bandwidth, packets, errors, fragments, ECT; udp: packets, errors; udplite: packets, errors; broadcast: bandwidth; multicast: bandwidth, packets; icmp: messages, errors, echos, router, neighbor, MLDv2, group membership, break down by type
- Interprocess Communication (IPC) - such as semaphores and semaphore arrays
- netfilter / iptables Linux firewall - connections, connection tracker events, errors
- Linux DDoS protection - SYNPROXY metrics
- fping latencies - for any number of hosts, showing latency, packets and packet loss
- Processes - running, blocked, forks, active
- Entropy - random numbers pool, used in cryptography
- NFS file servers and clients - NFS v2, v3, v4: I/O, cache, read ahead, RPC calls
- Network QoS - the only tool that visualizes network `tc` classes in realtime
- Linux Control Groups - containers: systemd, lxc, docker
- Applications - by grouping the process tree and reporting CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets, per group
- Users and User Groups resource usage - by summarizing the process tree per user and group, reporting: CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets
- Apache and lighttpd web servers - `mod-status` (v2.2, v2.4) and cache log statistics, for multiple servers
- Nginx web servers - `stub-status`, for multiple servers
- Tomcat - accesses, threads, free memory, volume
- web server log files - extracting web server performance metrics in real-time and applying several health checks
- MySQL databases - multiple servers, each showing: bandwidth, queries/s, handlers, locks, issues, tmp operations, connections, binlog metrics, threads, innodb metrics, and more
- Postgres databases - multiple servers, each showing: per database statistics (connections, tuples read - written - returned, transactions, locks), backend processes, indexes, tables, write ahead, background writer and more
- Redis databases - multiple servers, each showing: operations, hit rate, memory, keys, clients, slaves
- couchdb - reads/writes, request methods, status codes, tasks, replication, per-db, etc.
- mongodb - operations, clients, transactions, cursors, connections, asserts, locks, etc.
- memcached databases - multiple servers, each showing: bandwidth, connections, items
- elasticsearch - search and index performance, latency, timings, cluster statistics, threads statistics, etc.
- ISC Bind name servers - multiple servers, each showing: clients, requests, queries, updates, failures and several per view metrics
- NSD name servers - queries, zones, protocols, query types, transfers, etc.
- PowerDNS - queries, answers, cache, latency, etc.
- Postfix email servers - message queue (entries, size)
- exim email servers - message queue (emails queued)
- Dovecot POP3/IMAP servers
- ISC dhcpd - pools utilization, leases, etc.
- IPFS - bandwidth, peers
- Squid proxy servers - multiple servers, each showing: clients bandwidth and requests, servers bandwidth and requests
- HAproxy - bandwidth, sessions, backends, etc.
- varnish - threads, sessions, hits, objects, backends, etc.
- OpenVPN - status per tunnel
- Hardware sensors - `lm_sensors` and `IPMI`: temperature, voltage, fans, power, humidity
- NUT and APC UPSes - load, charge, battery voltage, temperature, utility metrics, output metrics
- PHP-FPM - multiple instances, each reporting connections, requests, performance
- hddtemp - disk temperatures
- smartd - disk S.M.A.R.T. values
- SNMP devices - can be monitored too (although you will need to configure these)
- chrony - frequencies, offsets, delays, etc.
- beanstalkd - global and per tube monitoring
- ceph - OSD usage, Pool usage, number of objects, etc.
And you can extend it, by writing plugins that collect data from any source, using any computer language.
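As an illustration of how small such a plugin can be, here is a minimal sketch in Python. It assumes the external plugin text protocol, in which a plugin writes `CHART`, `DIMENSION`, `BEGIN`, `SET` and `END` lines to its standard output. That protocol is not described in this README, so verify the keywords and field order against the plugin documentation of your netdata version. The chart `example.random`, the dimension `value` and both helper functions are hypothetical names chosen for this example.

```python
import random


def chart_definition(chart_id, title, units, dimension):
    """Return the lines that declare one chart with a single dimension.

    Assumed layout: CHART type.id name title units / DIMENSION id name algorithm
    multiplier divisor. Empty names ('') are assumed to fall back to the ids.
    """
    return [
        f"CHART {chart_id} '' '{title}' '{units}'",
        f"DIMENSION {dimension} '' absolute 1 1",
    ]


def chart_update(chart_id, dimension, value):
    """Return the lines that report one collected value for the chart."""
    return [
        f"BEGIN {chart_id}",
        f"SET {dimension} = {value}",
        "END",
    ]


if __name__ == "__main__":
    # Declare the chart once, then report a single sample.
    lines = chart_definition("example.random", "A random number", "value", "value")
    lines += chart_update("example.random", "value", random.randint(0, 100))
    for line in lines:
        # Flush each line so a reader on the other end of a pipe sees it immediately.
        print(line, flush=True)
```

A real plugin would repeat the `chart_update` step in a loop, once per collection interval, instead of emitting a single sample and exiting.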
Use our automatic installer to build and install it on your system.
It should run on any Linux system (including IoT). It has been tested on:
- Alpine
- Arch Linux
- CentOS
- Debian
- Fedora
- Gentoo
- openSUSE
- PLD Linux
- Red Hat Enterprise Linux
- SUSE
- Ubuntu
Check the netdata wiki.
netdata is GPLv3+.
Netdata re-distributes other open-source tools and libraries. Please check the third party licenses.