An all-in-one, open-source, cloud-native monitoring system
Out of the box, it integrates data collection, visualization, and alerting
We recommend upgrading your Prometheus + Alertmanager + Grafana combination to Nightingale!
- Out-of-the-box
- Supports multiple deployment methods such as Docker, Helm Chart, and cloud services, integrates data collection, monitoring, and alerting into one system, and comes with various monitoring dashboards, quick views, and alert rule templates. It greatly reduces the construction cost, learning cost, and usage cost of cloud-native monitoring systems.
- Professional Alerting
- Provides visual alert configuration and management, supports various alert rules, offers the ability to configure silence and subscription rules, supports multiple alert delivery channels, and has features such as alert self-healing and event management.
- Cloud-Native
- Quickly builds an enterprise-level cloud-native monitoring system through a turnkey approach, supports multiple collectors such as Categraf, Telegraf, and Grafana-agent, supports multiple data sources such as Prometheus, VictoriaMetrics, M3DB, Elasticsearch, and Jaeger, and can import Grafana dashboards. It integrates seamlessly with the cloud-native ecosystem.
- High Performance and High Availability
- Thanks to its multi-data-source management engine, its architecture design, and its use of a high-performance time-series database, Nightingale can handle data collection, storage, and alert analysis for billions of time series, which significantly reduces costs.
- Nightingale components can be scaled horizontally with no single point of failure. It has been deployed in thousands of enterprises and proven in demanding production environments. Many leading Internet companies run Nightingale on clusters of hundreds of nodes, processing billions of time series.
- Flexible Extension and Centralized Management
- Nightingale can be deployed on a 1-core, 1 GB cloud host, deployed in a cluster of hundreds of machines, or run in Kubernetes. Time-series databases, alert engines, and other components can also be distributed across data centers and regions, balancing edge deployment with centralized management. This solves the problems of data fragmentation and the lack of a unified view.
If you are using Prometheus and one or more of the following scenarios apply to you, we recommend upgrading to Nightingale:
- Prometheus, Alertmanager, Grafana, and other systems are fragmented, lack a unified view, and cannot be used out of the box;
- Managing Prometheus and Alertmanager by editing configuration files has a steep learning curve and makes collaboration difficult;
- Your data volume is too large for your Prometheus cluster to scale with;
- You run multiple Prometheus clusters in production, with high management and usage costs.
If you are using Zabbix and the following scenarios apply to you, we recommend upgrading to Nightingale:
- You monitor too much data and want a more scalable solution;
- The learning curve is steep, and you want multiple people and teams to collaborate more efficiently;
- Your microservice and cloud-native architectures have variable monitoring data lifecycles and high-cardinality monitoring data, which do not fit the Zabbix data model well.
If you are using Open-Falcon, we recommend upgrading to Nightingale:
- For more information about Open-Falcon and Nightingale, please read Ten features and trends of cloud-native monitoring.
Nightingale can receive monitoring data reported by various collectors (such as Categraf, Telegraf, Grafana-agent, and Prometheus) and write it to popular time-series databases (such as Prometheus, M3DB, VictoriaMetrics, Thanos, and TDengine). It provides configuration of alert rules, silence rules, and subscription rules, as well as the ability to view monitoring data. It also provides alert self-healing mechanisms (such as automatically calling a webhook address or executing a script after an alert is triggered), and it can store and manage historical alert events and display them in groups.
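To make the self-healing callback described above more concrete, here is a minimal sketch of a webhook receiver in Python. It is not part of Nightingale. The JSON field names (`rule_name`, `is_recovered`), the `REMEDIATIONS` table, and the listening address are assumptions chosen for illustration, so check the payload your Nightingale version actually sends before adapting it.

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from an alert rule name to a remediation command.
REMEDIATIONS = {
    "disk_usage_high": ["echo", "cleaning temp files"],
}


def choose_action(event):
    """Return the remediation command for a triggered alert, or None.

    The keys "rule_name" and "is_recovered" are assumed field names.
    """
    if not isinstance(event, dict):
        return None
    if event.get("is_recovered"):
        # The alert has recovered, so there is nothing to heal.
        return None
    return REMEDIATIONS.get(event.get("rule_name"))


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = json.loads(self.rfile.read(length) or b"{}")
        except ValueError:
            # Malformed JSON: reject the request.
            self.send_response(400)
            self.end_headers()
            return
        action = choose_action(event)
        if action is not None:
            # Run the remediation command; failures are not fatal here.
            subprocess.run(action, check=False)
        self.send_response(200)
        self.end_headers()


def serve(host="127.0.0.1", port=8080):
    """Start the receiver and block until interrupted."""
    HTTPServer((host, port), AlertHandler).serve_forever()
```

Calling `serve()` starts the receiver. The address you then register as the callback in Nightingale would be whatever host and port you pass here. Keeping the decision logic in `choose_action` separate from the HTTP handler makes it easy to test without opening a socket.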
If a standalone time-series database (such as Prometheus) becomes a performance bottleneck or offers poor disaster recovery, we recommend using VictoriaMetrics. The VictoriaMetrics architecture is relatively simple, performs well, and is easy to deploy and maintain. The architecture diagram is shown above. For more detailed documentation on VictoriaMetrics, please refer to its official website.
We welcome you to participate in the Nightingale open-source project and community in various ways, including but not limited to:
- Adding and improving documentation => n9e.github.io
- Sharing your best practices and experience in using Nightingale monitoring => Article sharing
- Submitting product suggestions => GitHub issue
- Submitting code to make Nightingale monitoring faster, more stable, and easier to use => GitHub pull request
Respecting, recognizing, and recording the work of every contributor is the first guiding principle of the Nightingale open-source community. We encourage well-formed questions, which not only respect the developers' time but also help build up knowledge across the entire community.
- Before asking a question, please first refer to the FAQ
- We use GitHub Discussions as the communication forum. You can search and ask questions here.
- We also recommend that you join our Slack channel to exchange experiences with other Nightingale users.
You can register your usage and share your experience by posting on Who is Using Nightingale.