0l-monitoring

Monitoring tools for https://0l.network

License: The Unlicense

This repository provides guides for both monitoring providers and node operators.

  • Monitoring providers [MPs]: any party willing to provide monitoring services for 0L node operators by running monitoring tools such as the Prometheus stack.
  • 0L node operators [OPs]: any party running any type of 0L node (validator, VFN, or fullnode) who wants to monitor their nodes.

Prometheus Stack

Prometheus is an open source application that scrapes real-time metrics to monitor events and can also send real-time alerts.

Grafana is an analytics and visualization tool for building interactive charts and graphs from the data and alerts collected by monitoring tools such as Prometheus.

The 0L diem node exports a set of Prometheus metrics that we would like to collect and use to build Grafana dashboards. These are exported on ports 9101 and 9012. In addition to diem metrics, node operators can choose to expose system metrics such as CPU, memory, and storage using Prometheus Node Exporter.
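On the monitoring provider side, collecting these metrics comes down to adding each operator host as a Prometheus scrape target. The sketch below prints a minimal `scrape_configs` fragment for one host. The job name and the IP address are hypothetical placeholders, and the ports (9100 for Node Exporter, 9101 for the diem node) are the ones used in the operator steps of this guide, so adjust them to whatever the host actually exposes.

```shell
# Emit a Prometheus scrape_configs fragment for one operator host.
# Usage: scrape_config <job-name> <host-ip>
# Both arguments are illustrative; nothing here is specific to 0L tooling.
scrape_config() {
  cat <<EOF
scrape_configs:
  - job_name: "${1}"
    metrics_path: /metrics
    static_configs:
      - targets:
          - "${2}:9100"  # Node Exporter (system metrics)
          - "${2}:9101"  # diem node metrics
EOF
}

# Example with a placeholder job name and a documentation-range IP.
scrape_config "0l-node-example" "203.0.113.10"
```

The printed fragment could be merged into an existing `prometheus.yml`. One job per operator is only one possible layout; a provider might equally group all hosts under a single job and tell them apart with labels.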

Guides on how to set up Prometheus and Grafana instances can be found here:

Node operators can follow the steps below to allow monitoring providers to collect metrics from their hosts.

Modifications to 0L hosts [OPs]

  • Pick your monitoring provider from the list below

  • Open ports 9100-9101 to $PROMETHEUS_STATIC_IP (and probably to your own IP as well)

    Depending on your host and firewall, you might need to enable this in different places: ufw, DigitalOcean Firewall, AWS Security Groups, etc.

  • Install Node Exporter. This assumes you are running Ubuntu.

    sudo apt update
    sudo apt install prometheus-node-exporter

    or use manual setup

  • Confirm these endpoints are working

    • curl http://YOUR-IP:9100/metrics
    • curl http://YOUR-IP:9101/metrics
  • Share your validator account address, host IP(s), and a Discord handle with the monitoring provider
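The firewall and verification steps above can be sketched as a small script. It assumes ufw is the firewall in use and only prints the rule instead of applying it, so you can review it first. `PROMETHEUS_STATIC_IP` and `HOST_IP` are placeholders that default to illustrative values; set them to your provider's static IP and your node's address.

```shell
# Placeholders: override these in the environment before running.
PROMETHEUS_STATIC_IP="${PROMETHEUS_STATIC_IP:-203.0.113.10}"  # documentation-range example IP
HOST_IP="${HOST_IP:-127.0.0.1}"                               # node address to check

# Print (not run) a ufw rule that opens ports 9100-9101 to a single source IP.
ufw_rule() {
  printf 'sudo ufw allow from %s to any port 9100:9101 proto tcp\n' "$1"
}

# Succeed only if the URL answers with a non-error HTTP status within 5 seconds.
check_endpoint() {
  curl --silent --fail --max-time 5 --output /dev/null "$1"
}

ufw_rule "$PROMETHEUS_STATIC_IP"

for port in 9100 9101; do
  if check_endpoint "http://${HOST_IP}:${port}/metrics"; then
    echo "port ${port}: ok"
  else
    echo "port ${port}: unreachable"
  fi
done
```

A check that passes from the node itself does not prove the firewall rule works, so it is worth repeating the `curl` commands from a machine whose IP you have allowed.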

Grafana Dashboards

Example dashboards from Bᴺ 𝕊pace.


Monitoring Providers

  • 1. Bᴺ 𝕊pace

    Prometheus
    Static IP: 85.215.101.127
    
    Grafana
    Url      : https://grafana.openlibra.space
    Auth     : `viewer:viewer` (view only)
    

    Discord: @nourspace#6652


Todo

  • Add specific todos for Prometheus and Grafana setup guides
  • Consider using K8s operators and/or Helm charts to run Prometheus stack
    • Use HTTPS and load balancers
  • Link to and/or integrate other monitoring tools built by the 0L community
    • Enable alerting on Grafana dashboards

Legacy

Some tasks and questions from the HackMD document that need to be integrated into the current todos.

https://hackmd.io/9dxv7ZwYS1yOmBVSjSV2wg

Questions (old)

  • Security: We want to create our own node-exporter config to only send meaningful and safe system metrics.

  • Decentralization: We are running the two instances on our own for now, but we are thinking about how to move this forward so that there is neither a single point of failure nor a single entity hosting everyone's metrics.
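For the security question, one possible starting point is Node Exporter's collector flags: recent releases accept `--collector.disable-defaults` to turn off every default collector, plus `--collector.<name>` to re-enable individual ones. The sketch below builds such an argument string from an allow-list. The helper name and the chosen collectors are illustrative assumptions, not a vetted 0L configuration, and older packaged versions may not support the flag, so check `prometheus-node-exporter --help` first.

```shell
# Build a Node Exporter argument string that disables all default collectors
# and re-enables only the ones passed as arguments.
node_exporter_args() {
  args="--collector.disable-defaults"
  for collector in "$@"; do
    args="${args} --collector.${collector}"
  done
  printf '%s\n' "$args"
}

# Hypothetical allow-list: CPU, memory, filesystem, and load average only.
node_exporter_args cpu meminfo filesystem loadavg
```

On Ubuntu, the resulting string would typically be placed in the `ARGS` variable of `/etc/default/prometheus-node-exporter`, followed by a service restart; verify the location against your installed package.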

Todo (old)