This project implements streaming-telemetry-based performance monitoring (PM) using NETCONF notifications (RFC 5277). Conventional "all-you-can-eat" streaming telemetry (available since 2014) has seen increasing adoption and is steadily replacing SNMP-based monitoring. Streaming telemetry is now available on several networking platforms (routing, optical transport) that use gRPC as the streaming wire protocol. Open interface definitions such as gNMI introduce gRPC service definitions that standardize a common set of telemetry operations across different vendor implementations.
In this project, we showcase:
- A practical extension called threshold-based streaming telemetry, which extends conventional streaming telemetry. In gRPC/gNMI telemetry, clients can specify (as part of subscription creation) a `sample_interval`, which sets how often the network element should emit the PM data.
  - One issue with this mechanism is that the streaming frequency is fixed regardless of the underlying system state.
  - With threshold-based streaming telemetry, the network element (NE) adaptively changes the streaming rate (faster or slower) depending on the context of the parameter being monitored. For example, in our demonstration, we automatically adapt the streaming frequency of the optical NE controller's CPU utilization PM parameter based on the variance of the system load average.
- We use NETCONF as the streaming wire protocol, as opposed to gRPC.
  - NETCONF is widely used as a configuration protocol and is supported on most networking platforms (from routing to optical transport).
  - NETCONF also supports asynchronous notifications (RFC 5277), which pre-date gRPC. Clients can use the native NETCONF `<create-subscription>` RPC to subscribe to event streams of interest. The NE can support one or more streams on which data (PM, alarms, events) is published.
  - NETCONF typically operates over SSH, which provides security (similar to TLS in the case of gRPC).
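The threshold-based behaviour described in the list above can be illustrated with a minimal Python sketch. This is not the project's agent code: the class name `AdaptiveInterval`, the window size, the variance threshold, and the fast/slow intervals are all hypothetical values chosen for illustration. It only shows the general idea of choosing a shorter streaming interval when recent load-average samples vary a lot, and a longer one when they are steady.

```python
from collections import deque
from statistics import pvariance


class AdaptiveInterval:
    """Choose a streaming interval from the variance of recent load samples.

    All defaults below are illustrative assumptions, not values from the demo.
    """

    def __init__(self, fast_s=1.0, slow_s=30.0, variance_threshold=0.05, window=10):
        self.fast_s = fast_s                      # interval used while load is volatile
        self.slow_s = slow_s                      # interval used while load is steady
        self.variance_threshold = variance_threshold
        self.samples = deque(maxlen=window)       # sliding window of recent samples

    def update(self, load_avg):
        """Record one load-average sample and return the next interval in seconds."""
        self.samples.append(load_avg)
        if len(self.samples) < 2:
            # Not enough history to estimate variance yet: stream slowly.
            return self.slow_s
        variance = pvariance(self.samples)
        return self.fast_s if variance > self.variance_threshold else self.slow_s


if __name__ == "__main__":
    policy = AdaptiveInterval(window=4)
    for load in (0.5, 0.5, 3.0, 3.0, 3.0, 3.0):
        print(load, policy.update(load))
```

A streaming agent could call `update()` once per sample and sleep for the returned interval before emitting the next notification; a real agent would likely add hysteresis so the rate does not flap near the threshold.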
This project uses Cisco/Tail-F ConfD as the NETCONF stack to implement the threshold-based streaming notifications. We use OpenConfig YANG models (with extensions) to represent the PM data. Specifically, we stream the following operating-system-level PM parameters of the optical NE controller:
- CPU Utilization (reference)
- Memory Utilization (reference)
- System Load Averages (YANG extension - OpenConfig doesn't support load averages)
- Per-process Statistics (reference)
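As a rough illustration of where such operating-system-level values can come from, the sketch below samples load averages and memory utilization on Linux using only the Python standard library. It is an assumption-laden example, not the project's collector: the helper names `parse_meminfo`, `memory_utilization_percent`, and `sample` are hypothetical, and it assumes a kernel whose `/proc/meminfo` exposes `MemAvailable`.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field_name: integer_value} dict."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # values are mostly in kB
    return fields


def memory_utilization_percent(meminfo):
    """Memory in use as a percentage, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]  # assumption: field is present
    return 100.0 * (total - available) / total


def sample():
    """Take one sample of load averages and memory utilization (Linux only)."""
    with open("/proc/meminfo") as f:
        meminfo = parse_meminfo(f.read())
    load1, load5, load15 = os.getloadavg()
    return {
        "load_1min": load1,
        "load_5min": load5,
        "load_15min": load15,
        "memory_utilization_percent": memory_utilization_percent(meminfo),
    }


if __name__ == "__main__":
    print(sample())
```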
Threshold-based streaming of each of the above PM parameters is performed independently by a separate OS process (on the optical NE). The ConfD stack runs as a separate set of processes implementing the NETCONF protocol. Using ConfD's IPC mechanism (`libconfd.so`), each PM streaming process connects to ConfD and writes data to the NETCONF operational data store (and also publishes it to a dedicated NETCONF notification stream), where it is then available to telemetry clients. The ConfD stack and the streaming processes all run on the optical NE's operating system.
This project also includes a Python ncclient-based NETCONF client implementation that subscribes to the threshold-based PM telemetry data from the optical NE.
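For orientation, a subscription with ncclient might look roughly like the sketch below. It is not the client shipped with this project: the function names, the default port, and the `stream_name` argument are placeholders for illustration (the actual stream name depends on how the NE is configured), and `hostkey_verify=False` is only suitable for a lab setup. The `event_time` helper uses the standard RFC 5277 notification namespace to pull out the `<eventTime>` element.

```python
import xml.etree.ElementTree as ET

# Namespace of the RFC 5277 <notification> envelope.
NOTIF_NS = "urn:ietf:params:xml:ns:netconf:notification:1.0"


def event_time(notification_xml):
    """Return the <eventTime> text of an RFC 5277 notification, or None if absent."""
    root = ET.fromstring(notification_xml)
    node = root.find(f"{{{NOTIF_NS}}}eventTime")
    return node.text if node is not None else None


def stream_notifications(host, username, password, stream_name, port=830, count=5):
    """Subscribe to a notification stream and print up to `count` notifications.

    `stream_name` and `port` are placeholders; adjust them to the NE's setup.
    """
    from ncclient import manager  # third-party: pip install ncclient

    with manager.connect(
        host=host,
        port=port,
        username=username,
        password=password,
        hostkey_verify=False,  # lab use only; verify host keys in production
    ) as m:
        # Sends the <create-subscription> RPC for the named stream.
        m.create_subscription(stream_name=stream_name)
        for _ in range(count):
            notif = m.take_notification(block=True, timeout=60)
            if notif is None:  # nothing arrived within the timeout
                break
            print(event_time(notif.notification_xml))
            print(notif.notification_xml)
```

Calling `stream_notifications("192.0.2.10", "admin", "admin", "my-pm-stream")` (all example values) would print each notification's event time followed by its raw XML.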
This project was done as a collaboration between Infinera and Oracle Cloud Infrastructure (OCI), as part of the OFC 2020 Demo Zone:
- Abhinava Sadasivarao, Sharfuddin Syed, Deepak Panda, Paulo Gomes, Rajan Rao, Jonathan Buset, Loukas Paraschis, Jag Brar, and Kannan Raj, "Demonstration of Extensible Threshold-based Streaming Telemetry for Open DWDM Analytics and Verification", Optical Fiber Communication Conference (OFC) 2020 (DOI).
- The demonstration was done on Infinera's XT-3300 optical transponder. The NETCONF threshold-based streaming applications were run as "software agents" on the host XT-3300 NE operating system.
- The poster provides additional details on the actual demonstration setup, including the data collection and visualization (using Prometheus and Grafana).
- To download and install ConfD, please visit this page.
  - Although the link above refers to `confd-basic`, we have tested our streaming agents with ConfD Premium, and they work just as well.
- ConfD requires OpenSSL's libcrypto, specifically `libcrypto.so.1.0.0`. Newer versions of libcrypto have historically not always been compatible with ConfD. Please refer to the ConfD user guide for details.
  - Installation of libcrypto is outside the scope of this guide. Refer to the documentation for your operating system distribution (preferably Linux) for details.
  - If using Linux, popular distributions such as Debian can be used to obtain the `libcrypto.so.1.0.0` library.
Abhinava Sadasivarao, (c) Infinera Corporation, 2020