Integration of the Functional Geometric Monitoring (FGM) method in the Apache Flink platform
Functional Geometric Monitoring is a technique that can be applied to any monitoring problem in order to perform distributed and scalable monitoring with minimal communication cost.The FGM method is a method that is independent of the monitoring problem, to achieve this the method uses a problem-specific family of functions termed safe functions.Finally, the FGM method can be naturally adapted under adverse conditions of the monitoring problem such as very tight monitoring bounds and the presence of skew in the distribution of data among the distributed nodes.
The project structure was organized as follow:
- Worker
- Worker logic
- Worker structure
- Coordinator
- Coordinator logic
- Coordinator structure
- State
- Coordinator state
- Worker State
- Datatypes
- Vector
- Sketches
- Count-Min
- AGMS
- SafeZone
- Job
- JobIteration
- JobKafka
The two main components of the architecture are the Workers and the Coordinator. Each of these operators is a KeyedCoProcess operator. In particular, the Workers operator has two inputs, the first refers to the Input source and the second to the Feedback source that contains the control messages from the Coordinator. Respectively the Coordinator has two inputs, the first refers to the control messages from the Coordinator, and the second to the user-posed queries(Query source). Each of these operators has a side-output. The side-outputs in this case act as a logging mechanism that records metrics/information about the system such as the communication cost, throughput, latency, and back-pressure. Regarding the implementation of the feedback loop, there are two approaches.The first implementation uses as feedback loop a Kafka topic that acts as buffer between the Workers and the Coordinator.The second implementation uses as feedback loop the build-in operator of Iterative stream.
Depending on the monitoring problem all you need to do is configure the Vector and SafeZone class.