Network performance tests and reporting
A set of nodes participates in frequent, non-overlapping traffic exchanges.
iperf3 is used in server-client pairs. Tests fall into two categories:
- TCP based: the test ends after a specific volume of data has been transferred, regardless of how long it takes.
  (note to self: kill lagged/queued processes; do not let them stack up during slow transfers caused by connection problems or capacity caps)
- UDP based: the test ends when a specific time window elapses, regardless of the volume of data transferred.
Results of both test types are reported into the Whisper DB and presented in a Grafana dashboard, one data point per reported test result.
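The two termination modes map directly onto iperf3 client flags: -n bounds a TCP run by bytes transferred, while -u with -t bounds a UDP run by time. A minimal sketch of building those invocations (the default volume, duration, and bandwidth values are illustrative, not the values used in production):

```python
import shlex

def iperf3_cmd(server: str, *, udp: bool = False,
               volume: str = "1G", seconds: int = 30,
               bandwidth: str = "100M") -> list[str]:
    """Build an iperf3 client command for the two test types.
    -J asks iperf3 for JSON output, which is easier to parse for reporting."""
    cmd = ["iperf3", "-c", server, "-J"]
    if udp:
        # UDP: run for a fixed time window at a target bandwidth
        cmd += ["-u", "-t", str(seconds), "-b", bandwidth]
    else:
        # TCP: transfer a fixed volume, however long it takes
        cmd += ["-n", volume]
    return cmd

print(shlex.join(iperf3_cmd("10.0.0.2")))
# → iperf3 -c 10.0.0.2 -J -n 1G
```

With -J the client prints a single JSON document at the end of the run, so the reporter only needs `json.loads` on the captured stdout.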
Nodes are installed per DC and provisioned with a predefined set of packages. Package requirements for the nodes participating in the tests are minimal:
- puppet/ansible
- mtr
- iperf3
- iptables
- python
For the node aggregating and presenting results:
- grafana
- graphite
- whisper
Reporting test results
Automatic node deployment
Reporting standard and changed paths
Fixing stalled TCP tests
Idea: automatic alerting on unsatisfactory results *
* when a number of consecutive tests fall below a certain threshold, an alert is triggered.
It is yet to be checked whether test nodes should keep track of results themselves and alert based on that.
The alternative is to rely on Grafana's alerting feature (it is challenging there to distinguish consecutive low values from a single one-off).
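If the nodes (or a small reporter daemon) track results themselves, the consecutive-low rule is a few lines of state: count the current streak of below-threshold results and fire only when it reaches N, so a single one-off dip never alerts. A minimal sketch; the class name, parameters, and defaults are illustrative:

```python
class ConsecutiveLowAlert:
    """Fire only after n_consecutive results fall below a throughput
    threshold, resetting the streak on any good result."""

    def __init__(self, threshold_mbps: float, n_consecutive: int = 3):
        self.threshold = threshold_mbps
        self.n = n_consecutive
        self.streak = 0  # current run of below-threshold results

    def observe(self, mbps: float) -> bool:
        """Feed one test result; return True when an alert should fire."""
        if mbps < self.threshold:
            self.streak += 1
        else:
            self.streak = 0
        return self.streak >= self.n
```

Keeping this on the reporting side avoids the Grafana-side difficulty noted above, since the streak logic is explicit instead of encoded in a query over recent datapoints.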