Duplicate TimeSeries errors from OpenTelemetry Collector
nielm opened this issue · 1 comment
nielm commented
The OpenTelemetry Collector is reporting the following errors in GKE decoupled mode:
textPayload: "2024-03-19T09:47:21.215Z error exporterhelper/common.go:95
Exporting failed. Dropping data.
{
"kind": "exporter",
"data_type": "metrics",
"name": "googlecloud",
"error": "rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: Field timeSeries[6] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.; Field timeSeries[8] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.; Field timeSeries[7] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.; Field timeSeries[5] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.; Field timeSeries[9] had an invalid value: Duplicate TimeSeries encountered. Only one point can be written per TimeSeries per request.\nerror details: name = Unknown desc = total_point_count:10 success_point_count:5 errors:{status:{code:3} point_count:5}",
"dropped_items": 10}"
nielm commented
Some analysis later...
The Scaler instances were occasionally sending metrics to the OpenTelemetry Collector more frequently than the Collector's batching interval.
The Collector does not aggregate these metrics when batching, so it was sending multiple points for the same Scaler metric from the same pod in a single CreateTimeSeries request, which Cloud Monitoring rejects with "Only one point can be written per TimeSeries per request."
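To illustrate why such a request gets rejected, here is a minimal sketch that finds series appearing more than once in a single batch. The metric name, label names and the dict layout are made up for the example, and treating "metric name plus labels" as the series identity is a simplification of what Cloud Monitoring actually compares.

```python
from collections import Counter


def duplicate_series(points):
    """Return the series keys that appear more than once in one request."""
    # Simplified identity: metric name plus its sorted label pairs.
    keys = Counter(
        (p["metric"], tuple(sorted(p["labels"].items()))) for p in points
    )
    return [key for key, count in keys.items() if count > 1]


# Hypothetical batch: the same pod reported the same metric twice
# before the Collector sent its batch.
request = [
    {"metric": "scaler/requests", "labels": {"pod": "scaler-abc"}, "value": 3},
    {"metric": "scaler/requests", "labels": {"pod": "scaler-abc"}, "value": 4},
    {"metric": "scaler/requests", "labels": {"pod": "scaler-xyz"}, "value": 1},
]

print(duplicate_series(request))
```

Any key returned here corresponds to a pair of points that would collide in one CreateTimeSeries call.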
Solution:
- In decoupled Scaler:
  - Do not manually flush the metrics
- In OTEL mode, ensure that the periodic export interval is greater than the batching interval of the OpenTelemetry collector
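The timing argument behind these two points can be sketched with a rough model. It assumes the Collector batches in fixed windows of `batch_timeout` seconds, which is a simplification of the real batch processor, and all the numbers are hypothetical.

```python
from collections import Counter


def windows_with_duplicates(export_times, batch_timeout):
    """Return the batch windows that hold more than one point from a
    single time series, given that series' export timestamps in seconds."""
    # Window index = which batch_timeout-sized interval the export falls in.
    windows = Counter(int(t // batch_timeout) for t in export_times)
    return sorted(w for w, n in windows.items() if n > 1)


BATCH_TIMEOUT = 10  # hypothetical Collector batching interval, in seconds

# Periodic export every 20 s (greater than the batching interval):
# no two points ever share a window.
periodic = [0, 20, 40, 60]

# Same schedule plus one manual flush at t=22: it lands in the same
# window as the periodic export at t=20.
with_manual_flush = [0, 20, 22, 40, 60]

print(windows_with_duplicates(periodic, BATCH_TIMEOUT))           # []
print(windows_with_duplicates(with_manual_flush, BATCH_TIMEOUT))  # [2]
```

In this model, an export interval longer than the batching interval keeps consecutive periodic exports in different windows, while any extra manual flush can put two points in one.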