Memory Limiter does not obey documented processor behaviour when used in multiple pipelines
Describe the bug
Recently my team ran into an issue where the `memory_limiter` processor did not behave as expected when it was referenced in multiple pipelines. We believe this is because it does not actually follow the documented behaviour for processors in this situation.
Specifically, as per this documentation, when the same processor is referenced in multiple pipelines, each pipeline gets its own independent copy of that processor -- the processor is not "shared" across pipelines.
The same name of the processor can be referenced in the processors key of multiple pipelines. In this case, the same configuration is used for each of these processors, but each pipeline always gets its own instance of the processor. Each of these processors has its own state, and the processors are never shared between pipelines. For example, if batch processor is used in several pipelines, each pipeline has its own batch processor.
Based on this, when we referenced the same `memory_limiter` processor in multiple pipelines (e.g. A and B), we expected:
- The `memory_limiter` processor in pipeline A would only examine the memory used by pipeline A, and likewise for B.
- If pipeline A's memory usage went above the limit defined in the `memory_limiter` processor, only pipeline A would be halted -- pipeline B would not be impacted.
However, what we actually saw was that, as soon as pipeline A's memory usage breached the limit defined in the `memory_limiter` processor, both pipelines A and B were halted. This suggests that, contrary to what the documentation says should be the case, the `memory_limiter` processor is "shared" across pipelines -- i.e. the `memory_limiter` processor examines and limits the total memory usage of all the pipelines, not just its own individual pipeline. (Presumably this is because it measures the memory usage of the whole collector process, which cannot be attributed to any single pipeline.)
This issue comment also implies that the `memory_limiter` does not obey the documented processor behaviour:
I also still confused why it is a processor :) you can only define once per instance right? it applies to all pipelines (or the least defined one wins for all pipelines)
Steps to reproduce
Configure an agent with one `memory_limiter` processor that is referenced in two different pipelines: a logs pipeline and a metrics pipeline. Generate a large volume of logs so that only the memory used by the logs pipeline increases until it breaches the limit defined in the `memory_limiter` processor.
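For reference, a minimal config sketch along these lines (the receiver/exporter choices and the limit values here are illustrative, not our exact config):

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # A single memory_limiter definition, referenced by both pipelines below.
  # Limit values are illustrative.
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128

exporters:
  debug:

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [debug]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [debug]
```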
What did you expect to see?
Since only the logs pipeline's memory usage went over the limit defined in the `memory_limiter` processor, only the logs pipeline should be halted. The metrics pipeline should be unaffected and should continue sending metrics.
What did you see instead?
Both the logs pipeline and metrics pipeline got halted, even though only the logs pipeline's memory usage was too high. We know the metrics pipeline got halted by the `memory_limiter` processor because, in the agent logs, the metrics pipeline generated an error message that said "data refused due to high memory usage", which is also present in the code for the `memory_limiter` processor.
What version did you use?
OTEL agent v0.102.1
Environment
Kubernetes
Suggested Solution
1. One solution is to refactor the `memory_limiter` processor so it obeys the documented processor behaviour. However, I don't know how difficult this would be, and I'm also not sure whether the wider community would want its current behaviour to change.
2. As I said, perhaps the current behaviour of the `memory_limiter` processor is actually desired. If that is the case, then the real "bug" here is just that its behaviour in this situation is not clearly documented. Regardless of whether or not we implement solution 1, I think we should at least document this anomalous behaviour of the `memory_limiter` processor to avoid such confusion in the future.
Specifically:
2a. In the general processor documentation, clearly note that the `memory_limiter` processor is an exception and behaves differently.
2b. The `memory_limiter` processor documentation currently only mentions its behaviour when referenced in a single pipeline. Instead, clearly document (with examples) how it behaves when referenced in multiple pipelines. In particular, it should call out that it does not behave as per the current processor documentation.
Workaround?
Assuming the `memory_limiter` processor's current behaviour is not changed any time soon, is there any suggested workaround to get our desired behaviour (where, if the logs pipeline's memory usage goes too high, only the logs pipeline is halted)?
For example, would defining two different `memory_limiter` processors (e.g. `memory_limiter/logs` and `memory_limiter/metrics`), such that each pipeline gets a different `memory_limiter` processor, solve this issue? Or would it make no difference since, regardless of what they're named, they would both be examining the total memory usage of all the pipelines, not just their own pipeline, which means they would always be impacted by the other pipeline? A sketch of that configuration is shown below.