open-telemetry/opentelemetry-ebpf-profiler

Set only one value

brancz opened this issue · 5 comments

pprof expects the same amount of values as there are SampleTypes. The agent currently sets one value for each timestamp and then appends to SampleType (with a different sample type and unit than the original one).

I understand this is done because a timestamp can occur multiple times (even if the probability is very low), and if there are multiple occurrences then the value is >1.

I would propose that instead of optimizing for the rare case, the value is set to the aggregate, as it is typical in pprof, and if the same timestamp occurs multiple times, then it just occurs multiple times in the timestamps array. This solution also has the added benefit that it would make the embedded profile actually pprof compatible (as in if interpreted by the pprof toolchain it will actually work as expected).

There is actually a difference in parca's check vs. this agents OTel Profile integration. In the former, the length of each Sample.value needs to be the exact length of Profile.SampleType. In the later (this agent), the sum of all Sample.value in a Profile equals the length of Profile.SampleType. Consequently, each message Profile of this agent holds n messages of type Sample in Profile.sample. This is not an optimization for the timestamps and should not be mixed.

My first impression is, that there is no simple answer.

len(Sample.Value) can only be len(Profile.SampleType) if message Profile holds exactly only a single message of type Sample in Profile.Sample. If there has to be a dedicated message Profile for each message Sample, then Profile.StringTable can not be shared between different samples, which will lead to a noticeable overhead.

Sounds to me like we should solve this at the protocol level by adding end indexes into the timestamps array per SampleType. If len(timestamps) > 0 then len(timestampEnds) must equal len(profile.SampleType).

Hmm, I'm not sure that I get the problem that you are describing here @florianl. Perhaps I'm misunderstanding something, but I parsed @brancz proposal in the OP as:

  • Always put single entry into profile.SampleType
  • Always put single value of 1 into sample.Value
  • If actual count is higher than 1 within a single timestamp, repeat the timestamp count times

This seems like a pretty reasonable suggestion to me. It's incredibly rare to have more than one sample per nanosecond timestamp. This may be inefficient for other impls of the proto that routinely have values that are a lot higher than 1 (i.e. if one would store "CPU cycles spent" instead of a sample counter), but at least for our agent it should be fine?

I was thinking about the tie between Sample.Value and Sample.TimestampsUnixNano as a one to one connection. And so on multi-core systems with a high parallel workload I thought this would conflict.

If actual count is higher than 1 within a single timestamp, repeat the timestamp count times

Reading message Sample and Sample.TimestampsUnixNano again, it looks to me as this is fine with the specification. I will update #77 accordingly.

I actually failed to make this connection, but yes the value would already be the counter that tells us which of the timestamps belong to which sample type. It doesn’t need a protocol change. I like it!