Set only one value
brancz opened this issue · 5 comments
pprof expects the same amount of values as there are SampleType
s. The agent currently sets one value for each timestamp and then appends to SampleType (with a different sample type and unit than the original one).
I understand this is done because a timestamp can occur multiple times (even if the probability is very low), and if there are multiple occurrences then the value is >1.
I would propose that instead of optimizing for the rare case, the value is set to the aggregate, as it is typical in pprof, and if the same timestamp occurs multiple times, then it just occurs multiple times in the timestamps array. This solution also has the added benefit that it would make the embedded profile actually pprof compatible (as in if interpreted by the pprof toolchain it will actually work as expected).
There is actually a difference in parca's check vs. this agents OTel Profile integration. In the former, the length of each Sample.value
needs to be the exact length of Profile.SampleType
. In the later (this agent), the sum of all Sample.value
in a Profile
equals the length of Profile.SampleType
. Consequently, each message Profile
of this agent holds n
messages of type Sample
in Profile.sample
. This is not an optimization for the timestamps and should not be mixed.
My first impression is, that there is no simple answer.
len(Sample.Value)
can only be len(Profile.SampleType)
if message Profile
holds exactly only a single message of type Sample
in Profile.Sample
. If there has to be a dedicated message Profile
for each message Sample
, then Profile.StringTable
can not be shared between different samples, which will lead to a noticeable overhead.
Sounds to me like we should solve this at the protocol level by adding end indexes into the timestamps array per SampleType. If len(timestamps) > 0
then len(timestampEnds)
must equal len(profile.SampleType)
.
Hmm, I'm not sure that I get the problem that you are describing here @florianl. Perhaps I'm misunderstanding something, but I parsed @brancz proposal in the OP as:
- Always put single entry into
profile.SampleType
- Always put single value of
1
intosample.Value
- If actual count is higher than 1 within a single timestamp, repeat the timestamp
count
times
This seems like a pretty reasonable suggestion to me. It's incredibly rare to have more than one sample per nanosecond timestamp. This may be inefficient for other impls of the proto that routinely have values that are a lot higher than 1 (i.e. if one would store "CPU cycles spent" instead of a sample counter), but at least for our agent it should be fine?
I was thinking about the tie between Sample.Value
and Sample.TimestampsUnixNano
as a one to one connection. And so on multi-core systems with a high parallel workload I thought this would conflict.
If actual count is higher than 1 within a single timestamp, repeat the timestamp count times
Reading message Sample
and Sample.TimestampsUnixNano
again, it looks to me as this is fine with the specification. I will update #77 accordingly.
I actually failed to make this connection, but yes the value would already be the counter that tells us which of the timestamps belong to which sample type. It doesn’t need a protocol change. I like it!