[FEATURE] Expose Pulsar-Client Metrics with Prometheus
nlu90 opened this issue · 11 comments
@nlu90, checking in on this. When we last spoke, you indicated June as a potential month for this release. Are you still confident in that? Please let me know if there is anything we can do to help.
I spoke with Neng and the work on this will be delayed.
@mboyanna-sn @nlu90 can you please advise if this is still something that will be done given the change we talked about?
@martijngonlag for any feature requests like this, going forward please create an issue in the feature-request repository instead.
@martijngonlag ah, I see this is an engineering-led feature, which is why it's here: so engineering can track it. I think the right thing would have been for it to have a counterpart in feature-request (@sara-hannigan could you help us get organized here to ensure there's a counterpart in the feature-request repo?).
@nlu90 given this is in the pulsar-spark repo, did you refer to this feature in the engineering OKRs as the Spark metrics tracking?
@martijngonlag You can answer yes to the customer: this is part of Q3. I just confirmed with Neng.
@mboyanna-sn Checking in to see if this is still scheduled for Q3.
Key metrics I am interested in seeing (some relate directly to Pulsar and some are more application-oriented):
- How big the backlog is (the delta between the current "pointer" and the "head" of the topic)
- Any systemic failures, e.g. errors connecting to Pulsar, or the pointer no longer referring to a position in the topic (e.g. because the retention policy was too short and the data had already been garbage collected when we recreated the subscription)
- Error counts / retries when processing
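To make the list above concrete, here is a minimal sketch of how these metrics might be tracked and rendered in the Prometheus text exposition format. Everything in it is an assumption for illustration: the `ConsumerMetrics` class, the `pulsar_spark_*` metric names, and the use of plain integers for topic positions are hypothetical and are not part of the pulsar-spark connector or the Pulsar client API.

```python
from dataclasses import dataclass


@dataclass
class ConsumerMetrics:
    """Hypothetical in-process metrics for one Pulsar subscription."""

    topic: str
    subscription: str
    backlog_messages: int = 0    # delta between the head and the current pointer
    connection_errors: int = 0   # failures connecting to Pulsar
    position_not_found: int = 0  # pointer no longer refers to retained data
    processing_errors: int = 0   # application-level processing failures
    processing_retries: int = 0  # application-level retries

    def set_backlog(self, head_position: int, current_position: int) -> None:
        # Assumption: positions are modeled as plain integers for simplicity.
        self.backlog_messages = max(head_position - current_position, 0)

    def render(self) -> str:
        """Render all samples in the Prometheus text exposition format."""
        labels = f'topic="{self.topic}",subscription="{self.subscription}"'
        samples = [
            ("pulsar_spark_backlog_messages", "gauge", self.backlog_messages),
            ("pulsar_spark_connection_errors_total", "counter", self.connection_errors),
            ("pulsar_spark_position_not_found_total", "counter", self.position_not_found),
            ("pulsar_spark_processing_errors_total", "counter", self.processing_errors),
            ("pulsar_spark_processing_retries_total", "counter", self.processing_retries),
        ]
        lines = []
        for name, kind, value in samples:
            lines.append(f"# TYPE {name} {kind}")
            lines.append(f"{name}{{{labels}}} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metrics = ConsumerMetrics(topic="events", subscription="spark-job")
    metrics.set_backlog(head_position=120, current_position=100)
    metrics.processing_errors += 2
    print(metrics.render(), end="")
```

The backlog is modeled as a gauge because it can go up or down, while the error and retry counts are counters that only increase. In a real job the rendered text would be served over HTTP for Prometheus to scrape, most likely through an existing Prometheus client library rather than hand-built strings.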
@frankjkelly Thanks for your comments. I was looking for a way to expose these metrics.
FYI, we have ended our usage of Spark with Pulsar, so this is no longer a priority for us. Thanks!