mailgun/kafka-pixy

question: overhead for remote proxy in case of sync producing

Closed this issue · 2 comments

Hi,
As I understand it, Kafka-Pixy is normally deployed as a client-side proxy on the same host as the application that emits events / messages, e.g. for performance reasons.
However, we intend to configure for at-least-once delivery, which implies synchronous producing. In that case, is it still a significant overhead to host the proxy on the server side (i.e. make it a reverse proxy, horizontally scalable in its own layer) and have the apps connect directly to its gRPC endpoint on the server/Kafka side? That would offload some work from our clients.
Thanks,
Nicu

First of all, at-least-once delivery and synchronous production have nothing to do with the location of Kafka-Pixy (client host or dedicated host). These properties are achieved through Kafka-Pixy configuration and API parameters.
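To illustrate that synchronous production is selected per request rather than by deployment location, here is a minimal sketch that builds a produce URL for the HTTP API. The `/topics/<topic>/messages` path and the bare `sync` query flag follow my reading of the Kafka-Pixy README; the host, port, topic, and the helper name `build_produce_url` are assumptions for illustration, so check them against the version you run.

```python
from urllib.parse import quote, urlencode


def build_produce_url(base_url, topic, key=None, sync=True):
    """Build a produce URL (hypothetical helper; path and `sync` flag assumed from the README)."""
    url = f"{base_url.rstrip('/')}/topics/{quote(topic, safe='')}/messages"
    params = []
    if key is not None:
        # The key is URL-encoded so that arbitrary strings are safe in the query.
        params.append(urlencode({"key": key}))
    if sync:
        # Bare flag: asks the proxy to reply only after the message is handled.
        params.append("sync")
    return url + ("?" + "&".join(params) if params else "")


# The host and port below are placeholders, not a statement about your setup.
print(build_produce_url("http://localhost:19092", "events", key="user-42"))
```

The same URL works whether `base_url` points at a local instance or at a dedicated host; only the network distance changes.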

You can run Kafka-Pixy on dedicated hosts if you want, but keep in mind that the Kafka-Pixy API is single-message-centric: every message is its own request, so each one pays a network round trip, and you should expect higher latency and lower throughput. By how much... I don't really know. I have never run such tests. So if you want to know for sure, you need to run those tests yourself. And if you do, I would appreciate you sharing your results :).
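If you do run such a test, a rough harness along these lines may be enough to compare a local instance against a dedicated host. It is only a sketch: `send` is a hypothetical callable that you would replace with a real synchronous produce call, and the `fake_send` stand-in just sleeps so that the example runs without Kafka-Pixy.

```python
import statistics
import time


def measure_latencies(send, count):
    """Call send(i) `count` times, one at a time, and return per-call latencies in seconds."""
    latencies = []
    for i in range(count):
        start = time.perf_counter()
        send(i)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return median, approximate p99, and single-caller throughput for the samples."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    total = sum(ordered)
    return {
        "median_s": statistics.median(ordered),
        "p99_s": ordered[p99_index],
        # One caller with one request in flight completes len/total messages per second.
        "throughput_per_s": len(ordered) / total if total > 0 else float("inf"),
    }


if __name__ == "__main__":
    def fake_send(i):
        # Stand-in for a synchronous produce request (assumption, not a real call).
        time.sleep(0.001)

    print(summarize(measure_latencies(fake_send, 50)))
```

Running it once against each deployment, with the same message size and count, gives a like-for-like view of the extra round-trip cost.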

When running Kafka-Pixy on dedicated hosts, please also keep in mind that, in Kafka terms, each Kafka-Pixy instance is a consumer group member, so each instance gets a share of the topic partitions for exclusive consumption. You therefore need to choose an appropriate number of Kafka-Pixy hosts to handle your load.
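As a back-of-the-envelope aid for sizing, the sketch below shows the arithmetic of splitting partitions evenly across group members. It is not Kafka-Pixy's actual assignment logic, and `partitions_per_member` is a hypothetical helper; it only illustrates that with exclusive consumption, instances beyond the partition count end up with nothing to consume.

```python
def partitions_per_member(partition_count, member_count):
    """Evenly spread partitions over group members; returns per-member partition counts."""
    if member_count <= 0:
        raise ValueError("member_count must be positive")
    base, extra = divmod(partition_count, member_count)
    # The first `extra` members take one additional partition each.
    return [base + 1 if i < extra else base for i in range(member_count)]


print(partitions_per_member(12, 5))  # [3, 3, 2, 2, 2]
print(partitions_per_member(4, 6))   # [1, 1, 1, 1, 0, 0]
```

In the second case two of the six instances would sit idle for that topic, so adding hosts only helps consumption up to the number of partitions.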

Feel free to reopen if you have follow-up questions.