question: idiomatic to use one topic?
Hello,
I'm quite new to Camunda and BPMN, so please forgive me if this sounds stupid. Is it idiomatic, recommended, or even necessary to have one (Kafka) topic for all the messages flowing through the system? I'm hoping that this showcase repo uses only one in order to keep the complexity low, but I would expect dedicated topics between the stages.
One topic for all doesn't seem to me to be a good option from a scalability and performance perspective. Having each listening party on the event bus read, evaluate, and optionally discard a message is costly. The more stages there are, the more event/command messages there will be, and the more events each component/listener needs to consume and process.
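To illustrate what I mean: with a single shared topic, every service ends up with a loop like this (a sketch using plain kafka-clients; the topic name and the JSON type check are made up for illustration):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PaymentEventLoop {

  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "payment");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      // Single shared topic: every service sees every message in the system.
      consumer.subscribe(List.of("flowing-retail"));
      while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
          // Every message must be fetched and inspected, even though most
          // of them are then discarded as irrelevant to this service.
          if (!record.value().contains("\"type\":\"RetrievePaymentCommand\"")) {
            continue; // network and deserialization cost already paid
          }
          handlePayment(record.value());
        }
      }
    }
  }

  private static void handlePayment(String payload) { /* ... */ }
}
```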
At least, this is how I explain why I see a big difference in the per-order processing time when placing 1 order vs. 1000 orders at a time. (Admittedly, the latter use case is not frequent, but it can regularly arise when the "order" application has a temporary outage and needs to catch up after being brought back up.)
I've extended the code a little by measuring the duration of the example's individual stages: https://github.com/xitep/flowing-retail/commits/pete/customization
It's based a few commits behind your current master, but I believe there were no significant changes. You can run the applications via `mvn -f <app> exec:java`, follow the "order" application's log to see the timings per order, and then place, for example, 1000 orders using `curl`:
curl -X PUT 'http://localhost:8090/api/cart/order-many?customerId=123&numOrders=1000'
Maybe I'm misinterpreting something, but it appears that payment, fetching, and shipping goods get more costly the more orders there are in the pipeline, which I would not expect.
Many thanks in advance for any clarifications.
> Is it idiomatic, recommended, or even necessary to have one (Kafka) topic for all the messages flowing through the system? I'm hoping that this showcase repo uses only one in order to keep the complexity low, but I would expect dedicated topics between the stages.
You guessed right, the main reason for this decision was to keep the complexity low.
Thanks for sharing additional information on the performance and timing; I will have a look at it.
Could you elaborate on what you mean by "appears that payment, fetching and shipping goods get more costly the more orders there are in the pipeline which i would not expect"?
I started 1 order:
Order processed: received: 5ms / process-kickoff: 2ms / payment: 18ms / goods: 16ms / ship: 10ms / total: 49ms
And in a later run:
Order processed: received: 6ms / process-kickoff: 1ms / payment: 24ms / goods: 11ms / ship: 11ms / total: 52ms
I started 10 orders and basically all of them look like:
Order processed: received: 153ms / process-kickoff: 1ms / payment: 148ms / goods: 141ms / ship: 99ms / total: 541ms
And in a later run:
Order processed: received: 48ms / process-kickoff: 2ms / payment: 133ms / goods: 118ms / ship: 99ms / total: 398ms
I started 1000 orders and basically all of them look like:
Order processed: received: 21.987ms / process-kickoff: 2ms / payment: 32.041ms / goods: 23.654ms / ship: 15.546ms / total: 93.228ms
And in a later run:
Order processed: received: 14.094ms / process-kickoff: 6ms / payment: 22.530ms / goods: 14.160ms / ship: 11.043ms / total: 61.827ms
So there is nothing I can really read from this - only that my Windows machine got much slower after the recent Windows update (but I already knew that from the news ;-)). But as I run all components on one machine and Camunda uses an in-memory H2 database, it is hard to make any assumptions about real behavior under load anyway. And it isn't surprising that everything slows down a bit when all components are competing for my two processor cores.
Or what exactly did you see?
Many thanks for your response. I can see numbers similar to the ones you posted. My interpretation is twofold: 1) either it's the messaging overhead I mentioned above, or 2) the database access to manage the orders' state is slow and magnifies the problem when the load is higher. I have yet to profile it and to try setting up the example with multiple topics to see the effect.
Yep, some profiling is necessary. And if you want to get a better feeling, I would also switch to an external database (e.g. PostgreSQL) on a separate host. Depending on the target numbers you want to hit, it might even make sense to move services to different hosts, or at least to spawn dedicated instances for every service (e.g. with Kubernetes in a cloud environment). A good question is also why 10 instances are slower than 1 or 1000.
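For illustration, switching the services from the embedded H2 to an external PostgreSQL could be as simple as overriding the datasource in each service's configuration (host, database name, and credentials below are placeholders):

```properties
# application.properties - replace the embedded H2 with an external
# PostgreSQL on a separate host (all values are placeholders)
# (requires the org.postgresql:postgresql JDBC driver on the classpath)
spring.datasource.url=jdbc:postgresql://db-host:5432/flowingretail
spring.datasource.username=flowing
spring.datasource.password=secret
spring.datasource.driver-class-name=org.postgresql.Driver
```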
There's one observation I could make: one of the troubles with the demo is that the "order" application serves as a central hub and is responsible for dispatching all messages, but it uses only one consumer to process all the events. In the case where we place 1000 orders into the topic in a batch, it first has to process all 1000 of these messages before it can even start seeing the other events from the system that need to be dispatched. This can be nicely magnified by placing an artificial "sleep" into the MessageListener#orderPlacedReceived method, roughly as sketched below.
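Something like this (the signature is simplified; the point is only where the delay sits):

```java
// Sketch of the idea: the delay sits on the single listener thread, so
// every other message on the shared topic has to wait behind it.
public class MessageListener {

  public void orderPlacedReceived(String message) throws InterruptedException {
    Thread.sleep(1000); // artificial delay simulating slow order intake
    // ... original handling of the OrderPlacedEvent would continue here ...
  }
}
```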
Here's an attempt to decouple the processing of the OrderPlacedEvents from the rest of the events on the bus by introducing another consumer. I didn't know how to do it the "spring-boot" way, so I just hacked it manually; maybe you can fix that more idiomatically (a sketch of one possible approach follows after the log excerpts below). An excerpt from the orders.log with an artificial 1s delay and 100 orders placed:
original version (without the fix; notice the payment times):
Order processed: received: 1,016ms / process-kickoff: 3ms / payment: 100,606ms / goods: 1,724ms / ship: 796ms / total: 104,145ms
Order processed: received: 2,035ms / process-kickoff: 4ms / payment: 99,597ms / goods: 1,719ms / ship: 795ms / total: 104,150ms
Order processed: received: 3,065ms / process-kickoff: 2ms / payment: 98,581ms / goods: 1,714ms / ship: 794ms / total: 104,156ms
Order processed: received: 4,087ms / process-kickoff: 2ms / payment: 97,571ms / goods: 1,709ms / ship: 794ms / total: 104,163ms
Order processed: received: 5,116ms / process-kickoff: 3ms / payment: 96,552ms / goods: 1,706ms / ship: 792ms / total: 104,169ms
Order processed: received: 6,137ms / process-kickoff: 2ms / payment: 95,542ms / goods: 1,702ms / ship: 793ms / total: 104,176ms
Order processed: received: 7,155ms / process-kickoff: 3ms / payment: 94,535ms / goods: 1,698ms / ship: 792ms / total: 104,183ms
Order processed: received: 8,173ms / process-kickoff: 3ms / payment: 93,526ms / goods: 1,694ms / ship: 792ms / total: 104,188ms
Order processed: received: 9,194ms / process-kickoff: 3ms / payment: 92,516ms / goods: 1,690ms / ship: 791ms / total: 104,194ms
Order processed: received: 10,209ms / process-kickoff: 6ms / payment: 91,509ms / goods: 1,686ms / ship: 790ms / total: 104,200ms
...
new version:
Order processed: received: 1014ms / process-kickoff: 2ms / payment: 18ms / goods: 17ms / ship: 18ms / total: 1069ms
Order processed: received: 2032ms / process-kickoff: 3ms / payment: 13ms / goods: 16ms / ship: 21ms / total: 2085ms
Order processed: received: 3047ms / process-kickoff: 1ms / payment: 12ms / goods: 19ms / ship: 12ms / total: 3091ms
Order processed: received: 4059ms / process-kickoff: 3ms / payment: 14ms / goods: 12ms / ship: 11ms / total: 4099ms
...
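For reference, maybe the more idiomatic "spring-boot" way would look roughly like this (a sketch assuming spring-kafka; the topic name and the naive type check are illustrative):

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

// Two listeners in separate consumer groups each get their own container
// thread and their own offset, so a backlog of OrderPlacedEvents can no
// longer block the messages coming back from payment/inventory/shipping.
@Component
public class DecoupledMessageListener {

  @KafkaListener(topics = "flowing-retail", groupId = "order-placed")
  public void orderPlacedReceived(String message) {
    if (isOrderPlacedEvent(message)) {
      // kick off a new order process instance (possibly slow)
    }
    // all other message types are ignored by this listener
  }

  @KafkaListener(topics = "flowing-retail", groupId = "order-workflow")
  public void workflowMessageReceived(String message) {
    if (!isOrderPlacedEvent(message)) {
      // correlate payment/goods/shipping messages with running processes
    }
  }

  private boolean isOrderPlacedEvent(String message) {
    // naive type check; real code would inspect the deserialized message
    return message.contains("\"type\":\"OrderPlacedEvent\"");
  }
}
```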
Actually, I'm convinced that using the event bus with everybody listening on it is no good, at least from a performance and scalability perspective. It unnecessarily consumes CPU and bandwidth, and it may delay the processing of messages, thus exploiting less of the concurrency potentially available in the overall system. However, I found your article and was left with the impression that you actually advocate a central event bus the way it is implemented in this demo. Maybe I misunderstood, but that leaves me confused about your earlier statement in this ticket that it's done for the sake of simplicity. May I ask you to shed some more light on this for me?
Many thanks in advance.
Hi xitep.
Nice finding - thanks for the feedback and the explanation - that makes sense. And I think it is OK to introduce another consumer in this case. I am actually not a Kafka expert (most customers in this area actually use RabbitMQ or the like, where you definitely have separate topics/queues), so I am not the right guy to comment on best practices for Kafka topics. I do see advantages and disadvantages of having multiple topics, so there might not be one single right way to go; it always depends (e.g. on the load you expect, performance requirements, ...).
Conceptually, it is important in the event-driven world that the event producer does not know its consumers - it has to be the other way round. So the important thing is that topics must be defined by the producing party. In that sense I do not see a problem with defining a "flowing-retail-order-events" or "flowing-retail-payment-events" topic. You would basically need something similar for commands, which behave more like a queue, but would also map to a Kafka topic. With Kafka I would still use a central event bus (so as not to bother with endpoints), but with multiple topics. This is not a contradiction from my point of view.
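For illustration, producer-defined topics could look roughly like this (a sketch assuming spring-kafka; apart from the topic name suggested above, the class and method names are made up):

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

// The order service owns and publishes to its own topic; it does not
// know who consumes it.
@Component
public class OrderEventPublisher {

  private final KafkaTemplate<String, String> kafka;

  public OrderEventPublisher(KafkaTemplate<String, String> kafka) {
    this.kafka = kafka;
  }

  public void publishOrderPlaced(String orderId, String eventJson) {
    kafka.send("flowing-retail-order-events", orderId, eventJson);
  }
}

// A consumer decides which producer-owned topics it cares about and
// subscribes only to those, instead of filtering one shared topic.
@Component
class PaymentEventSubscriber {

  @KafkaListener(topics = "flowing-retail-order-events", groupId = "payment")
  public void onOrderEvent(String eventJson) {
    // react to order events, e.g. start retrieving the payment
  }
}
```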
Does this make sense?
Maybe I should switch the example to use AMQP and RabbitMQ - there it is pretty clear what to do :-)
Yeah ... makes sense. Thank you!
Hi Xitep.
There is an awesome blog post from Martin Kleppmann about this topic: https://martin.kleppmann.com/2018/01/18/event-types-in-kafka-topic.html.
Cheers
Bernd