abpframework/abp

Publishing Distributed Events as Transactional

gentledepp opened this issue · 10 comments

Issue: abp.io seems to create a two-phase commit distributed transaction when using distributed events

Abp.io's latest source code sends messages (domain events, which I guess should be called integration events in DDD) when committing a transaction.

Thus, it can happen that

  1. the EF transaction succeeds,
  2. but the message bus is not reachable (for long enough that even retrying with Polly does not help).

In that case, the events will never be sent!

If I understand your code base correctly, you do send the distributed events within the EF Core transaction:

// Inside AbpDbContext.SaveChangesAsync: the save and the event triggering
// happen within the same unit of work.
var result = await base.SaveChangesAsync(acceptAllChangesOnSuccess, cancellationToken);
await EntityChangeEventHelper.TriggerEventsAsync(changeReport);

At least, you send it to the message bus, and if that fails, the transaction is not committed either.
However, isn't that exactly what causes a two-phase commit?

await TriggerEventsInternalAsync(changeReport);

if (changeReport.IsEmpty() || UnitOfWorkManager.Current == null)
{
    return;
}

await UnitOfWorkManager.Current.SaveChangesAsync();

Shouldn't this be avoided?

Solution: The outbox pattern

The solution is using the outbox pattern:
Instead of directly sending the events to the event bus (e.g. RabbitMQ), they are

  • serialized to the database (within the same transaction as the business data)
  • later consumed by a polling consumer (polling consumer pattern), as sketched below
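
A minimal sketch of the write side, assuming a plain EF Core DbContext with a hypothetical outbox table (OutboxEvent, MyDbContext, Order, and OrderAppService are illustrative names, not ABP or EF APIs):

using System;
using System.Text.Json;
using System.Threading.Tasks;

// One row per pending event, written in the same transaction as the data.
public class OutboxEvent
{
    public Guid Id { get; set; }
    public string EventName { get; set; }
    public string JsonPayload { get; set; }
    public DateTime CreationTime { get; set; }
    public DateTime? PublishedTime { get; set; } // null = not yet sent to the broker
}

public class OrderAppService
{
    private readonly MyDbContext _dbContext;

    public OrderAppService(MyDbContext dbContext)
    {
        _dbContext = dbContext;
    }

    public async Task PlaceOrderAsync(Order order)
    {
        _dbContext.Orders.Add(order);

        // The event row is committed together with the order: either both
        // are persisted or neither is, so no event can be "lost".
        _dbContext.OutboxEvents.Add(new OutboxEvent
        {
            Id = Guid.NewGuid(),
            EventName = "OrderPlaced",
            JsonPayload = JsonSerializer.Serialize(new { order.Id }),
            CreationTime = DateTime.UtcNow
        });

        await _dbContext.SaveChangesAsync();
    }
}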

Note: Someone wrote a three-part blog series about this very problem, and in part 3 he even developed a sample solution, which is available on GitHub: https://github.com/mizrael/DDD-School/

Please have a look!

(Here are part 1 and part 2 for reference)

Well, to be honest, I did go through the documentation and did not find any hint as to why ABP implements this "everything within one transaction" approach.

Could at least one of the core team comment on that?
Maybe we are overlooking something here and it totally makes sense... kind of... 🤷‍♂️

After all, if this is not well thought through, you will end up with an inconsistent state across microservices, so, in my view, this is a critical design flaw.

@hikalkan could you please get us someone to comment on this? (One line should suffice - thanks!)

Hi all,

The solution is using the outbox pattern: Instead of directly sending the events to the event bus (e.g. RabbitMQ), they are serialized to the database (within the same transaction) and later consumed by a polling consumer (polling consumer pattern).

This is what we were planning. But currently, it is not implemented.

RabbitMqDistributedEventBus implements the related work. If you want to do it now, you can create your own IDistributedEventBus implementation that adds events to a database table, then a background worker that polls the table and publishes to RabbitMQ; a sketch of such a worker follows below. All the other ABP systems will work as expected.
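
To make the polling half concrete, here is a minimal sketch using ABP's AsyncPeriodicBackgroundWorkerBase. MyDbContext and OutboxEvents refer to the hypothetical outbox table sketched earlier in this thread, and IRabbitMqPublisher is a made-up abstraction over the actual RabbitMQ publishing code:

using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Volo.Abp.BackgroundWorkers;
using Volo.Abp.Threading;

// Hypothetical abstraction over the broker client; not an ABP interface.
public interface IRabbitMqPublisher
{
    Task PublishAsync(string eventName, string jsonPayload);
}

public class OutboxSenderWorker : AsyncPeriodicBackgroundWorkerBase
{
    public OutboxSenderWorker(AbpAsyncTimer timer, IServiceScopeFactory serviceScopeFactory)
        : base(timer, serviceScopeFactory)
    {
        Timer.Period = 5000; // poll the outbox table every 5 seconds
    }

    protected override async Task DoWorkAsync(PeriodicBackgroundWorkerContext workerContext)
    {
        var dbContext = workerContext.ServiceProvider.GetRequiredService<MyDbContext>();
        var publisher = workerContext.ServiceProvider.GetRequiredService<IRabbitMqPublisher>();

        var pending = await dbContext.OutboxEvents
            .Where(e => e.PublishedTime == null)
            .OrderBy(e => e.CreationTime)
            .Take(100)
            .ToListAsync();

        foreach (var evt in pending)
        {
            // If publishing throws, the row stays unsent and is retried on
            // the next tick, giving at-least-once delivery; subscribers
            // should therefore be idempotent.
            await publisher.PublishAsync(evt.EventName, evt.JsonPayload);

            evt.PublishedTime = DateTime.UtcNow;
            await dbContext.SaveChangesAsync();
        }
    }
}

The worker would then be registered as an ABP background worker in a module's OnApplicationInitialization.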

Well, integration events are somewhat the backbone of microservices. Without this implemented, abp.io can only be used for building a modular monolith for now.
But thanks for clarifying this!

https://github.com/EasyAbp/Abp.EventBus.CAP integrates CAP with the ABP framework.

CAP is a .NET Standard library that provides a solution for distributed transactions and also functions as an EventBus; it is lightweight, easy to use, and efficient.

When building an SOA or microservice system, we usually need events to integrate the services, and simply using a message queue does not guarantee reliability. CAP uses a local message table, integrated with the current database, to handle exceptions that may occur when distributed services call each other; it can ensure that event messages are not lost in any case.

You can also use CAP as an EventBus. CAP provides a simpler way to implement event publishing and subscriptions: you do not need to inherit from or implement any special interface during the subscription and sending process.
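
For reference, CAP's publish/subscribe usage looks roughly like this, following the EF Core integration shown in its documentation (AppDbContext, Order, and the "order.placed" route name are illustrative):

using System;
using System.Threading.Tasks;
using DotNetCore.CAP;
using Microsoft.EntityFrameworkCore;

public class OrderPlacedDto
{
    public Guid Id { get; set; }
}

// Publishing side: CAP stores the message in its local message table within
// the same database transaction as the business data, then forwards it to
// the broker after the commit.
public class OrderService
{
    private readonly AppDbContext _dbContext;
    private readonly ICapPublisher _capBus;

    public OrderService(AppDbContext dbContext, ICapPublisher capBus)
    {
        _dbContext = dbContext;
        _capBus = capBus;
    }

    public async Task PlaceOrderAsync(Order order)
    {
        // CAP enlists in this transaction and commits it when the scope ends.
        using (_dbContext.Database.BeginTransaction(_capBus, autoCommit: true))
        {
            _dbContext.Orders.Add(order);
            await _dbContext.SaveChangesAsync();

            await _capBus.PublishAsync("order.placed", new OrderPlacedDto { Id = order.Id });
        }
    }
}

// Subscribing side: outside of ASP.NET Core controllers, a plain class with
// the ICapSubscribe marker and an attribute is enough.
public class OrderPlacedHandler : ICapSubscribe
{
    [CapSubscribe("order.placed")]
    public Task HandleAsync(OrderPlacedDto dto)
    {
        // react to the event here
        return Task.CompletedTask;
    }
}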

Well, I have a situation where the RabbitMQ-based distributed event bus publishes only every second message. I tried a UoW with requiresNew and isTransactional set to true, but nothing helps.
Is this somehow a related problem?

Anyone? Can someone confirm that this issue will solve the problem I described with the distributed event bus?
Thank you.

Please consider using
https://github.com/MassTransit/MassTransit
for communication with service buses (RabbitMQ, Azure Service Bus, ActiveMQ, and Amazon SQS/SNS).

I was re-checking the CAP library. However, it doesn't support multi-tenancy with separate databases (dotnetcore/CAP#699) or a single modular application with multiple databases (dotnetcore/CAP#783). It also requires using its custom database transaction approach.

Our problem is much harder: we have multi-tenancy (with separate-database support), and multiple databases can be used by different modules in the same process. We also want it to work seamlessly with the current IDistributedEventBus.

Implementing this feature while supporting the following scenarios seems complicated:

  • Multi-tenancy with separate databases
  • Multiple databases in a single application (it can be a modular monolith where each module has its own database); in this case, every database should have a different queue, and we should somehow determine which database to write to and read from for the outgoing event queue.
  • The same service running with multiple instances, so the sending thread should handle this case and implement some kind of distributed coordination (one of the instances sends the events while the others wait; if the sender instance stops, another instance takes over the queue).

We should support all these scenarios, so I am postponing this to the next major version, 5.0, to work on it.

We have somewhat specific requirements for the Pub/Sub mechanism, and I have some questions about whether this is possible to achieve.

  • Multi-tenancy with a separate database for each tenant (SQL databases)

  • Our ecosystem consists of multiple standalone API applications (deployed .HttpApi.Host projects with Swagger visualization).
    Each API application has THE SAME connection string to the host database in appsettings.json.

  • Each of our Eto entities includes a Guid? TenantId property.
    Based on this TenantId, each subscribed API application can use:

using (CurrentTenant.Change(tenantId))
{
    //logic
}

to select and work with the correct tenant database (because of the same connection string in appsettings.json; see the previous point). A handler sketch follows after this list.

  • Each standalone API application has its own schema in the databases (host and tenants), and there are no foreign keys between schemas.

  • The Etos also contain Id fields, which are generated when saving to the database in new transactions (requiresNew: true, isTransactional: true).

  • Each standalone API application inserts the serialized JSON Eto object into the database, so that we can see exactly what was sent.

  • Each "Subscribed" standalone API application also inserts serialized JSON Eto to its own part of database.
    (If there are problems in the future, we can analyze which Etos were sent and which were accepted inside specific "Subscriber" applications)
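
As referenced above, a tenant-switching handler could look like this minimal sketch, assuming ABP's IDistributedEventHandler and ICurrentTenant (StockCountChangedEto is a made-up Eto carrying the Guid? TenantId property described in the list):

using System;
using System.Threading.Tasks;
using Volo.Abp.DependencyInjection;
using Volo.Abp.EventBus.Distributed;
using Volo.Abp.MultiTenancy;

// Illustrative Eto; the only requirement here is the nullable TenantId.
public class StockCountChangedEto
{
    public Guid? TenantId { get; set; }
    public Guid ProductId { get; set; }
    public int NewCount { get; set; }
}

public class StockCountChangedHandler
    : IDistributedEventHandler<StockCountChangedEto>, ITransientDependency
{
    private readonly ICurrentTenant _currentTenant;

    public StockCountChangedHandler(ICurrentTenant currentTenant)
    {
        _currentTenant = currentTenant;
    }

    public async Task HandleEventAsync(StockCountChangedEto eventData)
    {
        // Switch to the tenant carried by the Eto; repositories resolved
        // inside this scope connect to that tenant's database.
        using (_currentTenant.Change(eventData.TenantId))
        {
            // tenant-specific logic goes here
            await Task.CompletedTask;
        }
    }
}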

Is it somehow possible to use a Pub/Sub mechanism in our scenarios? Currently we use HttpClient calls to "simulate" a Pub/Sub mechanism for the distributed event bus.
The main problem here is that the HttpClient has to wait for the "subscribed" applications to finish their own work, so the initial request from the Internet gateway (BFF) can take 10+ seconds.

Using background workers is not a solution, because there are multiple chained calls inside the microservice architecture, and if each part takes one second to start processing, that adds up to a lot of time.
The whole chain of Pub/Sub calls is time-critical, so waiting for a background worker in each standalone API application to start processing a newly accepted Eto is not a solution for us.

Does anyone have suggestions on how we could solve Pub/Sub in our case? @hikalkan, @maliming, @geffzhang?
Thank you.