The transactional outbox pattern makes it possible to update the state of a database and to send a message to a message broker as if both happened in one single transaction: the message is first stored in an outbox table as part of the database transaction and is relayed to the broker afterwards. In this way, either both actions happen or neither does. In other words, they are executed atomically. Details of this pattern can be found here: https://microservices.io/patterns/data/transactional-outbox.html.
There are numerous approaches to implementing this pattern. The following sections discuss the approaches that seem most reasonable to me.
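To make the atomicity concrete, here is a minimal sketch of the write side in plain Java. It is not taken from a real project: the in-memory "database", the `Tx` staging area, and the names `OutboxWriteSketch`, `OutboxEvent`, and `placeOrder` are all hypothetical stand-ins for a real database transaction that writes the business table and the outbox table together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy in-memory "database" with all-or-nothing commits; stands in for a real RDBMS. */
public class OutboxWriteSketch {

    /** Hypothetical outbox row: the message that should eventually reach the broker. */
    record OutboxEvent(long id, String payload) {}

    /** Staging area that collects the changes of one transaction. */
    static class Tx {
        final List<String> orders = new ArrayList<>();
        final List<OutboxEvent> outbox = new ArrayList<>();
    }

    final List<String> orderTable = new ArrayList<>();
    final List<OutboxEvent> outboxTable = new ArrayList<>();
    private long nextId = 1;

    /** Runs the work against a staging area and applies it only if no exception was thrown. */
    void inTransaction(Consumer<Tx> work) {
        Tx tx = new Tx();
        work.accept(tx);               // an exception here discards everything staged so far
        orderTable.addAll(tx.orders);  // "commit": both tables change together
        outboxTable.addAll(tx.outbox);
    }

    /** Business operation: the state change and the outbox entry share one transaction. */
    void placeOrder(String order, boolean failBeforeCommit) {
        inTransaction(tx -> {
            tx.orders.add(order);
            tx.outbox.add(new OutboxEvent(nextId++, "OrderPlaced:" + order));
            if (failBeforeCommit) {
                throw new IllegalStateException("simulated failure before commit");
            }
        });
    }

    public static void main(String[] args) {
        OutboxWriteSketch db = new OutboxWriteSketch();
        db.placeOrder("order-1", false);
        try {
            db.placeOrder("order-2", true);
        } catch (IllegalStateException e) {
            // Rolled back: neither the order nor its outbox entry was stored.
        }
        System.out.println(db.orderTable);
        System.out.println(db.outboxTable);
    }
}
```

The point of the sketch is only that the outbox entry can never exist without the order, and vice versa. Relaying the entry to the broker is a separate step, which the approaches below handle differently.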
Polling Publisher
Using a Scheduled Database Poller
Polling the database at regular intervals is the straightforward implementation of the “Polling Publisher”.
Scheduled Database Poller with Java and Spring:
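@Scheduled(fixedDelay = 1000) // polling interval in milliseconds; the value is an assumption
void pollDatabase() {
    // Derived query: fetch the oldest outbox entry (ordered by timestamp, ascending).
    Optional<OutboxEvent> event = this.repository.findFirstByOrderByTimestampAsc();
    event.ifPresent(this.eventService::send);
}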
EventService with Java and Spring:
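@Transactional
@Retryable(..)
public void send(OutboxEvent event) {
    // Publish first, then delete: if send(..) throws, the delete is not executed
    // and the entry remains in the outbox for a retry.
    this.kafkaTemplate.send(..);
    this.repository.deleteById(event.getId());
}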
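The interplay of poller and service can also be tried out without Spring or Kafka. The following is a minimal plain-Java sketch under stated assumptions: the `Deque` stands in for the outbox table, the `Predicate` stands in for the broker client, and all names (`PollingPublisherSketch`, `pollOnce`) are hypothetical. It illustrates that an entry is deleted only after a successful send, so a failed send is simply retried in the next poll cycle and the order of events is preserved as long as only one poller is active.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

public class PollingPublisherSketch {

    record OutboxEvent(long id, String payload) {}

    /** Stand-in for the outbox table, ordered by insertion time (oldest first). */
    final Deque<OutboxEvent> outbox = new ArrayDeque<>();

    /** Stand-in for the broker client: returns true if the send was acknowledged. */
    final Predicate<OutboxEvent> broker;

    PollingPublisherSketch(Predicate<OutboxEvent> broker) {
        this.broker = broker;
    }

    /** One poll cycle: try to publish the oldest entry and delete it only after success. */
    boolean pollOnce() {
        OutboxEvent oldest = outbox.peekFirst();
        if (oldest == null) {
            return false;              // nothing to publish
        }
        if (!broker.test(oldest)) {
            return false;              // keep the entry; the next cycle retries it
        }
        outbox.removeFirst();
        return true;
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        int[] attempts = {0};
        // Hypothetical flaky broker: the very first attempt fails.
        PollingPublisherSketch poller = new PollingPublisherSketch(e -> {
            if (attempts[0]++ == 0) {
                return false;
            }
            delivered.add(e.payload());
            return true;
        });
        poller.outbox.add(new OutboxEvent(1, "first"));
        poller.outbox.add(new OutboxEvent(2, "second"));
        for (int i = 0; i < 3; i++) {
            poller.pollOnce();
        }
        System.out.println(delivered);
    }
}
```

A consequence of this design is at-least-once delivery: if the send succeeds but the delete fails, the same event is published again, so consumers should tolerate duplicates.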
Special care is required when running multiple instances. If multiple instances share the same database, they also share the same outbox table. As a consequence, they interfere with each other when processing outbox entries.
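One possible way to avoid this interference, which is an assumption of this sketch and not part of the code above, is to let only one instance at a time act as the poller by guarding the outbox with a shared lock. In the plain-Java sketch below, `SharedDb` stands in for the shared database and the `AtomicBoolean` stands in for a lock held in that database; all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class SingleActivePollerSketch {

    /** Shared state; stands in for the shared database (outbox table plus a lock). */
    static class SharedDb {
        final Deque<String> outbox = new ArrayDeque<>();
        final AtomicBoolean pollerLock = new AtomicBoolean(false);
        final List<String> published = new ArrayList<>();
    }

    final String name;
    final SharedDb db;

    SingleActivePollerSketch(String name, SharedDb db) {
        this.name = name;
        this.db = db;
    }

    /** Tries to become the active poller; returns false if another instance holds the lock. */
    boolean tryAcquire() {
        return db.pollerLock.compareAndSet(false, true);
    }

    void release() {
        db.pollerLock.set(false);
    }

    /** Publishes all pending entries in order, but only while holding the lock. */
    int drainIfActive() {
        if (!tryAcquire()) {
            return 0;                  // another instance is polling; do nothing
        }
        try {
            int count = 0;
            String event;
            while ((event = db.outbox.pollFirst()) != null) {
                db.published.add(name + ":" + event);
                count++;
            }
            return count;
        } finally {
            release();
        }
    }

    public static void main(String[] args) {
        SharedDb db = new SharedDb();
        db.outbox.add("e1");
        db.outbox.add("e2");
        SingleActivePollerSketch a = new SingleActivePollerSketch("A", db);
        SingleActivePollerSketch b = new SingleActivePollerSketch("B", db);
        a.tryAcquire();                          // A is currently the active poller
        System.out.println(b.drainIfActive());   // B backs off while A holds the lock
        a.release();
        System.out.println(b.drainIfActive());   // now B may publish
        System.out.println(db.published);
    }
}
```

The trade-off is that the additional instances do not increase publishing throughput; they only provide failover for the poller role.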
Advantages: This approach is easy to implement and requires no special infrastructure.
Disadvantages: The scheduler performs polling and thus puts unnecessary load on the database. Moreover, running multiple instances requires a more complex implementation to preserve the order of events.
Requirements: Obviously, this approach requires a database table to store the outbox events. Moreover, it requires a poller, executed either by a thread or by a dedicated application.
What about using an in-memory queue as optimization to avoid polling?
Unfortunately, this approach either loses events or falls back to polling, depending on whether the notification is passed after or within the transaction.
Using a Workflow Engine
Bernd Ruecker proposes using a workflow engine to implement the transactional outbox pattern. The corresponding process model is illustrated in the following figure.
The underlying workflow engine remembers the position in the process model, so that tasks that have already been completed are not executed again. In this way, the workflow engine resumes with the task that has not been finished and should be processed next according to the process model.
Hence, if the application crashes before completing the database operation task, the process instance resumes at this point and re-executes the database operation task. If the application crashes before completing the notification task, the process instance resumes at this point and re-executes the notification task.
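The resume behaviour can be sketched in a few lines of plain Java. This is not the API of any real workflow engine: `ResumableProcessSketch`, its `position` field (standing in for the engine's persisted state), and the task names are hypothetical. The sketch only illustrates that the position advances after a task has completed, so a crash leads to re-execution of the unfinished task and nothing else.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResumableProcessSketch {

    /** Stand-in for the persisted engine state: index of the next task to run. */
    int position = 0;

    /** Process model: named tasks that are executed in insertion order. */
    final Map<String, Runnable> tasks = new LinkedHashMap<>();

    /** Runs the remaining tasks; the position advances only after a task has completed. */
    void run() {
        List<Runnable> ordered = new ArrayList<>(tasks.values());
        while (position < ordered.size()) {
            ordered.get(position).run();   // an exception leaves the position unchanged
            position++;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean[] brokerUp = {false};

        ResumableProcessSketch process = new ResumableProcessSketch();
        process.tasks.put("database operation", () -> log.add("db"));
        process.tasks.put("notification", () -> {
            if (!brokerUp[0]) {
                throw new IllegalStateException("simulated crash during notification");
            }
            log.add("notify");
        });

        try {
            process.run();                 // crashes in the notification task
        } catch (IllegalStateException e) {
            brokerUp[0] = true;            // the broker is reachable again
        }
        process.run();                     // resumes at the notification task
        System.out.println(log);
    }
}
```

Because a task that crashed halfway is executed again from its beginning, each task should be safe to repeat.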
Strictly speaking, this approach does not represent the pattern anymore, because it does not make use of an outbox table. Nonetheless, it puts the pattern’s underlying idea into practice.
Advantages: This approach requires neither a dedicated outbox table, nor special infrastructure or a custom scheduler. Moreover, we can make use of the monitoring and operations capabilities of the workflow engine tooling to debug pending tasks.
Disadvantages: The integrated job scheduler of the workflow engine still performs polling. Moreover, this approach does not keep the order of events.
Requirements: Obviously, this approach requires a workflow engine. If the engine and the process model should be hidden from the application developer, a transactional outbox API that encapsulates these implementation details is recommended.
Why does this approach work?
The alternative approaches remember the intent to send an event by storing an entry in the outbox table. In contrast, this approach remembers the intent by storing the current position and the event’s data in the process instance. Of course, this data is also represented by database tables. However, these tables are managed automatically by the workflow engine, and not by the application developer.
Transaction Log Tailing
TODO no polling