The Top 10 Anti-Patterns to Avoid Inside Event-Driven Architectures

EDAs produce flexible systems composed of many independent components. However, events can be difficult to scale, debug, and organize.
James Walker
Event-driven software systems connect independent components by publishing and subscribing to events. When an event occurs, all its subscribers are notified. They're able to run actions in response, asynchronously and autonomously.

Event-driven architectures help with decoupling by reducing the public interface between services to the publication and consumption of events, improving your service's flexibility and scalability. When a new component is added, it can receive information from the rest of the system by subscribing to relevant events.

However, building robust event-driven systems can be challenging. There are many pitfalls that can compromise resiliency, impede scalability, or make it harder to debug unexpected behavior. In this article, you'll learn how to identify ten common anti-patterns you should avoid.

## 1. Not Having a Consistent Event Structure

An event is a collection of data fields that identify a specific action taken at a certain time. Different kinds of events can possess unique characteristics, though, as you can see in the following example of two events published by a fictional payments system:

- **Basket cleared:** The event is only useful if it includes the target basket's ID.
- **Payment received:** You'll want to track the order ID and the payment value.

These events appear to be two distinct entities, and it can be tempting to model them as such:

```json
// Basket cleared
{
    "id": 1,
    "basket_id": 100,
    "timestamp": "2023-01-23 12:00:00"
}

// Payment received
{
    "id": 1,
    "order_id": 100,
    "payment_amount": 1000,
    "timestamp": "2023-01-23 12:00:00"
}
```

However, this produces an inconsistent event structure, which can make it harder to compare different events because each one is treated as a first-class object in your project. Standardizing your event structure improves the consistency of how events are modeled and stored.
You can achieve standardization by using the same basic structure for all your events and adding any event-specific data to a generic `attributes` field:

```json
{
    "id": 1,
    "type": "basket_cleared",
    "attributes": {
        "basket_id": 100
    },
    "timestamp": "2023-01-23 12:00:00"
}

{
    "id": 2,
    "type": "payment_received",
    "attributes": {
        "order_id": 100,
        "payment_amount": 1000
    },
    "timestamp": "2023-01-23 12:00:00"
}
```

Now all events are considered equal at the persistence level, permitting them to be stored in one consistent repository. This makes it easier to replicate events in distributed environments and ensures all events possess critical attributes such as a timestamp and unique ID.

An event's internal representation doesn't need to match how it's constructed in your codebase. You can still write dedicated classes for important events that represent their data, while implementing an interface that satisfies your standard structure:

```php
interface Event {
    public function getEventAttributes(): array;
    public function getEventType(): string;
    public function getEventTimestamp(): \DateTimeImmutable;
}

final class PaymentReceivedEvent implements Event {

    public function __construct(
        public readonly int $OrderId,
        public readonly int $PaymentAmount,
        // Promoted properties need a type and a `$` name; `new` as a default value requires PHP 8.1+
        public readonly \DateTimeImmutable $Timestamp = new \DateTimeImmutable()
    ) {}

    // Event-specific data goes into the generic attributes field
    public function getEventAttributes(): array {
        return [
            "order_id" => $this->OrderId,
            "payment_amount" => $this->PaymentAmount
        ];
    }

    public function getEventType(): string {
        return "payment_received";
    }

    public function getEventTimestamp(): \DateTimeImmutable {
        return $this->Timestamp;
    }
}
```

This example demonstrates how recording a payment can be made more accessible to PHP code.
However, it doesn't compromise the event's basic structure, ensuring that stored events remain language-agnostic, readily serializable, and portable to other services:

```php
// Excerpt from a class that persists events; insertToDatabase() is a placeholder for your own storage helper
public function storeEvent(Event $Event): void {
    // Saves to database
    insertToDatabase(
        table: "event_store",
        data: [
            "type" => $Event->getEventType(),
            "attributes" => $Event->getEventAttributes(),
            // Stored as a Unix timestamp so the record stays language-agnostic
            "timestamp" => $Event->getEventTimestamp()->getTimestamp()
        ]
    );
}
```
## 2. Not Using Proper Event Types

Establishing a consistent event data structure leads to the next anti-pattern: using too many or too few event types causes ambiguity and reduces your event-based system's value.

Events should be tied to actions derived from your system's business functions. For example, an `OrderCreated` event is a valid type because it's a primary requirement of the system. It represents a successful user experience and may be the starting point for several automated processes, such as generating a confirmation email and starting shipping procedures.

Event systems become unwieldy when they contain events for every possible state change. For example, an `OrderSaved` event with attributes for the customer's email address, phone number, and billing address will be easier to set up and maintain than separate `OrderEmailAddressSaved`, `OrderPhoneNumberSaved`, and `OrderBillingAddressSaved` events. The newly combined event still accurately encapsulates the information that's relevant to the user flow.

## 3. Not Using Event Sourcing

[Event sourcing](https://martinfowler.com/eaaDev/EventSourcing.html) uses events as a system's source of truth. All state changes are recorded as events that are persisted throughout the system's lifecycle. This provides several benefits:

- You can revert to a previous version of the state by discarding events stored after the target time or code revision.
- You can reproduce problems in development environments by replaying the event sequence stored in production.
- The system is self-documenting, with the event log recording how the state has evolved over time.

Without event sourcing, it's difficult to analyze how a system arrived at a specific state. Selectively storing a subset of events impedes overall visibility because you can't retrieve the context in which the event occurred. By permanently storing every event, you've got the flexibility to jump back to any previous state and repeat its evolution into the next one.
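To make this concrete, here's a minimal sketch of rebuilding state from a stored event log. It assumes events follow the standard structure from section 1 and that timestamps are stored as Unix seconds. The `rebuildPaidTotals()` function and the sample log are hypothetical names used for illustration, not part of any library:

```php
<?php

/**
 * Rebuilds per-order payment totals by folding over the stored events.
 * Hypothetical helper: the event shape and Unix-second timestamps are assumptions.
 *
 * @param array $events Events with "type", "attributes", and "timestamp" keys
 * @param int|null $until Optional cutoff; events after this time are ignored
 * @return array Order ID => total amount paid
 */
function rebuildPaidTotals(array $events, ?int $until = null): array
{
    // Replay in chronological order so later events build on earlier ones
    usort($events, fn (array $a, array $b): int => $a["timestamp"] <=> $b["timestamp"]);

    $totals = [];
    foreach ($events as $event) {
        // Discarding events after the target time reproduces an earlier state
        if ($until !== null && $event["timestamp"] > $until) {
            break;
        }
        if ($event["type"] === "payment_received") {
            $orderId = $event["attributes"]["order_id"];
            $totals[$orderId] = ($totals[$orderId] ?? 0) + $event["attributes"]["payment_amount"];
        }
    }

    return $totals;
}

// Sample event log (made-up data for illustration)
$eventLog = [
    ["type" => "payment_received", "attributes" => ["order_id" => 100, "payment_amount" => 1000], "timestamp" => 1674475200],
    ["type" => "basket_cleared", "attributes" => ["basket_id" => 7], "timestamp" => 1674475260],
    ["type" => "payment_received", "attributes" => ["order_id" => 100, "payment_amount" => 500], "timestamp" => 1674478800],
];

$current = rebuildPaidTotals($eventLog);             // State after every stored event
$earlier = rebuildPaidTotals($eventLog, 1674475300); // State as of an earlier point in time
```

Passing a cutoff skips the later events without modifying the log, so the same stored history can reproduce both the current state and any earlier one.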
## 4. Not Having Proper Event Validation

Your event types should validate their attributes to prevent simple mistakes. Event schemas that lack validation leave you susceptible to developer typos and programming type mismatches that could cause unpredictable behavior in your system.

Because events are often published from components written in several different languages, these issues tend to creep in over time. You could find that a service written in a dynamically typed language is unintentionally serializing event attributes as strings, for example. This would cause failures in components that subscribe to the events but expect the attribute to be an integer:

```json
{
    "event_type": "payment_received",
    "attributes": {
        "payment_amount": "1000" // <- this is a string
    }
}

{
    "event_type": "payment_received",
    "attributes": {
        "payment_amount": 1000 // <- this is an integer
    }
}
```

## 5. Incorrect Event Versioning

It's likely that your event schemas will change over time. At the beginning, your `payment_received` event might only need to track the amount that was tendered:

```json
{
    "event_type": "payment_received",
    "attributes": {
        "payment_amount": 1000
    }
}
```

After a few months, you might have a business requirement to start accepting payments by cash as well as by card. Now your events need an additional attribute:

```json
{
    "event_type": "payment_received",
    "attributes": {
        "payment_amount": 1000,
        "payment_method": "card"
    }
}
```

Events should be versioned so they can accommodate change. This ensures you can replay old events that use a different schema than more recent activity. If each event comes with a version code, subscribers will know which schema to use when accessing attributes:

```json
{
    "event_type": "payment_received",
    "event_version": 1,
    "attributes": {
        "payment_amount": 1000
    }
}

{
    "event_type": "payment_received",
    "event_version": 2,
    "attributes": {
        "payment_amount": 1000,
        "payment_method": "card"
    }
}
```

## 6. Not Using Event Replay

Replays are one of the most powerful features of event-based systems. You can use them to quickly replicate environments for troubleshooting and testing. Because events capture every state change in the system, you can republish them to force the state to rebuild itself from a particular point, for example, when you're setting up a new environment from scratch or if you have to repeat actions that were lost after a database backup was restored.

If you don't replay events, it can be harder to debug problems. You'll need to manually replicate scenarios that have already been recorded for you.

## 7. Lacking a Complete Event Logging System

Events should be logged to enhance visibility into your system's actions. Although event sourcing produces a log of business actions that have occurred, this alone doesn't always reveal the full context around each event.

More comprehensive event logging will also record when subscribers attach to event streams, receive an event, and act upon it. You should scaffold your debugging infrastructure around these information sources to help you efficiently debug the effects of events on your system.

## 8. Inadequate Event Monitoring

Events that aren't monitored prevent you from understanding the system's performance, and this can mask issues until users start complaining.

You should integrate your event system into your performance metrics so you can measure event volumes and assess trends over time. This lets you react to spikes in demand, which will exhibit a corresponding jump in event publications, and detect when components fail by looking for missing events. A sudden drop in `OrderCompleted` events could provide an initial warning that your payment platform is offline or that a broken frontend build has been deployed, for example.

## 9. Poor Error Handling

Events need proper error handling so any failures can be caught and retried.
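As a minimal sketch of what catching and retrying can look like, the example below wraps a subscriber in a bounded retry loop and sets aside events that still fail so they aren't silently lost. The `deliverWithRetry()` function, the in-memory `$deadLetters` array, and the sample handlers are all hypothetical. A real system would typically add a delay between attempts and persist failed events somewhere durable:

```php
<?php

/**
 * Delivers an event to a subscriber, retrying when the handler throws.
 * Returns true once the handler succeeds. After $maxAttempts failures the
 * event is appended to $deadLetters along with the last error message.
 * Hypothetical helper for illustration only.
 */
function deliverWithRetry(array $event, callable $handler, array &$deadLetters, int $maxAttempts = 3): bool
{
    $lastError = null;

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            $handler($event);
            return true;
        } catch (\Throwable $error) {
            // Remember the failure so it can be reported alongside the event
            $lastError = $error->getMessage();
        }
    }

    // Keep the failed event so it can be inspected and replayed later
    $deadLetters[] = ["event" => $event, "error" => $lastError, "attempts" => $maxAttempts];
    return false;
}

$deadLetters = [];
$event = ["type" => "payment_received", "attributes" => ["order_id" => 100, "payment_amount" => 1000]];

// Hypothetical subscriber that fails twice before succeeding
$calls = 0;
$flakyHandler = function (array $event) use (&$calls): void {
    $calls++;
    if ($calls < 3) {
        throw new \RuntimeException("Temporary failure");
    }
};
$delivered = deliverWithRetry($event, $flakyHandler, $deadLetters);

// Hypothetical subscriber that never recovers
$offlineHandler = function (array $event): void {
    throw new \RuntimeException("Subscriber offline");
};
$lost = deliverWithRetry($event, $offlineHandler, $deadLetters);
```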
Problems that relate to either event publication or subscription can cause the loss of multiple functional areas, so it's important to preempt errors and implement protections that help prevent them.

Error handling is also tied to monitoring. You should have visibility into where errors occur and which events see the highest number of failures.

If an event is allowed to end in an unhandled error, it could be lost or go unnoticed by its intended destination. When a problem occurs in event publication systems, the event might not get stored at all. This can cause consistency issues when you later replay your events in another environment.

## 10. Insufficient Event Scalability

The final anti-pattern is mistakenly viewing event-driven architecture as a lightweight model that can scale indefinitely. Effective event scalability requires sufficient infrastructure to stream your events to your different services while maintaining high performance.

Poor scalability causes bottlenecks as the effects of actions are deferred. Several seconds or minutes could elapse before a new event is communicated to subscribers. There might then be a further delay before each subscriber has the capacity to process its event queue.

To maintain scalability, you should select event streaming platforms like [Kafka](https://kafka.apache.org) that are designed for high throughput and distributed environments. The rest of your stack also needs to be optimized for performance, including the message brokers and [event API gateways](https://www.aklivity.io) that you deploy.

## Conclusion

In this article, you learned about some of the biggest anti-pattern gotchas when you're building event-driven systems. Although event-based architectures produce flexible, decoupled components, they're often complex to design and reason about unless you carefully plan them in advance.
You should use consistent event structures, types, logging, and monitoring to standardize your system's architecture and inspect its activity. Robust scalability is also important, ensuring you'll avoid performance slowdowns as event volumes grow over time. Finally, you should make sure events are properly stored and test that they can be reliably replayed. This will help you debug problems and recover from any incidents that occur.