What are Enterprise Integration Patterns?
Design patterns, in general, and enterprise integration patterns, in particular, are common solutions to common problems where there is wide consensus on their fitness for use. They are technology agnostic. These solutions emerged and evolved over the last 20-plus years in response to problems we see in practice over and over again.
Gregor Hohpe and Bobby Woolf wrote their seminal text on enterprise integration patterns back in 2004 to establish a common vocabulary of patterns that have proven to be effective. The book does not deal with specific technologies that would rapidly fall out of date. Rather, it is all about design patterns that have withstood the test of time.
In this post I’ll introduce you to enterprise integration patterns. I’ll explain why you should consider using them, and the pitfalls to watch out for. Finally, I’ll give you a high-level overview of the main pattern categories. By the end you’ll be able to make an informed decision on using integration patterns, and be able to list their major categories. Future articles will provide examples of these patterns.
Why Bother With Enterprise Integration Patterns?
Organizations usually have several applications that serve different business purposes. Small to medium-sized ones can easily have a dozen or more applications. Large enterprises can have hundreds, or even thousands. These are often a mix of custom applications developed in-house, third-party applications installed on-premises, cloud-based solutions, and even legacy systems. They typically run on a mix of different platforms and operating systems.
How, you may ask, did we get to this point? Who allowed the technology landscape to become so seemingly fragmented? Well, despite the marketing hype of the larger vendors of business suite solutions, no one solution can do it all. There will be some functions that other smaller systems do much better than an all-encompassing business suite. Organizations adopt the strategy of picking the “best of breed” for a particular business domain. I saw this in the 2000s when the big ERP vendors like SAP, Oracle, PeopleSoft, etc. sold customers on these all-encompassing solutions. Inevitably, organizations ended up retaining some of their niche applications because they do their job so much better than the big ERP system. I still see this today, albeit with different vendors. Inevitably, organizations need to have these large platforms exchange information with their smaller, niche applications.
SaaS For Everything?
Even today there are numerous cloud-based Software as a Service (SaaS) solutions that cater to specific verticals. There are specific solutions for property & casualty insurance, healthcare, public utilities, manufacturing; the list goes on. Each of those solutions does the core job it was designed to do, and does it very well. But drop them into the context of your organization, and you’ll find some of their peripheral functions don’t meet your needs quite as well as some other applications you have. Now these applications need to exchange data with each other.
Business customers and internal users don’t want to care about the arrangement of the many applications in your technology stack. All they want to do is go to one application to place an order, file a claim, refer a patient to a specialist, view work-in-process, whatever. A single use case like one of these can span multiple applications, so there is a need to gather data from, and propagate data to, all of them.
Different Data Structures
You are going to be figuring out how to convert the data structures the source system uses to the ones the target systems use. Moreover, you have very little control over the data structures of the participating applications. You certainly aren’t going to ask a team to alter the structure of the data they accept just to satisfy the one integration problem you’re working on. Even more so, third-party applications will seldom alter their structures just for your organization’s use case. So that’s a constraint to deal with.
Working Across Teams
Individual applications usually focus on one functional area of the business. Enterprise integration involves connecting applications from two or more of these functional areas. For example, you could be connecting Sales & Marketing’s CRM system with Manufacturing’s production management system and with the Finance department’s accounting system. Given that these systems often mirror the communication structure of an organization, you’re working with different groups of people across different business domains. It’s Conway’s Law at play. It means your ability to negotiate with these teams, each with its own agenda, becomes very important.
Yes, that sounds like politics, and to a certain extent it is. Think of it as understanding what’s important to them, asking for their help in solving a problem you have, finding common ground you can agree on, and fitting all that into the larger needs of the organization. Simple to fit in to one sentence; hard in practice.
Greater Business Impact
There are positive and negative consequences of enterprise integration because it spans multiple areas of the business. When an integration solution works, it’s pretty cool. When I see data flowing smoothly through a client’s integration solution, I see revenue coming in for them, I see goods being shipped, I see claims being swiftly adjudicated, whatever their mission is. On the other hand, when it breaks, it impacts two or more systems. An outage can affect more than just one business unit, and many more stakeholders. This can hurt the organization by losing money, alienating customers, and/or falling out of regulatory compliance.
I mentioned you have a message broker in between the applications that need to exchange data. But somehow you need to convert the data from the format the producer generated to one that the consumers understand. Where does that fit in with enterprise integration patterns?
The answer is an application that sits between the producer and the consumer. Its job is to listen for messages from the producer, and convert them for the consumer. It looks like this:
System A publishes a message to a message channel in the broker. The Integration Application is listening on this same channel. When it sees the message, it performs the conversion, and updates System B.
Another way to accomplish the same thing is for the Integration Application to publish a converted message to a second channel:
Instead of making a synchronous API call to System B, the Integration Application publishes a converted message to a second channel. System B listens on this channel, and consumes the message.
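The two-channel arrangement can be sketched with in-memory queues standing in for the broker’s channels. The class and method names (IntegrationApp, convert) and the CSV-to-JSON conversion are illustrative assumptions, not any particular broker’s API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class IntegrationApp {

    // Convert the producer's format (CSV-ish) to the consumer's (JSON-ish).
    static String convert(String producerMessage) {
        String[] parts = producerMessage.split(",");
        return "{\"orderId\":\"" + parts[0] + "\",\"amount\":\"" + parts[1] + "\"}";
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> channelA = new ArrayBlockingQueue<>(10); // System A publishes here
        BlockingQueue<String> channelB = new ArrayBlockingQueue<>(10); // System B listens here

        channelA.put("1042,99.95");          // System A publishes its native format

        String inbound = channelA.take();    // the Integration Application listens on channel A...
        channelB.put(convert(inbound));      // ...and publishes the converted message to channel B

        String received = channelB.take();   // System B consumes the converted message
        System.out.println(received);
    }
}
```

In a real deployment the two queues would be channels in a broker such as ActiveMQ or RabbitMQ, and the Integration Application would run as its own process.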
The beauty of an architecture built around a message broker with channels is that it is very effective at scaling, and at handling disparities between producers and consumers. Where you have a producer kicking out messages faster than a consumer can handle them, they simply queue up in the channel. The consumer picks them off one by one, and processes them as fast as it is able. The downsides are a bit more complexity from having a message broker, and errors that are a little harder to track down.
You may be wondering, why not fold the Integration Application into System B, so that System B does its own conversion work? You would end up with one deployment unit instead of two, so that certainly simplifies the architecture. But the price you pay is in maintenance and QA. A change to how the integration works is a change to System B’s code base. Prudence demands you test all of System B’s functionality before deploying it. This may not be a bad choice; it all depends on your context, and the trade offs you choose.
Categories of Enterprise Integration Patterns
Integration Styles
Integration style is simply the answer to the question “How does the data physically get from System A to System B, System C, and so on?” The first answer that might come to mind is a simple web service call, an in-process method call, or a remote procedure call. While this can work for the very simplest integration solution between two systems, it falls apart when you make changes to one or both of those systems. The two systems know too much about each other’s inner workings, creating a tight coupling. Furthermore, it becomes a real architectural and maintenance headache as you add more consumers (System B, System C, etc.).
The better approach is asynchronous: System A simply deposits its information in some common location, then goes about its business. This common location can be a database, a shared file system directory, a secure FTP server, or a message broker. System A doesn’t wait for the consumers to do their processing. In fact, it has no clue who the consumers are, or how many of them there are, nor does it care. Meanwhile, System B (and System C, and so on) watch this common location, see there is work for them to do, and independently process the information.
These consumers need not be up and running when System A produces its work. The information will be waiting for them when they come back online. Such an arrangement gives you a more resilient and fault-tolerant architecture.
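As a minimal illustration of the shared-location style, here is a sketch that uses a directory as the common location. The method names and file names are invented for the example:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;

public class FileDropExample {

    // Producer (System A): deposit the data, then go about its business.
    static void deposit(Path dropDir, String fileName, String data) throws IOException {
        Files.writeString(dropDir.resolve(fileName), data);
    }

    // Consumer (System B): come online later, find the waiting work, process it.
    static List<String> collect(Path dropDir) throws IOException {
        List<String> processed = new ArrayList<>();
        try (DirectoryStream<Path> work = Files.newDirectoryStream(dropDir)) {
            for (Path file : work) {
                processed.add(Files.readString(file));
            }
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        Path dropDir = Files.createTempDirectory("integration-drop");
        deposit(dropDir, "order-1042.txt", "orderId=1042");
        // ...any amount of time may pass; the producer is not waiting...
        System.out.println(collect(dropDir));
    }
}
```

The consumer being offline when the producer deposits the file is perfectly fine; the file simply waits in the directory.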
Practically speaking, message brokers (enterprise messaging systems) are the most common style of integration. They come with mechanisms to handle one-to-one and one-to-many relationships between producer and consumers. You need not worry about a consumer trying to pick up a file before the transfer process has finished; message brokers make the message available to consumers only when the producer finishes publishing it to the broker. They’re fast, since much of the work happens in-memory. Most of them persist messages to the file system to enable recovery from a system failure. Some examples of message brokers are Apache ActiveMQ, RabbitMQ, and IBM MQ. Apache Kafka is a variant that streams data rather than passing discrete messages. It’s wicked fast and extremely scalable.
Message Channels
These are sort of like pipes in Unix: a virtual connection between producers and consumers. You create message channels when you are defining how information will flow between systems. Channels are implemented in message brokers as either point-to-point (queues) or publish-subscribe (topics). Point-to-point is where only one consumer listens on the channel. Publish-subscribe (“pub-sub”) has many consumers listening on the same channel, each receiving their own copy of the producer’s message.
It’s important to consider message channels in two different, but related, contexts. The larger context is that of the message broker. Integration architects and the software teams will define the queues and topics in this messaging system. Applications connect to the messaging system by publishing to, listening on, or subscribing to queues and/or topics that are of interest to them. For instance, an e-commerce application publishes messages to a topic named OrderCreated. The shipping system subscribes to this topic so it knows when to pick, pack, and ship the order. Similarly, the inventory management system subscribes to OrderCreated so it knows to update the inventory status of the items in the order.
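Pub-sub fan-out, where every subscriber gets its own copy of each message, can be sketched like this. The Topic class and its methods are invented for the example, not a real broker API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class Topic {
    private final List<Queue<String>> subscribers = new ArrayList<>();

    // Each subscriber gets its own inbox to listen on.
    public Queue<String> subscribe() {
        Queue<String> inbox = new ArrayDeque<>();
        subscribers.add(inbox);
        return inbox;
    }

    // Every subscriber receives its own copy of the message.
    public void publish(String message) {
        for (Queue<String> inbox : subscribers) {
            inbox.add(message);
        }
    }

    public static void main(String[] args) {
        Topic orderCreated = new Topic();
        Queue<String> shipping = orderCreated.subscribe();
        Queue<String> inventory = orderCreated.subscribe();

        orderCreated.publish("order 1042 created");

        System.out.println("shipping saw:  " + shipping.poll());
        System.out.println("inventory saw: " + inventory.poll());
    }
}
```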
The smaller context is your Integration Application, the code that does the actual message conversion. It breaks down the conversion into several steps, each connected by a message channel. These channels are not the ones in your message broker; they are contained within the Integration Application.
Messages
Messages are the data objects that systems produce and consume over a messaging system. They contain the data that a producer thinks other systems may be interested in. Messages have a structure to them, consisting of a header and a payload. The header is mostly metadata, along with some routing information. The payload is the actual information, in JSON, XML, YAML, or even binary formats.
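A bare-bones sketch of that header-plus-payload structure might look like this. The header names below are illustrative; real brokers define their own header sets:

```java
import java.util.HashMap;
import java.util.Map;

public class Message {
    final Map<String, String> headers = new HashMap<>(); // metadata plus routing information
    final String payload;                                // e.g. JSON, XML, or binary

    Message(String payload) {
        this.payload = payload;
    }

    public static void main(String[] args) {
        Message m = new Message("{\"orderId\":\"1042\"}");
        m.headers.put("messageId", "a1b2c3");
        m.headers.put("contentType", "application/json");
        m.headers.put("replyTo", "OrderReplies");        // routing information

        System.out.println(m.headers.get("contentType") + " -> " + m.payload);
    }
}
```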
Pipes and Filters
These are collections of routers, transformers and endpoints (“filters”), each connected by a channel (“pipe”). They perform some combination of Validation, Enrichment, Transformation, and Operation (VETO) on a producer’s message to make it usable by consumers. For a deeper dive into this concept, have a look at When to use the Pipeline Architecture Style.
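A sketch of a VETO-style chain, where each filter is one small step and the “pipe” is simply the hand-off between steps. The individual filters here are invented for the example:

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class VetoPipeline {

    // Pass the message through each filter in turn; the loop is the "pipe".
    static String run(String message, List<UnaryOperator<String>> filters) {
        for (UnaryOperator<String> filter : filters) {
            message = filter.apply(message);
        }
        return message;
    }

    public static void main(String[] args) {
        UnaryOperator<String> validate = m -> {
            if (m.isBlank()) throw new IllegalArgumentException("empty message");
            return m;
        };
        UnaryOperator<String> enrich    = m -> m + ";region=NA";       // add data from another source
        UnaryOperator<String> transform = m -> m.toUpperCase();        // convert to the consumer's format
        UnaryOperator<String> operate   = m -> {                       // deliver to the consumer
            System.out.println("delivering: " + m);
            return m;
        };

        run("orderId=1042", List.of(validate, enrich, transform, operate));
    }
}
```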
Routers
This is an intermediate step that decides where to send the message. It can act as a filter that discards messages of no interest, based on their attributes. It can route the message to one of several channels, based on a data attribute of the message. Typically you find routing steps as part of a Pipes and Filters arrangement. Based on attributes in the message header or payload (or both), the router decides which endpoints to send the message to. These endpoints could be other channels, web services, ERP APIs, whatever.
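A content-based router can be sketched like this: it inspects an attribute of the message and forwards it to the matching channel, discarding messages that match nothing. The channel names and message format are invented for the example:

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;

public class ContentRouter {

    // Route the message by its country attribute; unmatched messages are discarded.
    static void route(String message, Map<String, Queue<String>> routes) {
        String country = message.replaceAll(".*country=", "");
        Queue<String> target = routes.get(country);
        if (target != null) {
            target.add(message);
        }
    }

    public static void main(String[] args) {
        Queue<String> domesticChannel      = new ArrayDeque<>();
        Queue<String> internationalChannel = new ArrayDeque<>();
        Map<String, Queue<String>> routes = Map.of(
                "US", domesticChannel,
                "DE", internationalChannel);

        route("orderId=1042;country=DE", routes);

        System.out.println("international queue size: " + internationalChannel.size());
    }
}
```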
Transformers
This is where much of the work in enterprise integration happens, often within a Pipes and Filters arrangement. Essentially, this is where you convert the message from the producer’s format into one that the consumer accepts. This can be as simple as converting from one format to another. It may also involve calls to other systems to get additional information to add to the message.
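Here is a sketch of such a translator: it converts the producer’s flat format into the consumer’s format, enriching it along the way. The lookup table below stands in for a call to another system, and the formats and names are invented for the example:

```java
import java.util.Map;

public class Translator {

    // Stand-in for a call to a customer system that supplies extra data.
    static final Map<String, String> customerNames = Map.of("C77", "Acme Corp");

    // Producer format: "C77,1042"  ->  consumer format: XML with the name added.
    static String translate(String producerMessage) {
        String[] parts = producerMessage.split(",");
        String name = customerNames.getOrDefault(parts[0], "unknown");
        return "<order id=\"" + parts[1] + "\" customer=\"" + name + "\"/>";
    }

    public static void main(String[] args) {
        System.out.println(translate("C77,1042"));
    }
}
```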
Endpoints
These are what connect an application to the messaging system. In the Java/JVM space, Java Message Service (JMS) is the specification for how Java applications talk to messaging systems. Spring JMS is a popular framework that builds on the JMS API to simplify working with it.
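Since a runnable JMS example needs a broker and the JMS libraries, here is a plain-Java stand-in for the endpoint role. The MessageEndpoint interface is an invented name, though its onMessage method mirrors the shape of JMS’s MessageListener:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class EndpointExample {

    interface MessageEndpoint {
        void onMessage(String message);   // same shape as JMS MessageListener.onMessage
    }

    // The endpoint drains the channel, handing each message to application logic.
    static int drain(Queue<String> channel, MessageEndpoint endpoint) {
        int count = 0;
        String next;
        while ((next = channel.poll()) != null) {
            endpoint.onMessage(next);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Queue<String> channel = new ArrayDeque<>();
        MessageEndpoint endpoint = message ->
                System.out.println("application received: " + message);

        channel.add("orderId=1042");      // a message arrives on the channel
        drain(channel, endpoint);
    }
}
```

With a real broker, the channel and the listening loop would be supplied by the messaging system; your application code would only implement the listener.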
Getting applications to exchange data is a problem space where solutions have evolved over the past couple of decades. Enterprise integration patterns are common solutions to these common problems, and are applicable in most contexts. Though these patterns add some complexity, the business benefits they provide are usually a worthwhile trade-off. Major categories of these patterns are integration styles, message channels, messages, pipes and filters, transformation, and endpoints. In future articles, I’ll examine these categories in detail and provide some code examples to show how they work.