What is it?
The pipeline architecture is one of the most common architecture styles used in designing software systems. Also known as pipes and filters, it consists of a series of discrete steps performed in a predictable sequence. This contrasts with a layered architecture, such as one built around the model-view-controller pattern, where components are organized by technical role rather than by processing step. In this article I’ll define the pipeline architecture style and explain when to use it.
A pipeline is commonly used in an event-driven architecture. A business event happens in one system or application, and other systems need to be updated accordingly.
[Figure: a simple example of a pipeline architecture]
A “filter” in this context is a discrete processing step, such as transform, lookup, enrich, or write to a database. A filter does one thing and one thing only (the Single-Responsibility Principle).
The pipes are one-way conduits from one filter to another. They take the output of one filter and pass it on to the next one. The payloads that the filters send through the pipes can be any data format, but typically they are JSON or XML.
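The relationship between filters and pipes can be sketched in a few lines of Python. This is a minimal, language-neutral illustration rather than framework code: the filter names (transform, enrich), the payload fields and the region table are all hypothetical, and the "pipe" is simply function composition.

```python
from functools import reduce
from typing import Callable

# A filter takes one payload and returns a new one; it knows nothing about its neighbors.
Filter = Callable[[dict], dict]

def transform(payload: dict) -> dict:
    """Hypothetical filter: tidy up the customer name."""
    return {**payload, "name": payload["name"].strip().title()}

def enrich(payload: dict) -> dict:
    """Hypothetical filter: add a region from an in-memory lookup table."""
    regions = {"NL": "EMEA", "US": "AMER"}
    return {**payload, "region": regions.get(payload["country"], "UNKNOWN")}

def build_pipeline(*filters: Filter) -> Filter:
    """Connect filters with one-way pipes: each output becomes the next input."""
    def run(payload: dict) -> dict:
        return reduce(lambda acc, f: f(acc), filters, payload)
    return run

pipeline = build_pipeline(transform, enrich)
result = pipeline({"name": "  ada lovelace ", "country": "NL"})
print(result)
```

Because each filter only sees the payload handed to it, the order of the filters lives in one place (the build_pipeline call) and not inside the filters themselves.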
When To Use the Pipeline Architecture Style
The most common use case for a pipeline architecture is system integration. Any time you need to move data from one application to another, a pipeline is a good fit. Because you define the integration solution as a series of discrete steps, it makes for a very modular design. The filters themselves are highly cohesive and loosely coupled, meaning each filter does one thing only and has no dependency on other filters. This makes unit tests much easier to write, and it leads to a cleaner design that you can modify with minimal fuss. A change to a filter has little impact on the overall integration, provided its output contract stays the same.
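To illustrate why isolated filters are easy to unit test, here is a small sketch. The filter normalize_currency and its payload fields are hypothetical; the point is only that the test needs no broker, database or neighboring filter, just an input and an expected output.

```python
def normalize_currency(payload: dict) -> dict:
    """Hypothetical filter: upper-case the currency code, leave the rest alone."""
    return {**payload, "currency": payload["currency"].upper()}

def test_upper_cases_currency():
    # The output contract: same fields, currency code in upper case.
    assert normalize_currency({"amount": 10, "currency": "eur"}) == {
        "amount": 10,
        "currency": "EUR",
    }

def test_leaves_input_untouched():
    # The filter returns a new payload instead of mutating the one it received.
    original = {"amount": 10, "currency": "eur"}
    normalize_currency(original)
    assert original == {"amount": 10, "currency": "eur"}

test_upper_cases_currency()
test_leaves_input_untouched()
print("filter tests passed")
```

As long as these tests keep passing, the internals of the filter can change without the rest of the pipeline noticing.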
System integration is a use case that has been around for decades. Many organizations need to solve the problem of moving data from one application to another in response to some business event. These applications are often very different in terms of the data structures they expose. Connecting these applications is what system integration is all about.
Over the years, a set of common design patterns has emerged. Gregor Hohpe and Bobby Woolf captured these patterns in their seminal book Enterprise Integration Patterns. The companion website does a great job of outlining the various patterns. If you design or develop integration solutions, this book should be in your library and the website should be in your browser bookmarks.
Designing a Pipeline Architecture
System integration typically consists of a combination of Validate, Enrich, Transform and Operate (VETO) steps. Begin your design effort with the target system. Determine what it needs in terms of data structure, fields and values, then figure out what it takes to get the source data there.
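The steps above can be sketched as four small functions chained together. Everything here is an assumption made for illustration: the order message, the CATALOG lookup table, the target field names (orderNumber, total) and the list standing in for the target system.

```python
CATALOG = {"SKU-1": 4.50}  # hypothetical reference data used by the enrich step

def validate(msg: dict) -> dict:
    """Reject messages that lack the fields the later steps rely on."""
    missing = {"order_id", "sku", "qty"} - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return msg

def enrich(msg: dict) -> dict:
    """Add data the source system did not supply."""
    return {**msg, "unit_price": CATALOG[msg["sku"]]}

def transform(msg: dict) -> dict:
    """Shape the data the way the (hypothetical) target system expects it."""
    return {"orderNumber": msg["order_id"], "total": msg["qty"] * msg["unit_price"]}

def operate(record: dict, target: list) -> dict:
    """Deliver the record; a list stands in for the real target system."""
    target.append(record)
    return record

def run_veto(msg: dict, target: list) -> dict:
    return operate(transform(enrich(validate(msg))), target)

target_system = []
run_veto({"order_id": "A-7", "sku": "SKU-1", "qty": 2}, target_system)
print(target_system)
```

Note that transform was written first in spirit: its output mirrors what the target needs, and the earlier steps exist to supply the inputs it requires.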
For example, suppose you have an insurance claim management system. Various claim intake systems accept a new claim and publish it as an XML message on a message broker. You need to accept an XML message from the message broker, and call a web service to save the claim in the claim management system. Along the way you need to convert the XML message to JSON, and enrich the message using a database lookup to add some additional information. In outline, the flow is: message broker, XML-to-JSON conversion, database enrichment, then the web service call.
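A rough sketch of the claim example follows, using only the Python standard library. It is not the Spring Integration implementation: the XML layout, the field names (claimId, policyId, policyHolder), the POLICY_HOLDERS dictionary standing in for the database and the list standing in for the web service are all assumptions.

```python
import json
import xml.etree.ElementTree as ET

POLICY_HOLDERS = {"P-100": "Ada Lovelace"}  # stand-in for the database lookup

def xml_to_dict(xml_text: str) -> dict:
    """Parse a flat claim XML message into a dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

def enrich_with_policy_holder(claim: dict) -> dict:
    """Add the policy holder's name based on the policy id."""
    return {**claim, "policyHolder": POLICY_HOLDERS.get(claim["policyId"])}

def to_json(claim: dict) -> str:
    """Serialize the enriched claim as the JSON body for the web service."""
    return json.dumps(claim, sort_keys=True)

def save_claim(body: str, sent: list) -> str:
    """Stand-in for the web service call: record the request body."""
    sent.append(body)
    return body

def handle(xml_text: str, sent: list) -> str:
    claim = enrich_with_policy_holder(xml_to_dict(xml_text))
    return save_claim(to_json(claim), sent)

sent_requests = []
handle("<claim><claimId>C-1</claimId><policyId>P-100</policyId></claim>", sent_requests)
print(sent_requests[0])
```

In a real integration the first and last steps would be adapters for the broker and the web service, while the two middle filters would stay much as they are.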
Spring Integration makes this simple. You use the framework to configure the construction of the pipeline, then all you need to do is implement the logic in the filters. The framework takes care of passing the message payloads from one filter to the next.
A pipeline architecture is extensible. Should you need to add additional steps, such as logging or enhanced error handling, simply insert them in the pipeline.
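As a small illustration of that extensibility, the sketch below inserts a logging step between two existing filters without touching either of them. The filters (parse_amount, add_tax), the 20% rate and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("pipeline")

def parse_amount(payload: dict) -> dict:
    return {**payload, "amount": float(payload["amount"])}

def add_tax(payload: dict) -> dict:
    # Hypothetical flat 20% rate, for illustration only.
    return {**payload, "tax": round(payload["amount"] * 0.2, 2)}

def log_payload(payload: dict) -> dict:
    """Pass-through filter: record the payload and hand it on unchanged."""
    logger.info("payload in flight: %s", payload)
    return payload

def run(steps, payload: dict) -> dict:
    for step in steps:
        payload = step(payload)
    return payload

steps = [parse_amount, add_tax]
before = run(steps, {"amount": "10.00"})

steps.insert(1, log_payload)  # new step between the two existing filters
after = run(steps, {"amount": "10.00"})

assert before == after  # the extra step did not change the result
```

The new step honors the same contract as every other filter (payload in, payload out), which is what makes the insertion safe.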
The cost of implementing a pipeline architecture is quite attractive. It’s normally implemented as a monolith, so it’s relatively easy to understand and build. You don’t have the complex orchestration or choreography you normally see with distributed architectures like microservices.
When you use a framework like Spring Integration, the cost to develop is low since the framework takes care of the undifferentiated effort of moving messages from one filter to another. You just develop the filters. Moreover, Spring Integration builds on Spring Framework. If you are already using the latter as your dependency injection framework, developers can learn the former with minimal effort.
Spring Integration can read and process each message in a separate thread, for example when a channel is backed by a task executor. This gives some boost to performance while isolating any processing failures to the message in question.
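The idea of per-message threads with isolated failures can be sketched with a plain thread pool. This is not how Spring Integration is implemented; the process function, the message fields and the pool size are assumptions chosen to show that one bad message does not stop the others.

```python
from concurrent.futures import ThreadPoolExecutor

def process(message: dict) -> dict:
    """Hypothetical filter chain for a single message."""
    if "amount" not in message:
        raise ValueError("missing amount")
    return {**message, "amount_cents": round(message["amount"] * 100)}

def process_all(messages: list) -> tuple:
    """Run each message on a worker thread and collect failures per message."""
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=4) as pool:
        submitted = [(m, pool.submit(process, m)) for m in messages]
        for message, future in submitted:
            try:
                results.append(future.result())
            except Exception as exc:  # the failure stays with this message
                failures.append((message, str(exc)))
    return results, failures

results, failures = process_all(
    [{"id": 1, "amount": 12.5}, {"id": 2}, {"id": 3, "amount": 3}]
)
print(results)
print(failures)
```

Messages 1 and 3 complete normally while message 2 ends up in the failures list, where it could be logged or routed to an error channel.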
Because a pipeline is frequently implemented as a monolith, changes of any size need to go through a full regression test. It is also subject to the ceremony of an all-or-nothing deployment, along with the attendant risks.
A pipeline architecture is not very fault tolerant. If one of the filters exhausts the heap space, the entire integration comes crashing down. You can, however, mitigate this with a decent monitoring and alerting strategy.
If you need to integrate two or more systems, give serious consideration to a pipeline architecture. Avail yourself of the examples in the Enterprise Integration Patterns book.