
Polish-Japanese Institute of Information Technology

Chair of Software Engineering

Master of Science Thesis

Design patterns
in application integration
based on messages

by

Mariusz Pikula, Adam Siemion

Supervisor: Dr Piotr Habela

Warsaw, 2007
Abstract
This thesis is devoted to application integration and to the importance of knowing
the design patterns connected with this subject.
The thesis first introduces the reader to the area of application integration and
the types of problems connected with it. It explains why integration is undertaken
and why it can be very difficult and complicated. Next, the integration styles are
presented, starting with the oldest and simplest ones, going through more complex
ones and ending with integration based on messages, which forms the main area of
interest of this thesis. After familiarising the reader with the basics of this
approach, the thesis provides the essential theoretical background, which covers
the basic terms and design patterns connected with message-based integration.
Afterwards, the practical application of the discussed terms is shown in a case
study describing system integration issues. The last part of the thesis is
dedicated to the integration platform that has been created as an integral part
of the thesis. The description of this platform covers the technologies used, the
application architecture and an example of its usage based on the case study
presented earlier.
Contents

1 Introduction
    1.1 Loose Coupling
    1.2 Case study

2 Integration Styles
    2.1 Application integration
    2.2 Application coupling
    2.3 Integration simplicity
    2.4 Integration technology
    2.5 Data format
    2.6 Data timeliness
    2.7 Data or functionality
    2.8 Asynchronicity
    2.9 Styles of integration
        2.9.1 File Transfer
        2.9.2 Shared Database
        2.9.3 Remote Procedure Invocation
        2.9.4 Messaging

3 Messaging based systems
    3.1 Message
    3.2 Message Implementations
    3.3 Message Channel
    3.4 Message Routing
    3.5 Message Transformation
    3.6 Message Endpoints
    3.7 Synchronous and asynchronous communication

4 Design patterns in the application integration

5 Enterprise Service Bus
    5.1 Definition
    5.2 Message Oriented Middleware
    5.3 Tightly coupled interfaces
    5.4 ESB aims
    5.5 ESB capabilities
    5.6 ESB components
    5.7 Open source ESB products
    5.8 ESB integration patterns
        5.8.1 VETO pattern
        5.8.2 VETOR pattern
        5.8.3 Two-step XRef pattern
        5.8.4 Forward Cache Integration pattern

6 Case study: Messaging systems work principles

7 Implementation
    7.1 The origin of the name
    7.2 Concept
    7.3 Technology
    7.4 Classification
    7.5 Architecture
    7.6 Processing sequence
    7.7 Configuration
        7.7.1 Configuration model
        7.7.2 Configuration example
        7.7.3 More on transformers and routers
        7.7.4 Performing the configuration
    7.8 Integration design patterns supported by pESB
    7.9 Problems encountered during the implementation

8 Summary

Bibliography

List of Figures

List of Tables

Index
Acknowledgements

We would like to thank Dr Piotr Habela, who conducted a class on software
integration (Technologie Internetu), which inspired us to seek more information
about this subject.
I, Adam Siemion, would also like to thank Remigiusz Weska, with whom I had the
pleasure of working, and the company IMPAQ Sp. z o.o., where we both worked on
a project that aimed to choose the best messaging-based integration solution to
fulfil the requirements of our customer. That experience also motivated me to
delve into the subject of integration.

Preface

The business environment is volatile and constantly changing, and it requires
great flexibility from its participants. This flexibility requires the will to
cooperate with each other in order to maximise profits or obtain other types of
benefits. Cooperation between two business entities involves information exchange
and collaboration of the IT systems used by those entities. This collaboration
might be limited to the exchange of data or might be as complex as using the
partner's software functionality.
Combining two separate IT systems into one entity, capable of exchanging
information and maintaining a constant data flow between them, can be a very
complicated task, especially in the business environment. This environment has
many features that contribute heavily to the difficulty of this task.
First of all, the system used by a business entity might not be an up-to-date
IT system, but a legacy system designed in the early 1990s or even earlier.
Such systems can be so complex and so vital for the business that it is not
possible to simply replace them with new ones. Very often their complexity makes
it too expensive to design, implement and introduce a new system with the same
functionality as the previous one. The human factor also has to be taken into
account: workers who have used the old system for several years might resist the
introduction of a new system that replaces the one they are used to.
Moreover, business entities that want to cooperate might be located in distant
geographical locations. This brings additional issues into consideration, such as
communication reliability, security, handling of communication errors,
non-repudiation and so on. The reliability of the communication also depends on
third-party business entities, such as external Internet providers, who are
responsible for maintaining the Internet connection. Thus, there are also external
factors that have to be taken into account while coping with this problem.
Designing a solution that allows the applications used by both business entities
to be combined is another issue that must be considered. First of all,
the designers of such a solution must have deep knowledge of the business
processes in each of the entities in order to design an effective solution. Secondly,
both of the interested entities must agree on the solution. This can be difficult
to achieve, because very often the representatives of each business entity
put pressure on the designers so that the resulting solution is based mainly
on their own IT structure (internal data model, processes, etc.). Creating a design
that satisfies all of the participants can be a demanding and difficult task
in itself, not only because of this aspect, but also because the designers must
concentrate on the functionality, flexibility and reliability of the solution.
After the integration project is completed, the appropriate tool must be chosen
to put the whole solution in motion. The IT market offers many
solutions that are specifically designed to solve this type of problem (i.e. building
integration solutions). We can choose from both open source solutions, such
as Mule or ServiceMix, and proprietary products from IBM, Oracle, Sonic Software,
BEA and so on.
Upon taking a closer look at those products and considering the problems and
challenges of the application integration topic, we decided to make it our object
of interest and the topic of this thesis.
Having a limited amount of time and resources, we did not aim
to create a solution that could compete with those made by large IT companies.
Instead, we decided to take a different approach and create a lightweight
integration platform. It would provide the basic functionality needed for
application integration, combined with ease of use and a short learning curve, so
that a potential user with programming knowledge would be able to create an
integration solution effectively without having to spend a large amount of time
learning the software itself. The simplicity of our
platform, in comparison to the large and complicated tools offered by large IT
companies, would be its greatest strength. We aimed to create a tool, based on
widely available technologies, that could be used as a base for further development
by adding extensions to it.
The work itself has been structured in such a way that a reader would get
an overview of the whole topic of integration and especially messaging systems
before moving forward to the description of the created integration solution itself.
The thesis is divided into two main parts. Part one contains chapters one, two,
three and four, which describe the basic theory behind the subject of integration.
Part two contains chapters five, six and seven. It introduces the reader to the
integration solutions currently available on the market, such as Message Oriented
Middleware (MOM) and the Enterprise Service Bus (ESB), which rely heavily on the
concepts described in part one, presents a case study that shows those terms and
concepts from a practical point of view, and describes the integration platform
we created. The eighth and last chapter contains the summary of our work.

1. Part one begins with an introduction to the integration topic in chapter one.

2. Chapter two concentrates on different integration styles, with a detailed
description of each style, its advantages and disadvantages, and the
situations in which each style might be used.

3. Chapter three concentrates on one particular integration style: messaging.
This is the style used in our own integration solution. The chapter gives
a more detailed description of this style, covering the two modes of
communication it supports (synchronous and asynchronous) and the basic
concepts directly connected with it, such as Message, Message Channel and
Message Router.

4. Chapter four covers the concept of design patterns in messaging systems.
It describes several selected design patterns with possible ways
of usage and variants that can be applied in different situations.

5. Part two begins with chapter five, which introduces the concept of an
Enterprise Service Bus (ESB). First, we briefly define what an ESB is,
then we focus on the basis of the ESB, Message Oriented Middleware (MOM),
the advantages of introducing an ESB and its capabilities. Finally,
we provide a couple of ESB integration patterns.

6. Chapter six presents a case study that shows the usage of the
concepts from the previous chapters in a real-life business example. It starts
with a general overview of the problem and goes through all phases of the
integration process up to the final solution.

7. Chapter seven is devoted to the description of the created integration
solution. At the beginning, we explain the concept of this approach to
integration, the technology it uses and its architecture. To better
illustrate the way the system operates, we provide a detailed sequence
diagram with a description of the sequence of actions that take place
when two systems communicate using our solution. Also, we
would like the reader to be able to solve real integration problems using our
product, so as an example of usage we provide an imaginary integration
problem with a possible solution using our system.

8. Finally, chapter eight contains the summary of our work, with final remarks
regarding the goals that we managed to achieve and the ones that
have not been achieved, along with possible reasons why.
It also contains suggestions for further development and some ideas for
extensions that could make the existing tool more usable and enrich it
with new functionality.
Chapter 1

Introduction

The task of integrating computer systems emerges as a response to a frequent
need to connect multiple separate computer systems. Integrated systems are
supposed to cooperate and provide unified functionality. Moreover, it is expected
that the new, integrated system will operate on the data gathered by all
of the participating computer systems. Many factors make this
a difficult and challenging task. The systems to be integrated
might operate within different organisations, might be built using different
technologies, might be legacy systems with no maintainers, etc.
The need to integrate might arise for several reasons, for example the
merger of two companies, the merger of multiple branches of one company, or the
aim to have one system responsible for coordinating others. Another reason might
be cost reduction. One unified system is cheaper to support and more effective in
terms of business processing than two systems working separately [16]. Expected
results of the integration are usually as follows:

• cost reduction

• increased effectiveness

• improvement of business processes

• unified flow of information

Applications that are to be integrated can be of different origin.
They can either be custom made according to the customer's specification to suit
the customer's needs or bought as a commercial off-the-shelf (COTS) application and
tailored to meet the requirements of the buyer. Integrated systems can also
differ in other ways. They might be written in different
programming languages and be of different ages. One of those systems, or even
both of them, might be a so-called legacy application [7]: computer software
created several years ago, no longer being improved, quite often with no
documentation available and designed as a local standalone system. Usually,
such systems provide no communication capabilities. Network communication
and data exchange were not taken into consideration during the design of
this kind of system.
What makes the process of integration even more difficult is the fact that:

• systems which are to be integrated can be spread geographically with machines,
on which they are running, placed in distant locations

• systems might be written in different programming languages

• systems might not have any documentation

• systems might be running in different environments (operating systems,
hardware configurations, etc.)

• systems might be managed by different organisational units or even companies

• companies, which provide proprietary software, are not willing to participate
in the integration process, because they do not want to reveal the
internals of their products

• workers may not be willing to adapt to changes made by the integration process

Reliable communication between integrated systems must be assured using the
available communication facilities. The integration should be introduced without
making too many changes to any of the existing systems. There is a very important
reason behind this limitation. Computer systems such as those running
within financial institutions and managing data crucial for their activities cannot
be redesigned and reimplemented from scratch. It would be too expensive
and too risky for the business: the complexity and scale of those systems make
recreating them more costly than an institution can afford.
All those problems and difficulties must be taken into consideration when
an integration solution is being designed for a computer system. Because of
the above reasons, the integration of multiple various computer systems can be
considered as an especially difficult and challenging task.
The aim of this thesis is to present the subject of application integration
from the theoretical point of view and to use the presented knowledge to build
a lightweight, easily configurable solution that allows different systems to be
integrated in a quick and efficient way. The goals that are to be achieved and the
basic concept of this system have been outlined in the first chapter of this thesis.
Before going on to the description of the system architecture, the way it has been
designed and implemented, and the concepts and goals that were set while creating
it, a brief overview of and background on the subject of system integration will
be given. This overview
should give the reader an overall view of the topic and knowledge
of the concepts and terms used throughout this document, such as loose coupling,
design patterns and others that are essential for understanding the basics
of system integration.

1.1 Loose Coupling


Loose coupling [11] has recently become a very popular term directly connected
with application integration. It can be said that loose coupling is one of the key
concepts formed around this topic. This strong relationship makes it necessary
to explain the term before moving on to the next chapters, so that the reader
has a full understanding of the topic described in this thesis.
The main concept of loose coupling is that two communicating parties (systems
or applications) should make minimal assumptions about each other. In
other words, the less the applications need to know about each other in order to
cooperate properly, the better. Applying this principle should, in the long term,
reduce the costs of maintaining an integration solution, make it more effective
and reduce the costs of changing the integrated applications.
Loosely coupled applications can be modified independently (to some degree,
of course), which means that changes made within one application do not force
changes in the coupled application. Therefore, using loose coupling in application
integration makes an integration solution more flexible and change tolerant.
This flexibility derives from the fact that connected applications do not have to
be adjusted after changes are made in one of the systems taking part in the
communication.
If the processing of one application is based on information about
the internal business logic of the other application (e.g. data format), then any
change made to that logic will automatically force changes in the first
application. Thus, the less the applications depend on this type of information
about each other, the more flexible the communication between them can
be.
Application coupling is a multidimensional issue that covers more than the
design and implementation of the application. With
that in mind, it seems obvious that loose coupling between applications must
be viewed from different aspects. For example, integrated applications can be
loosely coupled in time: by using queues, they do not have to be running at the
same time to cooperate. They might also be loosely coupled in format: every
system might have a different data model, with a component responsible for
transforming the messages exchanged between the applications.
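The two dimensions mentioned above can be sketched in a few lines of C#. This is only a minimal illustration under stated assumptions: the types WebTransfer and BankTransfer, the class LooseCouplingSketch and its methods are hypothetical names invented for this sketch, and the in-memory queue merely stands in for a persistent message queue that would survive while the receiver is offline.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message types: each application keeps its own data model.
public record WebTransfer(string Title, string Account, decimal Amount);
public record BankTransfer(string AccountNumber, long AmountInCents, string Description);

public static class LooseCouplingSketch
{
    // Coupling in time: the sender enqueues a message and returns immediately.
    // The receiver drains the queue whenever it is ready to work.
    private static readonly Queue<WebTransfer> Channel = new Queue<WebTransfer>();

    public static void Send(WebTransfer message) => Channel.Enqueue(message);

    // Coupling in format: only this translator knows both data models.
    public static BankTransfer Translate(WebTransfer m) =>
        new BankTransfer(m.Account, (long)(m.Amount * 100m), m.Title);

    // Called by the receiving side; translates every pending message.
    public static List<BankTransfer> ReceiveAll()
    {
        var received = new List<BankTransfer>();
        while (Channel.Count > 0)
        {
            received.Add(Translate(Channel.Dequeue()));
        }
        return received;
    }
}
```

In this sketch the sender never sees BankTransfer and the receiver never sees WebTransfer, so either data model can change as long as the translator is updated.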
Measuring the degree of loose coupling is a separate problem. When an integration
solution is being designed with the concept of loose coupling in mind,
it would be desirable to have the means to measure how loosely coupled the
integrated applications really are. To do this, a measurable feature
of both applications must be found. Such a feature should allow
a measurement to be made whose results can be compared. One feature that could
be used for this purpose is the number of changes that can be made within an
integrated system without the need to interfere with the integration solution as
a whole. Changes can concern business processing within an application, data
format and so on. The more changes can be made within an application without
altering the solution itself, the higher the degree of loose coupling.
Of course, many other measures can be thought of and used. But this one (the
number of changes) concentrates on the key concept of loose coupling and, from
that point of view, might be considered the best one to use. Moreover, this measure
emphasises the main idea of loose coupling: the integrated systems should
make as few assumptions as possible about each other. The fewer assumptions
are made, the more changes can be made inside a connected system without
affecting the operation of the others.
The opposite of loose coupling is tight coupling. Tight coupling
can be illustrated using local method invocation as an example. A local method
call imposes many assumptions on the caller:

• the called method must be written in the same programming language as the
calling method
• the exact number and types of the arguments of the called method must be
known
• the called method must run in the same process
• both the calling and the called method must use the same data types
• both the calling and the called method must use the same internal data
representation format

The consequence of those assumptions is that tightly coupled local method
invocation differs in many ways from loosely coupled communication, such as
communication based on messages. First of all, a local invocation starts its work
immediately after being called, so there is
no latency between the method call and the start of processing. Local method
invocation is very fast, efficient and reliable in comparison to remote
communication. Although the calling method must wait for the called method to finish
its processing and return the result (a synchronous method call), the processing
is much faster than in the case of remote communication. In the case of
a remote call, the waiting time increases because of the distance, the latency of
the connection between the applications, the quality of the connection medium and
so on. Because the method invocation is local, the issues that would come up in
the case of a remote call (security, reliability, performance, efficiency)
are not problems that have to be
worried about. When a local call is invoked, it is certain that it will reach
its destination and will be performed. In the case of a remote call there is no
guarantee that the call has reached its destination. There might be a communication
failure, the connection might be broken, the remote host might be out of order or
the data might even be altered while being transferred from the sender to the
receiver. All those problems make a remote call far less reliable than a local call.
As can easily be noticed, local, tightly coupled method invocation is
simpler, more efficient and generates far fewer issues to worry about. The source
of this efficiency and simplicity lies in the assumptions made by the applications
about each other.
In order to make integration easier, many communication technologies use the
same semantics as tightly coupled local method calls to invoke functionality
and exchange data between applications. Those technologies include:

• .NET Remoting [15]

• Java Remote Method Invocation API (Java RMI) [8]

The main advantage of this approach is that it is easier for developers, who
are used to invoking local methods, to start using these technologies. However,
using those techniques may lead to making the same assumptions as in the case
of local method invocation. It should be kept in mind that while those
assumptions are valid in a local environment, many of them are not valid in the case
of remote calls (and making them valid, where possible, greatly restricts the
flexibility of the integration solution).
When integrating applications, it is usually not desirable for a calling application
to wait until the results of the remote processing are available. Such
waiting might, at best, lead to delays that are very often not acceptable to
the business entities taking part in the integration process. Moreover, in the case
of a communication failure (e.g. due to loss of connectivity) an application can
be left suspended waiting for the response. This can lead to an application crash.
The processing time of a remote call is also much longer than that of a local
call, and the call as a whole is far less reliable. When a remote call is made,
it cannot be assumed, as in the case of a local call, that a response
will be received, because there might be a communication failure, a crash of the
remote system, and so on.
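One common way to keep a caller from being suspended indefinitely is to bound the wait. The following is a minimal sketch, not a description of any particular remoting technology: RemoteCallSketch and CallWithTimeoutAsync are hypothetical names, and the remote call is represented by an arbitrary delegate that the reader would supply.

```csharp
using System;
using System.Threading.Tasks;

public static class RemoteCallSketch
{
    // Waits for the (hypothetical) remote call for at most 'timeout'.
    // Returns Completed = false when no reply arrived in time, so the caller
    // can carry on instead of staying suspended.
    public static async Task<(bool Completed, string Reply)> CallWithTimeoutAsync(
        Func<Task<string>> remoteCall, TimeSpan timeout)
    {
        Task<string> call = remoteCall();
        Task finished = await Task.WhenAny(call, Task.Delay(timeout));
        return finished == call
            ? (true, await call)      // the reply (or the call's exception) is propagated
            : (false, string.Empty);  // timed out; the reply may still arrive later
    }
}
```

Note that a timeout only limits the waiting. It does not tell the caller whether the remote side actually performed the operation, which is one of the reliability problems described above.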
One of the assumptions made in the case of tight coupling is that both the called
and the calling method are written in the same programming language. This
assumption significantly reduces the scope of possible integration scenarios, because
only applications written in the same programming language can be integrated
(e.g. it is not possible to integrate applications written in Java and C#
using Java RMI). This restriction greatly reduces the flexibility and range of
applications of the technologies mentioned above. What is more, it can make it
impossible to integrate newly written systems with legacy systems. As can
easily be seen, this approach is also burdened with problems that do not occur
in the case of tightly coupled local calls. An example showing what problems can
appear while trying to integrate systems with tightly coupled dependencies can
be found in [6], along with a detailed description of this approach and the problems
that it might cause.
Loose coupling, apart from being a popular term, is also one of the core concepts
of application integration. Making integrated systems less dependent on
things such as the programming language in which they are written, their data
model, internal business logic and architecture makes them more flexible and
change tolerant. This approach allows one application to be modified, up to a point,
without negative effects on the communication with the other system. This assures
flexibility, which cannot be achieved in the case of tight coupling due to the
restrictions mentioned earlier.
Apart from benefits such as flexibility, scalability and higher tolerance for
internal changes, loose coupling also has some disadvantages. Designing a loosely
coupled integration solution is a more complex task than designing a tightly coupled
one. Many new problems need to be solved in order to implement
a loosely coupled integration solution effectively. The difficulty of designing
such a solution also results in more difficult development, error tracing,
debugging, etc.
To sum up, loose coupling minimises the interdependency among systems
in terms of time, information format and technology at the cost of a more
sophisticated design and implementation.
Now that the term loose coupling and its role in application integration
have been covered, let us move on to a more detailed description of the different
integration styles, along with the benefits they bring and, of course, the threats
that might occur while using them in integration solutions. But before that,
a case study will be presented as a practical illustration of the discussed issues.

1.2 Case study


An example will now be discussed in order to illustrate the information
provided in the section above. It is a simple case study
of a banking application integrated with a front-end Web application.
This application allows users to transfer money between user accounts. The
applications communicate using the TCP/IP protocol stack, which is the
most common and widespread set of communication protocols.
The presented source code snippet accepts the following information on its
input:

• transfer title

• destination account number

• transfer amount

The sample source code snippet, written in C#, could look like this:
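using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

string hostName = "www.mybankingapp.com";
int port = 8080;

// Resolve the host name (Dns.GetHostEntry replaces the obsolete Dns.GetHostByName)
IPHostEntry hostInfo = Dns.GetHostEntry(hostName);
IPAddress address = hostInfo.AddressList[0];
IPEndPoint endpoint = new IPEndPoint(address, port);

// Open a TCP connection to the banking system
Socket socket = new Socket(address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
socket.Connect(endpoint);

// Convert each field to raw bytes; the amount uses the in-memory layout of a 32-bit int
byte[] amount = BitConverter.GetBytes(1000);
byte[] transferTitle = Encoding.ASCII.GetBytes("My Transfer");
byte[] destAccNumber = Encoding.ASCII.GetBytes("1234445321234321");

// Send the fields back to back, with no length prefixes or delimiters
int bytesSent = socket.Send(amount);
bytesSent += socket.Send(transferTitle);
bytesSent += socket.Send(destAccNumber);

socket.Close();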

The above source code excerpt first initiates the connection to the banking system,
then prepares the amount of money to be transferred to the destination account
along with the transfer title and the destination account number, and finally sends
that information as a byte stream. Of course, in real life this method would
be much more sophisticated, but the goal here is to show the general concept, not to
write a complete business solution.
The communication solution presented above is quite straightforward and
simple. It does not require any sophisticated integration software.
But it carries hidden problems that can be very hard to track down and
repair.
In many books about network programming the above solution would be
presented as one that lets the client (presented above) communicate
with the server regardless of the operating system and programming language
the two systems are using. This is not completely true, as will be explained
later.
In order to obtain the data that will be sent to the banking application, the
transfer amount, transfer title and destination account number are converted to
arrays of bytes. Then each of them is sent to the destination. The BitConverter
class is used to convert the transfer amount to an array of bytes. The conversion
made by this class uses the internal memory representation of the given
data type (an integer in this case). .NET uses a 32-bit integer type, so the
amount is sent as 4 bytes. Other systems might use a 64-bit representation
instead. Such a system will read not 32 but 64 bits from the
incoming byte stream. What does this mean in the case of our example? If the
destination system uses a 64-bit integer representation, it will read not only the
4 bytes of the transfer amount but also the following 4 bytes, which belong to the
transfer title, and try to interpret all 8 bytes as an integer. This difference in data types
would cause a different amount of money to be transferred than the user had
intended! For the same reason the remaining fields would be shifted, so the
destination account number would differ from the one given by the user. Such
behaviour is undesirable, to say the least, and can have disastrous effects both
for the bank and the client.
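The misreading described above can be reproduced in a few lines. The following is a minimal sketch in Python (chosen only for brevity; the example itself is written in C#), using the values from the example. The receiver that assumes a 64-bit integer is hypothetical.

```python
import struct

# Sender side: a 32-bit little-endian integer followed by the ASCII fields,
# which is what BitConverter.GetBytes(1000) yields on a little-endian machine.
amount = struct.pack("<i", 1000)
title = b"My Transfer"
account = b"1234445321234321"
stream = amount + title + account

# A receiver that agrees on the 32-bit width recovers the amount.
(amount_32,) = struct.unpack_from("<i", stream, 0)

# A hypothetical receiver that assumes a 64-bit integer consumes 8 bytes:
# the 4 bytes of the amount plus the first 4 bytes of the transfer title.
(amount_64,) = struct.unpack_from("<q", stream, 0)

# Everything after the integer is now shifted by 4 bytes as well.
shifted_title = stream[8:8 + len(title)]
```

Here `amount_32` is 1000, while `amount_64` is an unrelated large number and `shifted_title` already contains the first digits of the account number.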
Moreover, the client and bank computer systems may use different formats to
store numbers. One of them may use the big-endian format, which stores numbers
starting with the highest byte, while the other may use the little-endian
format, which stores numbers starting with the lowest byte. This will also
cause a difference in the transfer amount!
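The effect of a byte-order mismatch can be illustrated as follows. This is a small Python sketch (not part of the C# example); fixing one byte order for the wire format, shown in the last step, is a common remedy and is mentioned here only as an illustration.

```python
import struct

amount = 1000

# The same 32-bit value serialised in both byte orders.
little = struct.pack("<i", amount)   # lowest byte first
big = struct.pack(">i", amount)      # highest byte first

# Reading little-endian bytes as if they were big-endian gives another number.
(misread,) = struct.unpack(">i", little)

# If both sides agree on one byte order (here network order, "!"),
# the value survives regardless of either machine's native format.
(roundtrip,) = struct.unpack("!i", struct.pack("!i", amount))
```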
At least two assumptions must be made about the integrated systems in order
for the above solution to work properly: first, that both of them use
the same data types, and second, that both of them use the same
internal number storage format. However, this is not the end of the restrictions
imposed by this approach.
Upon closer examination of the above source code, a couple of things can
be spotted. First of all, the connection information has been written directly in
the source code. Any change concerning this data, like changing the destination
host name or adding an alternate destination address, would require altering the
source code. For those changes to take effect, the whole application would
have to be recompiled and redeployed. In the case of a simple application this
might not appear to be a difficulty, but for more complex, critical applications
such a way of performing changes can become a very serious
issue. It would significantly increase the cost and time needed to introduce even
the simplest change to the application. The above source code should be written
in such a way that changes can be made to it as efficiently as possible
(efficiency in this case covers both time and cost).
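One common way to avoid recompiling for such changes is to read the connection details from configuration. The sketch below is in Python rather than C#, and the section and key names (`bank`, `host`, `port`) are assumptions made for illustration.

```python
import configparser

# Hypothetical configuration. In practice this text would live in a file
# that can be edited without touching the application code.
CONFIG_TEXT = """
[bank]
host = www.mybankingapp.com
port = 8080
"""

def load_endpoint(text):
    """Return (host, port) read from the [bank] section of the configuration."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["bank"]
    return section["host"], section.getint("port")

host, port = load_endpoint(CONFIG_TEXT)
```

This removes the need to rebuild the client when the address changes, but the client still has to know where the server is; that restriction is addressed separately.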
Furthermore, the client-server model assumes that both the server
and the client are connected to the network at the same time. If one of the
participants is currently unavailable because of network problems, excessive
network traffic, connection link problems, etc., then the connection cannot be
established and the data cannot be exchanged.
It has already been mentioned that, in the presented solution,
any change to the connection details requires changes within the application code
itself. Those changes would have to be made each time the destination of the
request changed (which would involve changing the code, recompiling it
and redeploying the application).
The same way of introducing changes (changing the
source code, recompiling it and redeploying) would also have to be followed
if there were a need to change the number of parameters sent to the
banking application. But this time the changes would have to take place in both
applications, because the banking application needs to know in advance the exact
structure of the request so that it can parse and process it correctly.
This example shows what a tightly coupled solution could look like and what
assumptions must be made in order for it to work properly. To sum up, those
assumptions are as follows:

1. Both client and host system must use the same internal number format
representation and use the same data types.

2. Both client and host applications must be working and connected to the
network at the same time in order to exchange data.

3. A client application must know the host location during the coding phase
and this location cannot be changed without updating the application code
itself.

4. A host application must know the exact number and type of the request
parameters during the coding phase in order to process and interpret them
correctly. Changes can only be made by changing the host application’s
code itself.

As has been shown, many assumptions must be made about the applications
in order for them to communicate correctly. Therefore, the presented
solution can be classified as tightly coupled, and it makes a good illustration of
the restrictions and limitations of this approach. In order to make this solution more
flexible and less restricted, it should be designed as loosely coupled. Redesigning
it to achieve that goal means removing the restrictions that limit its
flexibility. That goal would be achieved if the conditions listed earlier
no longer had to hold in order for the systems to be fully functional.
The first step in decoupling the previous solution would be to define
a platform-independent data format, resistant to the issues connected
with different internal number representations and the usage of different
data types. XML can solve this problem: it can be used to define
a request description format, and requests would then be sent as XML documents.
The destination host could parse such a document and extract all the information
necessary to process the request.
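A request expressed this way might look like the sketch below. It is written in Python for brevity, and the element names (`transferRequest`, `title`, `destinationAccount`, `amount`) are hypothetical; the thesis does not prescribe a schema at this point.

```python
import xml.etree.ElementTree as ET

def build_request(title, account, amount):
    """Serialise a transfer request as XML text (element names are hypothetical)."""
    root = ET.Element("transferRequest")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "destinationAccount").text = account
    # The amount travels as decimal text, so integer width and byte order
    # of the two machines no longer matter.
    ET.SubElement(root, "amount").text = str(amount)
    return ET.tostring(root, encoding="unicode")

def parse_request(document):
    """Extract the request fields on the receiving side."""
    root = ET.fromstring(document)
    return {
        "title": root.findtext("title"),
        "account": root.findtext("destinationAccount"),
        "amount": int(root.findtext("amount")),
    }

document = build_request("My Transfer", "1234445321234321", 1000)
request = parse_request(document)
```

Because every field is delimited by its element, the receiver no longer depends on the fields having a fixed length or order in the byte stream.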
In order to remove the restrictions caused by the assumptions concerning the
host location and communication issues, the Message Channel design pattern (3.3)
can be introduced. A Message Channel is a logical address that both the client
and the host application use to communicate. An application only has to be able to
connect to the channel, not directly to the server. This resolves the location
issue: the client no longer needs to know the connection details of the host
application, because the channel is used to communicate.
If the channel is able to store the requests in a request queue,
then the necessity for both systems to be connected at the same time is
eliminated. Every request is stored in the channel until the destination
system fetches it. Response delivery works the same way. Thanks to that,
the systems are able to communicate without the requirement for both of
them to be on-line at the same time.
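The idea of a channel that buffers requests can be sketched with an in-memory queue. This Python sketch is illustrative only: the class and channel names are hypothetical, and a real channel would be provided by messaging middleware running as a separate process.

```python
import queue

class MessageChannel:
    """A named, in-memory channel that buffers messages until they are fetched."""

    def __init__(self, name):
        self.name = name
        self._messages = queue.Queue()

    def send(self, message):
        self._messages.put(message)

    def receive(self):
        """Return the oldest pending message, or None if the channel is empty."""
        try:
            return self._messages.get_nowait()
        except queue.Empty:
            return None

requests = MessageChannel("transfer.requests")

# The client sends while the banking application is offline ...
requests.send("<transferRequest>1</transferRequest>")
requests.send("<transferRequest>2</transferRequest>")

# ... and the banking application fetches the backlog once it is back.
first = requests.receive()
second = requests.receive()
nothing_left = requests.receive()
```

Both applications refer only to the channel name, so neither needs the other's network address.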
Applying the above mechanisms to the given problem would change the solution
from a tightly coupled one into a loosely coupled one. With those mechanisms a
solution far more flexible than the previous one can be obtained, free of
the restrictions listed for the first approach. From now on the client and the host
application can be developed simultaneously, independently of each other. Changes
made to one participant would not require altering the other one, as long as the
agreed message format is respected.
However, loosely coupled solutions also have disadvantages. The main
drawback is that the solution becomes much more sophisticated and complicated.
It takes many more lines of code to implement,
and it is no longer as simple and straightforward. The process of debugging
and testing also becomes more complex. It is very important to keep that in mind
when deciding between the loose and tight coupling models.
The above example forms a good illustration of the differences between the
tight and loose coupling. It shows what both of them have to offer and what
their advantages and drawbacks are.
Chapter 2

Integration Styles

As mentioned before, application integration is the task of making two or more
separate systems work as one combined system and share functionality.
The problems usually faced while integrating applications have been briefly described
in the Introduction chapter. In this chapter we will describe different integration
styles that might be chosen when coping with this challenging task.
Every integration problem is different; therefore the choice of an integration
style must depend on the circumstances of a particular situation. In order to
choose an appropriate style, the criteria on which the choice will be made must be
defined. Martin Fowler [7] provides the following for that purpose:

• Application integration

• Application coupling

• Integration simplicity

• Integration technology

• Data format

• Data timeliness

• Data or functionality

• Asynchronicity

The integration task at hand should be examined and analysed based on
those criteria. The conclusions of such an analysis can be very
helpful later on: based on them, the decision on which integration
style is the most suitable for the given situation can be made. Before
moving on to the description of those styles, a brief overview of each of
the mentioned criteria will be given. This will allow the reader to have a better
understanding of those criteria and to apply them properly.


2.1 Application integration


Before starting to think about applying an integration solution, one must consider
whether it is possible to develop a new, centralised local application instead
of integrating separate, already existing systems. A local application will be easier
to design and develop, with far fewer issues (such as security, communication and
data exchange) to consider. Of course, this can only be done when implementing
a new application from scratch can be afforded. As has been
said before, in many cases it is not only too expensive but also impossible to
recreate the system from the beginning. In such cases there is no other way than
to integrate the existing systems. However, the necessity of integration should
always be thought through. Integration should be undertaken only if it is necessary.
If there is another, less expensive and more effective way of solving the problem,
it should be considered first.

2.2 Application coupling


Application coupling has been discussed in detail earlier (1.1). The main
rule to keep in mind is that, when choosing an integration solution,
one should try to pick the one that achieves the greatest decoupling
of the integrated systems (a sample way of measuring the decoupling level has
been described earlier). This makes the systems more flexible.
Flexibility makes it possible to change an application without having to
change the whole integration solution and the way the applications communicate
and exchange data with each other. In the long term this feature
may prove very desirable. It allows one of the systems to be upgraded or enriched
with new functionality without changing the remaining integrated
applications.

2.3 Integration simplicity


When deciding which integration style is the most appropriate, the simplicity of
the integration solution should be one of the most important factors taken
into consideration. It is defined not only by the complexity of the design and
the technologies used, but also by the number of changes that need to be made in
the integrated applications in order to integrate them into one system with shared
functionality. On the other hand, simplicity should not be enforced
at the cost of integration quality or the functionality of the integrated systems.
The main effort should be put into making the integration process as simple as
possible and limiting the changes that need to be made in the integrated applications,
while maintaining the desired functionality and
the overall quality of the integration solution. It
is worth keeping in mind that sometimes it is better to choose a more complex
solution in exchange for higher quality and flexibility of the final result.

2.4 Integration technology


The market of integration technologies is constantly evolving; new products and
standards are being introduced all the time. Choosing the most up-to-date
technology can therefore be tempting, but it is not always the best thing to do. The costs
of introducing a new technology must be considered first. Products
featuring state-of-the-art technologies are not always the ones that best suit the
needs at hand. What is more, deploying a product that uses an elaborate technology in
a company requires additional time to master it and gain some experience. Time
spent on training consumes additional resources and increases the overall cost of the
project. This extra time might not always be available. Also, the risk of failure rises
when the development team has to use a technology it is not experienced
with. Furthermore, a solution that has only just emerged can be unstable,
insufficiently tested, inefficient or unreliable. Therefore, introducing the newest
technologies into the solution can not only significantly raise the overall cost of
the integration but also, more importantly, raise the risk of integration
failure. All of the above should be taken into consideration
when choosing which technologies to use for the integration.

2.5 Data format


Integrated applications need to exchange data in order to perform their activities,
and that data must be legible to all of the participating applications. To fulfil this
requirement, a unified data format, readable by all of the integrated applications,
must be set. This can be achieved by formatting data to one common format
within the integrated applications or by using a component (a translator) that
translates data to an appropriate format and ensures it is readable by the
data receiver. The first option may be considered the worse one, because it
requires more code modifications within the integrated applications, so it may
conflict with the Integration Simplicity criterion. When choosing the common data
exchange format, it should also be taken into account that the nature and format
of data produced by the integrated applications might change over time and
that those changes will affect the integration solution. This means that the
transformation of this data to the common data format will also change. Because
of this, an integration solution should be flexible and able to adapt to data format
changes within the integrated applications.

2.6 Data timeliness


Another factor that has to be considered when choosing an integration technique
is time; in particular, the period from the moment data is published
to the moment it is consumed by another application.
This is one of the main factors influencing the overall system performance.
An effort should be made to minimise the time needed to exchange
data between applications. The shorter this period is, the sooner an integrated
application can receive data, start processing it and return results to
the data sender. In the synchronous communication scenario, a short
data exchange time is essential to prevent delays in application processing.
Moreover, long delays may cause another problem. The data, while being
transferred to the destination, may become stale. Processing
stale data can lead to errors that may have serious consequences for the business.
This issue, named the data timeliness issue, is especially important in the case of
applications that deal with volatile data, which changes very frequently within
short periods of time. Delays are not the only threat connected with this issue.
A so-called deadlock situation may also occur when the sender is waiting to
receive the result of the processing from the receiver and the receiver application
is out of order, because it has just crashed or for some other reason cannot
currently process the sender’s request. In that case the whole system is suspended
and cannot perform its activities.
The mentioned issues make the integration process even more complex and
should be taken into consideration when choosing the most appropriate integration
technique. The solution that provides the shortest latency in the given
circumstances should be preferred. This helps to avoid communication and
processing deadlocks and errors caused by stale data.

2.7 Data or functionality


When deciding which integration technique to use, another thing should also be
considered: whether the applications will share data or functionality.
Sharing functionality makes an integration more complicated than just
sharing data. It is more difficult to design and implement and has a significant
impact on the integration process. It requires different approaches and
techniques than data sharing and is harder to achieve. The difference between
those two approaches is described later in this chapter, in the sections
“Shared Database” (2.9.2) and “Remote Procedure Invocation” (2.9.3) of “Styles
of integration”.

2.8 Asynchronicity
Another issue that should be thought through while designing an integration
solution is the way in which the integrated applications will communicate. There
are two possibilities:

• synchronous communication
• asynchronous communication

In a synchronous communication scenario an application invokes the
functionality of the remote system and waits until the remote system processes the
request and returns the results. After receiving the results it continues to perform its
activities [2].
In an asynchronous communication scenario an application continues
to perform its activities right after sending a request and does not wait for the
results from the remote system. When the results finally arrive, the sender system
is notified; it then postpones its current activities and processes the response from
the remote system. An asynchronous call in which the sender does not need the
results of the request at all is named “Fire and Forget” and is
a design pattern [5] used in application integration.
Synchronous communication is simpler to design and implement, but reduces
the overall performance of the system. It may also cause deadlocks (see the
section “Data timeliness” (2.6) above) and increase the time spent by the system in
the idle state, waiting for the response from the other system. Asynchronous
communication, on the other hand, is more efficient and offers greater
performance at the cost of greater design and implementation complexity.
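The difference between the two scenarios can be sketched as follows. This is a minimal Python illustration, not the thesis platform: `remote_service` is a hypothetical stand-in for a remote system, and a local thread pool simulates the network round trip.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def remote_service(request):
    """Stand-in for a remote system that needs some time to answer."""
    time.sleep(0.05)
    return f"processed {request}"

# Synchronous: the caller is blocked until the result is available.
sync_result = remote_service("request-1")

# Asynchronous: the request is handed over, the caller keeps working,
# and a callback is invoked once the response arrives.
responses = []
with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(remote_service, "request-2")
    future.add_done_callback(lambda done: responses.append(done.result()))
    work_done_meanwhile = sum(range(1000))  # the caller is not idle
# Leaving the "with" block waits for the outstanding request to finish.
```

The asynchronous variant needs extra machinery (the callback and a place to collect responses), which is the added design complexity mentioned above.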
Synchronous communication is more suitable when there is no need
to create a complex solution, when the messages being sent are small and the
processing time is negligible. In that case the delays caused by waiting for the response
will not noticeably affect the overall performance of the application.
A synchronous model of communication is also necessary when an application
needs to maintain the request-response interaction model in order to provide
the desired functionality (e.g. web browsers, online chats, etc.).
Asynchronous communication can be used when the sender does not expect
the response to arrive right away. Interestingly, this situation occurs very frequently
in real life. For example, after filling in a visa application form, we do not expect
the embassy to examine our application before we leave the building, nor do we
wait inside the embassy for the decision. Instead, we continue with
our lives. Similarly, after posting a letter at the post office, we do not expect the
post office to deliver the letter before we leave. Those analogies closely resemble
the working of the asynchronous model. They show that this method of
communication is very common in real life, so it is natural to make it available
in computer software as well.
Asynchronous communication might be more efficient, especially when there
is no room for delays caused by waiting for a reply to a request. An asynchronous
application, instead of waiting for the response as a synchronous
one does, can continue its processing. Although this requires solving some
additional issues, such as the ability to process the data received in response
to a request sent earlier, which complicates the design of an asynchronous
application, it may significantly improve performance.
Moreover, asynchronous communication can be more reliable than the synchronous
one. The asynchronous model usually involves the usage of
queues, and a queue that stores every received message persistently ensures that
a message is not lost just because the receiver is unavailable. Even if the receiver
system is currently not operating, the queue stores all messages designated for it,
and when the system is online again it fetches all of them.
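A persistent queue of this kind can be approximated with a small database file. The following Python sketch uses SQLite as the store; this is an assumption made for illustration, since real message brokers use their own storage, and the class name is hypothetical.

```python
import os
import sqlite3
import tempfile

class PersistentQueue:
    """A minimal durable FIFO queue backed by a SQLite file (illustrative only)."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()

    def put(self, body):
        with self._db:  # commits on success, rolls back on error
            self._db.execute("INSERT INTO messages (body) VALUES (?)", (body,))

    def get(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        with self._db:
            row = self._db.execute(
                "SELECT id, body FROM messages ORDER BY id LIMIT 1"
            ).fetchone()
            if row is None:
                return None
            self._db.execute("DELETE FROM messages WHERE id = ?", (row[0],))
            return row[1]

    def close(self):
        self._db.close()

path = os.path.join(tempfile.mkdtemp(), "queue.db")

sender_side = PersistentQueue(path)
sender_side.put("transfer 1")
sender_side.put("transfer 2")
sender_side.close()  # simulates the queue being shut down and restarted

receiver_side = PersistentQueue(path)
delivered = [receiver_side.get(), receiver_side.get(), receiver_side.get()]
receiver_side.close()
```

The messages written before the restart are still delivered afterwards, in order, because they were committed to disk rather than kept in memory.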
Another advantage of the asynchronous model is that it makes it possible to
create systems more resistant to high loads. The two
models differ in the way they behave during high traffic. In such a situation a
synchronous application would not be able to provide a service to all clients: some
of them would get an error, some of them would get no response at all, and finally
the whole application could become inaccessible. An asynchronous application,
on the other hand, would on average take longer to process each request, because
it would have a lot of requests waiting in the queue, but sooner or later each of
them would be processed and no request would be left without a response.
The trade-offs between the synchronous and asynchronous models are summed
up in Table 2.1, where “+” marks the model that has the advantage.

feature                                   synchronous   asynchronous
efficiency                                     -              +
reliability                                    -              +
resistance to communication errors             -              +
resistance to high loads                       -              +
simplicity of design and implementation        +              -

Table 2.1: Trade-offs between synchronous and asynchronous model

2.9 Styles of integration


There are a few different styles of integration available to developers faced
with the problem of integrating computer systems. They vary in
complexity, difficulty of design and implementation, and so on. The criteria
presented in the previous sections can be helpful in determining which of those
styles to choose to solve an integration problem. Each of those styles addresses
some criteria better than the others. Integration approaches can be grouped
into four main types. Those integration styles are as follows:

• File Transfer
• Shared Database
• Remote Procedure Invocation
• Messaging

Each of those techniques has been developed to handle the same task:
application integration. Although the task remains the same, the approach
represented by each of them is different. Each one is more sophisticated
than its predecessor (e.g. the Shared Database is a more complicated solution than
the File Transfer). When faced with an application integration task, the point is
not to use the same technique in all cases, but to be flexible and, based on the
criteria described above, choose the most suitable style for the given task. More
than one style can be used to achieve the best final result. As mentioned before,
Messaging is the style on which this thesis concentrates, but the other
styles will also be briefly described to give a wider view of the possibilities at hand.

2.9.1 File Transfer


Figure 2.1: File Transfer style
(source: Enterprise Integration Patterns [7])

The File Transfer (Figure 2.1) is the simplest integration style. The main
idea behind this technique is to use files as a data transfer mechanism between
applications. Because files exist on every operating system and almost every
programming language supports file operations, they are a very universal solution
for the purpose of information exchange. Also, as using files does not require any
additional integration tools and as they are already available, they might seem
to be an obvious solution, but they do have a lot of disadvantages.
In order to integrate applications using this technique,
an agreement on the format of the file used to exchange data is required. In most
cases two special components (Figure 2.1), usually designed by the integration
team, are created for that purpose:

• Export - component responsible for putting the data from Application A
into the file (according to the file format)
• Import - component responsible for reading and parsing the data from the
file (according to the file format) and inserting that data into Application B
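The two components can be sketched as a pair of functions sharing one agreed format. This Python sketch assumes CSV with a header row and three hypothetical columns; the actual format is whatever the integration team agrees on.

```python
import csv
import os
import tempfile

# The agreed file format: CSV with a header row and these columns (hypothetical).
FIELDS = ["account", "amount", "title"]

def export_records(records, path):
    """Export component: write Application A's records in the agreed format."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)

def import_records(path):
    """Import component: parse the file and return records for Application B."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [
            {"account": row["account"], "amount": int(row["amount"]), "title": row["title"]}
            for row in csv.DictReader(handle)
        ]

path = os.path.join(tempfile.mkdtemp(), "transfers.csv")
export_records(
    [{"account": "1234445321234321", "amount": 1000, "title": "My Transfer"}], path
)
imported = import_records(path)
```

Neither function knows anything about the other application; the only shared knowledge is the column list.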

Apart from the file format, other arrangements must be made. A naming
convention for the files has to be agreed, so that the file names remain unique (it
should be impossible for two separate files to have the same name). This is very
important in order to avoid name conflicts and situations in which a file with old
data would be processed.
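One possible naming convention combines the source application, a timestamp and a random suffix. The Python sketch below is only an example of such a convention; the function name and the `.csv` extension are assumptions.

```python
import uuid
from datetime import datetime, timezone

def export_file_name(source_app, now=None):
    """Build a name from the source application, a UTC timestamp and a random suffix."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # The random suffix keeps names distinct even within the same second.
    return f"{source_app}_{stamp}_{uuid.uuid4().hex}.csv"

first = export_file_name("appA")
second = export_file_name("appA")
```

The timestamp also lets the importing side process files in order and recognise old ones.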
Another important issue is to decide when the files will be written and read.
Creating and processing such a file too often will burden an application
unnecessarily. Usually some fixed time periods are set based on the business activity
cycles, e.g. files can be created on a daily or weekly basis. Based on those time
periods, the recipient application (Application B) checks if there is a new file
available to process. If the time period is too long, the applications could become
desynchronised: by the time the
data in a shared file is consumed by the second application, it could
have become stale, and processing it might lead to errors.
Figure 2.2: Shared Database style
(source: Enterprise Integration Patterns [7])

When a file is created and data is being written to it, a locking mechanism
must be used in order to make sure that the other application does not try to
access the same file. This issue is also very important and should be taken care
of in order to prevent errors while reading data from a file (e.g. an unexpected end
of file).
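Instead of an explicit lock, a frequently used alternative is to write the file under a temporary name and rename it once it is complete. The Python sketch below shows this write-then-rename technique, which is named here as a substitute for locking and is not the mechanism described in the text; the `.tmp` suffix is an assumed convention.

```python
import os
import tempfile

def publish_file(directory, name, text):
    """Write under a temporary name first, then rename, so readers never see a partial file."""
    final_path = os.path.join(directory, name)
    temp_path = final_path + ".tmp"
    with open(temp_path, "w", encoding="utf-8") as handle:
        handle.write(text)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace is atomic when source and target are on the same file system.
    os.replace(temp_path, final_path)
    return final_path

def ready_files(directory):
    """The importer only considers completed files, never '*.tmp' ones."""
    return sorted(f for f in os.listdir(directory) if not f.endswith(".tmp"))

directory = tempfile.mkdtemp()
publish_file(directory, "transfers.csv", "account,amount\n1234445321234321,1000\n")
visible = ready_files(directory)
```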
Applications using the File Transfer technique can be modified without affecting
each other, because the components responsible for the export and
the import are separated from the applications themselves. Also, because they only
need to access the file containing the exchanged data, no knowledge about the internal
processing performed in each of them is required (such as method return types,
method names, number and types of parameters passed to methods and so on).
One of the main disadvantages of this technique is the fact that data is
synchronised in batch mode. There might be situations when data processed by
Application B is no longer valid, but because synchronisation takes place
periodically rather than in real time, Application B is not aware of that fact until the
next synchronisation. This rules out this type of integration in certain
situations, e.g. checking the current bank account balance.

2.9.2 Shared Database


The Shared Database (Figure 2.2) is another integration style. It is a more
complex and sophisticated approach than the File Transfer. The idea of this
approach is to use a central data store that all applications share and can access
at any time.
This approach overcomes the main drawback of the File Transfer style, i.e. the
lack of timeliness (data is not available at the proper time because it
is exchanged in batch mode). As explained before, files with data are
created repeatedly at fixed intervals, so that solution is not applicable
in situations where data has to be propagated to all applications in real time.
The Shared Database style does not have this disadvantage.
Moreover, when each application changes shared data very frequently
(a couple of times per second), the usage of File Transfer would be very inefficient
and would lead to many problems. If the integrated applications use a shared
database, then even frequent data changes are not an issue, because data changed
by one system is instantly available to the others.
The database engine handles transactions in order to prevent
data inconsistencies. When an error in data appears, it is easier to
roll back a transaction than to return the application to the state prior to the
processing of a file in the File Transfer integration style. Moreover, small but frequent
changes make it much easier to fix possible errors, without losing too much of
the processed information, than in the case of one huge daily or weekly
update.
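The rollback behaviour can be illustrated with a small transfer between two accounts. This Python sketch uses an in-memory SQLite database; the table layout and the rule that a balance may not become negative are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE accounts ("
    "number TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
db.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 0)])
db.commit()

def transfer(connection, source, target, amount):
    """Move money between two accounts; either both updates happen or neither."""
    try:
        with connection:  # commit on success, rollback on any exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE number = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE number = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(db, "A", "B", 200)       # succeeds
failed = transfer(db, "A", "B", 1000)  # would overdraw A, so it is rolled back
balances = dict(db.execute("SELECT number, balance FROM accounts"))
```

In the failed transfer the credit to account B had already been applied when the debit was rejected, yet the rollback leaves both balances as they were after the first transfer.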
Almost every programming language provides the means to work with a
relational database using SQL queries. Also, there are a lot of tools available on the
market that make working with databases simple and effective. The above
arguments can be considered advantages of using a Shared Database, but
there are also disadvantages to this solution.
When using a single shared database, all of the integrated applications have
to produce data compatible with the database schema. In order to achieve that they
must be modified. Designing the database schema is, in this case, the most
difficult part of the whole integration. All participating departments, companies,
etc. have to agree on one common schema. This can be a very challenging task,
especially when each of the participants wants to keep some parts of their own
schema, the one they are used to.
After the schema is set, the existing data must be transformed without loss
of any information and loaded into the new database. Data transformation can
also be a very complex process, because in some cases the existing data has to be
heavily altered before it becomes compatible with the new schema. Moreover,
each of the participants has to have a very good knowledge of the model of
the data used by his/her system. As experience shows, this is not always the case,
because some systems might have been created many years ago, there might be
systems without documentation and so on.
To sum up, changes within applications can be made and they will not
affect the integration solution as long as the output data of the application is
compatible with the database schema.
Apart from the advantages that come from overcoming the drawbacks of the
previous approach, this integration style also has some disadvantages, which are
worth mentioning. The usage of one shared database by multiple systems can
cause serious performance issues or even lead to deadlocks. When two
or more applications try to modify the same data simultaneously, one application
places a lock on the accessed data, which prevents the other applications from
modifying it at the same time and forces them to wait; if two applications each
wait for data locked by the other, a deadlock occurs.
Moreover, having one database used by all the applications introduces a single
point of failure. If the database is not operating, then none of the applications
will be able to perform their work.

Figure 2.3: Remote Procedure Invocation style


(source: Enterprise Integration Patterns [7])

2.9.3 Remote Procedure Invocation


The Remote Procedure Invocation (Figure 2.3) represents a yet more sophisticated
approach to the integration problem than the two previous styles.
As in the case of the Shared Database, the increased level of sophistication
makes it possible to overcome the drawbacks of the two previous approaches.
The Remote Procedure Invocation is a mechanism that allows one application
to invoke a method in the context of another (remote) application. Along
with the invocation of the appropriate method, all of the required information
is passed on. The remote party returns the result after processing the
invocation [8]. The result might be as simple as a Boolean value indicating
whether the operation was successful, or as complex as a data structure
containing information about a customer.
In this approach, when one application needs data owned by another application,
it makes a direct call to that application. If data needs to be modified, this is
also done by a direct call to the application that owns the data. Each application
manages its own data. No data is duplicated in multiple systems, as it is in
the case of File Transfer. Also, there is no need to change the integrated applications
when the data managed by one of them changes. If there is a need to introduce
a new way of data processing, then a new remote method has to be implemented.
However, this approach requires the integration team to agree in advance on the
names of the methods that will be available for remote invocation, what kind
of data will be passed with an invocation, what information will be returned
and so on.
Before invoking a remote method, some knowledge about the other party is
required:

• number and types of arguments

• type of the result

• how error situations are handled: is an exception thrown or is some
negative value returned?

Figure 2.4: Messaging style

(source: Enterprise Integration Patterns [7])

There are many existing implementations of Remote Procedure Invocation,
for example CORBA, DCOM, .NET Remoting and Java RMI. As mentioned
in the earlier part of this thesis covering loose coupling, those implementations
use the same syntax and semantics as local method calls. This makes
it easier for developers to use those solutions, but the similarity can become
an issue if the differences between remote and local calls are not understood.
Without this understanding, the implemented solutions
can be slow and unreliable [10]. The Remote Procedure Invocation can also cause
another problem: it tightens the coupling between the
integrated applications. Although they do not share a common data storage
(as in the case of File Transfer or Shared Database), the methods called remotely
cannot be changed without affecting the integrated applications, i.e. the number
of parameters or the return type of a remote method cannot be changed
without modifying the systems calling that method.

2.9.4 Messaging
The Messaging (Figure 2.4) is considered the integration style that makes the
fewest assumptions about other parties and is hence the most promising
for the integration task. At the same time, it is the most
sophisticated technique that can be used to solve the integration problem. It can
be said that this approach combines the features of the previous styles. Just like
File Transfer, it allows the applications to be loosely coupled (sent messages
can be transformed in order to comply with the format expected by the receiver,
without the sender and the receiver being aware of the transformation itself),
but it is free of File Transfer’s weakness, i.e. a high frequency of changes does not
cause desynchronisation of the integrated applications or processing of stale data by
one of them.

Messaging enables quicker data exchange and collaboration between integrated
applications. In contrast to the Shared Database approach, it does not couple
applications to one database. The Shared Database also does not cope well
with very frequent data changes, especially if the data is shared between
applications placed in different locations, while Messaging is free of this problem.
The usage of Remote Procedure Invocation forces the integrator to make many
assumptions about applications and as a result couples them tightly. What is more, the
semantics and syntax of those invocations can be misleading, i.e. they can cause the
developer to think about remote invocations in the same way as he/she thinks
about local invocations. That way of thinking may lead to slow and ineffective
solutions. Messaging provides the means to transfer data quickly and efficiently
(as a large number of small data units), with the receiver application being notified
automatically when new data is waiting to be processed.
Messaging also provides a retry mechanism in order to assure the delivery of
the sent data. Applications integrated using this technique do not need to use
the same unified data structure and are not forced to make as many assumptions
about each other as in the case of the Remote Procedure Invocation. Messaging
also offers asynchronous data transfer, which means that the sender does not
have to wait for the results in order to continue its processing. It also does not
require both systems to be operational at the same time in order to pass data from
the sender to the receiver. More about the asynchronous method of communication can be
found in the earlier section called "Asynchronicity" (2.8).
Chapter 3

Messaging based systems

Messaging was one of the integration techniques briefly described
in the previous chapter. In the current chapter this description is broadened
and detailed. As mentioned before, messaging is the most sophisticated
integration style, which in exchange for higher complexity provides high decoupling,
asynchronous communication between integrated applications and other features
that make this solution the most flexible among all those described in the previous
chapter.
The previous chapter covered messaging in comparison to the other three
integration techniques. This chapter describes the concept of messaging,
key terms connected with this topic and the mechanism by which messaging
based solutions work. Before going deeper into the description of this technique,
an understanding of the basic messaging terms and concepts is needed: channel,
message, routing, transformation, endpoint, and synchronous and asynchronous
communication.

3.1 Message
In order to transmit the data, it first must be marshalled by the sender into a byte
form and then unmarshalled by the receiver so that the receiver has its own local
copy of it. During the transmission the data is wrapped into a Message (Figure 3.1).
Each Message forms an indivisible entity: it cannot be split into parts.
It is the data record that can be transmitted and read by the messaging
system. In order to communicate, the sender’s application must transform
the data to be transmitted into one or more messages and then send those
messages to the receiver. The receiver gathers these messages, extracts the data
from them, merges it if the data has been split into more than one Message,
and finally processes it. Messaging solutions guarantee delivery of the message
to the receiver (it can be repeatedly transmitted from the sender to the receiver
until the transmission succeeds).

Figure 3.1: Design Pattern Message

(source: Enterprise Integration Patterns [7])

A message is the smallest indivisible portion of data exchanged between
integrated applications. It consists of two parts:

• Header: contains information used by the messaging system to describe
the data being transmitted, information about the sender, receiver and so
on

• Body: contains the data being transmitted; usually this part of the
message is treated as a black box by the messaging system and sent between
the sender and the receiver as it is

Moreover, the message might contain a special, separate section called
Properties, which contains a list of key-value pairs defined by the sender of
the message.
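
The structure described above can be sketched as a small data type. This is only an illustration in Python, not the API of any particular messaging product: the class name, the field names and the helper new_message are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict
import uuid


@dataclass
class Message:
    """Illustrative message: a header for the messaging system plus an opaque body."""
    body: bytes                                                # payload, treated as a black box
    header: Dict[str, str] = field(default_factory=dict)       # system-level metadata
    properties: Dict[str, Any] = field(default_factory=dict)   # sender-defined key-value pairs


def new_message(sender: str, receiver: str, body: bytes, **properties: Any) -> Message:
    """Hypothetical helper that fills in the header fields mentioned in the text."""
    header = {
        "message_id": str(uuid.uuid4()),
        "sender": sender,
        "receiver": receiver,
    }
    return Message(body=body, header=header, properties=dict(properties))


msg = new_message("billing", "crm", b"<invoice id='17'/>", priority=5)
```

The messaging system would read only msg.header (and possibly msg.properties), while msg.body is passed through unchanged.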
The messaging system does not differentiate between the types of messages being sent.
The programmer, however, can choose among different types of messages,
depending on their purpose:

• Command Message: used to invoke a procedure on the receiver’s machine

• Document Message: used to pass a set of data to the receiver’s machine

• Event Message: used to notify the receiver about some event that has
occurred on the sender’s machine

• Request-Reply Message: used to send a message that requires
a response from the receiver

• Message Sequence: used to send data using multiple messages

The concept of sending a stream of data divided into discrete parts is not
only used in messaging systems. It is also applied in the network protocols,
where data is grouped into discrete units of data, i.e. datagrams/packets in case
of the Internet Protocol (IP) and segments in case of the Transmission Control
Protocol (TCP).
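
The idea of a Message Sequence, i.e. splitting data into several messages and merging them on the receiver’s side, can be sketched as follows. This is a minimal sketch in Python; the field names sequence_id, position and size are assumptions made for this example and are not taken from any specific messaging system.

```python
from typing import Dict, List


def split_into_sequence(data: bytes, chunk_size: int, sequence_id: str) -> List[Dict]:
    """Sender side: cut the data into chunks and wrap each chunk in a message."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    return [
        {"sequence_id": sequence_id, "position": pos, "size": len(chunks), "body": chunk}
        for pos, chunk in enumerate(chunks)
    ]


def reassemble(messages: List[Dict]) -> bytes:
    """Receiver side: order the messages by position and merge their bodies."""
    ordered = sorted(messages, key=lambda m: m["position"])
    if len(ordered) != ordered[0]["size"]:
        raise ValueError("sequence is incomplete")
    return b"".join(m["body"] for m in ordered)


parts = split_into_sequence(b"abcdefghij", chunk_size=4, sequence_id="s1")
# Messages may arrive out of order; the position field restores the order.
merged = reassemble(list(reversed(parts)))
```

Each message carries its position and the total size of the sequence, so the receiver can detect a missing part before it starts processing.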

3.2 Message Implementations


The concept of a Message is used in different implementations of integration
solutions: Java JMS, .NET Messaging, SOAP. Here is some brief information
about each of those solutions:

1. Java JMS

• Message is represented by the interface Message.
• Message consists of header, properties and body.
• There are different types of messages depending on the message body;
the header remains the same for all types:
– Text Message
Message body contains a String object, which might be the content
of a text or XML file or just a text. This is the most common
type of message. To get the message content the method
textMessage.getText() is provided.
– Bytes Message
Message body contains a simple array of bytes. This is the simplest
and the most universal message type. To get the message
content the method bytesMessage.readBytes(array) is provided;
it copies the content of the message into the array of bytes passed as
an argument.
– Object Message
Message body contains a Java object that implements the
java.io.Serializable interface, so that it can be marshalled and
unmarshalled. The method objectMessage.getObject() returns the
serializable object containing the message’s data.
– Stream Message
Message body contains a stream of Java primitives. In order to
read data from the message body, methods such as readBoolean(),
readChar(), readDouble(), etc. are provided.
– Map Message
Message body contains a list of key-value pairs, just like
java.util.Map, with String objects used as keys. To get the value of
some key the method getTYPE(KEY) is provided, where TYPE is the
value’s type and KEY is the name of the key,
e.g. getBoolean("isEnabled"), getInt("numberOfItems").

2. .NET Messaging

• Message is represented by the class Message.
• The Message class has the following properties:
– Body: contains an Object that represents the message content
– BodyStream: stores the content of the message as a stream
– BodyType: specifies the type of data being sent in the message
body (string, date, number, currency, etc.)

3. SOAP

• Message is represented as an XML document; it contains an optional
header and a required body.
• An XML document is an atomic data record that can be transmitted.
• SOAP messages can be transmitted using messaging systems (in that
case the message contains the SOAP message as its body).

Figure 3.2: Design Pattern Channel

(source: Enterprise Integration Patterns [7])

3.3 Message Channel


The Message Channel (Figure 3.2) is used to transmit Messages between applications.
It is a logical address of the message destination in a messaging system.
It can be imagined as a pipe connecting the sender and the receiver, inside which
messages flow.
Channels must be defined by the integration team and added to the messaging
system. A newly deployed messaging system does not contain any channels.
They must be created so that applications can communicate using them.
Message channels have a very useful property: if the receiver is currently
not available, a message channel stores messages until the receiver is up
again and fetches them from the message channel. This feature eliminates the
need for both the sender and the receiver system to be on-line in order to communicate.
The sending application knows what kind of information it is sending and, based on
this knowledge, it also knows which channel this information should be sent to.
It does not have to know which application needs this information. It is sufficient
to know that placing the information in a given channel will assure that it is
delivered to the application that needs it.
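
The store-and-forward behaviour described above can be illustrated with a minimal in-memory sketch in Python. The MessageChannel class and the channel name new_orders are hypothetical; a real messaging system would additionally persist the pending messages.

```python
from collections import deque


class MessageChannel:
    """Hypothetical in-memory channel that keeps messages until they are fetched."""

    def __init__(self, name):
        self.name = name
        self._pending = deque()

    def send(self, message):
        # The sender only needs the channel; it never addresses the receiver.
        self._pending.append(message)

    def receive(self):
        # Returns the oldest pending message, or None if the channel is empty.
        return self._pending.popleft() if self._pending else None


new_orders = MessageChannel("new_orders")
new_orders.send("order 1")      # the receiver is offline at this point
new_orders.send("order 2")

# Later the receiver comes back online and drains the channel.
fetched = []
while (message := new_orders.receive()) is not None:
    fetched.append(message)
```

The sender finishes its work before the receiver even starts, which is exactly the decoupling in time that the channel provides.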
The way the channels are implemented varies among different products, but
in order to simplify the process of integrating applications, every messaging system
provides an Application Programming Interface (API). The API contains
methods for sending and receiving messages, which hide the details of the
communication with the messaging system. The application does not have to know
how the connection to the messaging system is set up, how it is reinitialized in
case of a communication error, how the message is converted into a stream
of bytes and so on.

Figure 3.3: Design Pattern Router

(source: Enterprise Integration Patterns [7])

There are two types of channels:

• Point-to-Point Channel: directly connects two applications. Data sent
through this channel by the sender is available only to the receiver.
Once the receiver fetches the message from the channel, it is deleted
and no longer available.

• Publish-Subscribe Channel: allows the sender to send messages through
the channel to more than one receiver (subscriber). The sender sends data into
the channel; then, independently of the sender, each of the subscribers
either periodically checks whether there are any new messages pending in the
channel or is notified about new messages automatically by the messaging
system. Each subscriber then fetches them from the channel.
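
The difference between the two channel types can be sketched as follows. This is a simplified in-memory illustration in Python; both class names and their methods are assumptions made for this example, and subscribers here poll the channel rather than being notified.

```python
from collections import deque


class PointToPointChannel:
    """A message can be fetched once; afterwards it is gone from the channel."""

    def __init__(self):
        self._queue = deque()

    def send(self, message):
        self._queue.append(message)

    def receive(self):
        return self._queue.popleft() if self._queue else None


class PublishSubscribeChannel:
    """Every subscriber gets its own copy of each message sent after it subscribed."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        self._subscribers[name] = deque()

    def send(self, message):
        for queue in self._subscribers.values():
            queue.append(message)

    def receive(self, name):
        queue = self._subscribers[name]
        return queue.popleft() if queue else None


p2p = PointToPointChannel()
p2p.send("invoice 1")
first_read = p2p.receive()     # the message is delivered
second_read = p2p.receive()    # nothing left to deliver

topic = PublishSubscribeChannel()
topic.subscribe("billing")
topic.subscribe("shipping")
topic.send("order 1")
```

After this run, both the billing and the shipping subscriber can fetch "order 1" independently of each other.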

The above division is based on the way the messages are distributed
from the sender to the receiver. Another division is based on the purpose of the
message channel:

• Datatype Channels: used to avoid confusion when different datatypes
are mixed within the same channel. Each type of data being sent is
assigned its own Datatype Channel.

• Invalid Message Channel: used to send error messages to the sender
and to provide feedback from the receiver in case of data errors or other
failures.

• Dead Letter Channel (Dead Message Queue): used by the messaging
system when the message cannot be delivered to the receiver.
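
As an illustration of the Invalid Message Channel, the sketch below shows a receiver that moves messages it cannot interpret to a separate channel instead of silently dropping them. It is a minimal Python sketch; the channel names, the JSON message format and the required fields are assumptions made for this example.

```python
import json
from collections import deque

# Hypothetical channels modelled as in-memory queues.
orders_channel = deque(['{"item": "book", "qty": 2}', "not json", '{"item": "pen"}'])
invalid_message_channel = deque()
processed = []


def consume(raw):
    """Receiver: process a valid order or forward the raw message as invalid."""
    try:
        order = json.loads(raw)
        entry = (order["item"], order["qty"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON or a missing field: report it via the invalid channel.
        invalid_message_channel.append(raw)
        return
    processed.append(entry)


while orders_channel:
    consume(orders_channel.popleft())
```

The messages collected in invalid_message_channel can later be inspected by the sender or by an administrator.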

3.4 Message Routing


In the simplest case of an integration solution, systems are connected directly through
Message Channels. Those connections are straight links from the sender to the
receiver. A Message Channel decouples the sender from the receiver of the message.
Because of this it is possible to have more than one application sending
messages using the same Message Channel. Quite often it is necessary to perform
some processing of the sent message before it is directed to its destination.
Messages sent by a single sender may require different processing while being
sent through the Message Channel. Different processing can be required depending
on the message origin, business rules, message type or some other criteria.
In order to assure this, each Filter component connected to the channel has to
know those rules. However, if the rules change, then all of the components connected
to the Message Channel also have to be changed so that they have the updated
rules. This would make any change to an existing solution very time consuming
and inefficient. Very often the components that
would be used to determine the further processing of the message cannot be
changed, because it would be too expensive, too time consuming or even impossible.
Moreover, in order to determine the further processing of the message
(e.g. to decide, using business rules based on the message content, whether the
message is destined for this component or not) the component has to fetch the message
from the Message Channel. But after the message has been consumed, it cannot
simply be put back into the channel as it was before, because the messaging
system does not allow that.
In order to solve the problem of redirecting the message depending on a set
of conditions, without involving all components participating in the message
processing, a new type of component has been introduced into messaging solutions.
This component is called a Message Router (Figure 3.3). The role of a router
is to decide where a particular message should be delivered, based on a set of
defined business rules.
Other components using the messaging system are not aware of the router’s
existence, because it does not change the message content; it only redirects messages to
the proper channel. If the need to change the decision rules arises, then only
the router component has to be changed; other components remain unchanged.
A router is a single point where the decision concerning the further path of the
message is made. Therefore, in case of heavy traffic the routing component
might become a system bottleneck, but the likelihood of such a situation can
be significantly decreased by using several parallel routing components or by
improving the hardware used to run the system.
The Message Router needs to know the full list of possible message recipients
along with the rules that govern the routing process. The alternative solution, which
can be used when the list of final recipients changes frequently, is to let each
of the recipients decide whether to fetch the message from the queue or not.
This alternative solution can be built by using Publish-Subscribe channels and
Message Filters. It is called reactive filtering, while using a routing component is
called proactive routing.
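
Reactive filtering can be sketched as follows: the channel broadcasts every message, and each recipient applies its own Message Filter. This is a minimal Python sketch; the class, the function message_filter and the message fields are assumptions made for this example.

```python
class PublishSubscribeChannel:
    """Hypothetical channel that hands every published message to all subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


def message_filter(predicate, inbox):
    """Build a filter that keeps only the messages the recipient is interested in."""
    def on_message(message):
        if predicate(message):
            inbox.append(message)
    return on_message


billing_inbox, shipping_inbox = [], []
channel = PublishSubscribeChannel()
channel.subscribe(message_filter(lambda m: m["type"] == "invoice", billing_inbox))
channel.subscribe(message_filter(lambda m: m["type"] == "order", shipping_inbox))

channel.publish({"type": "order", "id": 1})
channel.publish({"type": "invoice", "id": 2})
```

Adding a new recipient only requires a new subscription with its own filter; no central list of recipients has to be updated.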
There are a few possible variants of a Message Router that can be used in
an integration solution:

• Fixed Router
This is the simplest variant. The router has one input and
one output channel defined. It does not perform routing as such, but is
used to decouple systems or pass messages between different integration
solutions. Most often this type of router is combined with a Message
Translator or a Message Adapter in order to pass the message between
different integration solutions or different types of message channels.
• Content-Based Router
This type of router uses the properties of the message, for example
the type of the message or the values of specified message fields, in order
to determine the message destination. It is the most commonly used router
type.
• Context-Based Router
This type of router uses information about the surrounding environment
to determine the message destination. Such routers can be used to perform
load balancing or to change the message destination if the original recipient
is not responding. Context-Based Routers can be used to increase the
flexibility and reliability of the system in case of unexpected errors.

Figure 3.4: Design Pattern Translator

(source: Enterprise Integration Patterns [7])
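
A Content-Based Router can be sketched as a list of rules, each pairing a condition on the message with an output channel. This is a minimal Python sketch under the assumption that channels are plain lists and messages are dictionaries; the class name and the "type" field are hypothetical.

```python
class ContentBasedRouter:
    """Forwards each message to the first channel whose rule matches its content."""

    def __init__(self, rules, default_channel):
        self._rules = rules                 # list of (predicate, channel) pairs
        self._default = default_channel     # used when no rule matches

    def route(self, message):
        for predicate, channel in self._rules:
            if predicate(message):
                channel.append(message)     # the message itself is not modified
                return
        self._default.append(message)


invoices, orders, unrouted = [], [], []
router = ContentBasedRouter(
    rules=[
        (lambda m: m["type"] == "invoice", invoices),
        (lambda m: m["type"] == "order", orders),
    ],
    default_channel=unrouted,
)

for message in [{"type": "order", "id": 1},
                {"type": "invoice", "id": 2},
                {"type": "refund", "id": 3}]:
    router.route(message)
```

When the business rules change, only the rules list passed to the router has to be updated; the senders and receivers stay untouched.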

Routers can also be divided into two other groups: stateless and stateful.
A stateless router only considers the message that
it has just received and makes the routing decision based on this single,
current message. A stateful router, on the other hand, also takes previous
messages into account when determining the destination of an incoming message.
This feature might be used to remove duplicated messages, for example.
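
The duplicate removal mentioned above can be sketched with a router that remembers the identifiers of the messages it has already forwarded. This is a minimal Python sketch; the class name and the message_id field are assumptions made for this example.

```python
class DeduplicatingRouter:
    """Stateful router: drops a message if its identifier has been seen before."""

    def __init__(self, output_channel):
        self._output = output_channel
        self._seen_ids = set()      # state carried over from previous messages

    def route(self, message):
        if message["message_id"] in self._seen_ids:
            return False            # duplicate: not forwarded
        self._seen_ids.add(message["message_id"])
        self._output.append(message)
        return True


delivered = []
router = DeduplicatingRouter(delivered)
outcomes = [
    router.route({"message_id": "a1", "body": "order 1"}),
    router.route({"message_id": "a2", "body": "order 2"}),
    router.route({"message_id": "a1", "body": "order 1"}),   # retransmission
]
```

A stateless router could not make this decision, because it would have no record of the first message with identifier "a1".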

3.5 Message Transformation


The concepts covered so far in this chapter concerned the way the message is
sent from one system to another. What has been omitted is the fact that in order
to communicate, the content of the message has to be properly understood by
the receiver. Message data has to be properly interpreted and used to perform
the necessary operations. To ensure that, it must be delivered to the receiver’s
application in the correct format. In the ideal solution both sender and receiver
would use the same data format, but this situation is very rare. In most cases
both sender and receiver would have to be modified so that they use the
same data format.
This approach raises many problems. Which data format should be
used, the one used by the sender or the one used by the receiver? What kind of
selection criteria should be used to make that choice? Moreover, making internal
changes to the integrated applications can be very difficult or in some cases even
impossible. It may also cause changes to the internal business logic of
the application, which is undesirable, because integrated applications
should be affected by the integration process as little as possible. Making
such changes would also contradict the idea of loose coupling described earlier (1.1).
After implementing that kind of change in both applications, they would not be
loosely coupled anymore. A change of the data format in one of them would have
to be reflected immediately in the other one, otherwise the integration solution
would not work as intended.
The simplest way to ensure that the data format of the arriving message
corresponds to the internal data format of the receiver’s application is to use
a separate component, which changes the message body to the appropriate
format. This component is called a Message Translator or a Message
Transformer (Figure 3.4). The usage of this component makes it possible to preserve
the loose coupling between applications. In the case of a change of the internal data format
within any of the integrated applications, only changes in the component
performing the transformation are necessary; the applications remain unaffected.
This way they do not depend on each other, and changes made in one of them
do not force changes in the others.
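
A Message Translator can be sketched as a function placed between the two applications that maps the sender’s record format to the receiver’s one. This is a minimal Python sketch; both record layouts and the excerpt of state codes are hypothetical and serve only as an illustration.

```python
# Hypothetical lookup table (excerpt) used to replace state names with codes.
US_STATE_CODES = {"California": "CA", "New York": "NY"}


def translate_customer(sender_record):
    """Convert a record in the sender's format into the receiver's format."""
    return {
        # Two name fields are concatenated into a single field.
        "name": f'{sender_record["first_name"]} {sender_record["last_name"]}',
        # The numeric ZIP code becomes a zero-padded five-character string.
        "zip": f'{sender_record["zip"]:05d}',
        # The full state name is replaced with its two-character code.
        "state": US_STATE_CODES[sender_record["state"]],
    }


translated = translate_customer(
    {"first_name": "Ada", "last_name": "Lovelace", "zip": 2134, "state": "New York"}
)
```

If the sender later renames one of its fields, only translate_customer has to be adjusted; the receiver keeps working with the same format.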
The transformation process itself can take place on many different levels of
data representation. It may refer to the names of the data fields, the data
representation in those fields, the data structure as a whole (different ways of representing
the data) and so on. Hohpe and Woolf [7] divide data transformation into
levels and organise them in a form similar to the ISO/OSI model.
This division is presented in Table 3.1.
As presented in Table 3.1, the levels of transformation are
divided into four layers:

• Transport
Transformations performed in the scope of the communication protocols
(Transport layer) enable data transfer between systems using different
communication protocols and ensure reliable message transfer between
those systems.

• Data Representation
The Data Representation layer performs the transformation concerning the
representation of the data. Transformation within the Transport layer operates
on the stream of bytes, while the Data Representation transformation
operates on the data representation (e.g. it changes the XML representation
into a name-value representation).

• Data Types
The Data Types layer performs the conversion of the data contained in
the message. The conversions include changing field names, changing the data
types of those fields, combining data from multiple fields into one or splitting
data from one field into two and so on. The goal of this transformation is
to make the data comply with the data model of the receiver’s application.

Layer             Deals With                  Transformation Needs        Tools/Techniques
                                              (Example)
----------------  --------------------------  --------------------------  -----------------------
Data Structures   Entities, associations,     Condense many-to-many       Structural mapping
(Application      cardinality                 relationship into           patterns, custom code
Layer)                                        aggregation

Data Types        Field names, data types,    Convert ZIP code from       EAI visual
                  value domains,              numeric to string.          transformation editors,
                  constraints, code values    Concatenate First Name      XSL, database lookups,
                                              and Last Name fields to     custom code
                                              single Name field.
                                              Replace U.S. state name
                                              with two-character code.

Data              Data formats (XML,          Parse data representation   XML parsers, EAI
Representation    name-value pairs,           and render in a different   parser/renderer tools,
                  fixed-length data fields,   format.                     custom APIs
                  EAI vendor formats, etc.),  Decrypt/encrypt as
                  Character sets (ASCII,      necessary.
                  UniCode, EBCDIC),
                  Encryption/compression

Transport         Communications protocols:   Move data across protocols  Channel Adapter, EAI
                  TCP/IP sockets, HTTP,       without affecting message   adapters
                  SOAP, JMS, TIBCO            content.
                  RendezVous

Table 3.1: Levels of data transformation

• Data Structures (Application Layer)
Finally, transformations within the Application Layer concern the entities
used in the application data model and the relations between
them (e.g. Can a customer have multiple bank accounts? Can a bank account
have multiple owners?).

As each transformation can be performed in a separate component, the
transformations themselves can be chained. Chaining transformations makes it
possible to perform complex changes to the transmitted data on different levels. It also enables
the designers of an integration solution to combine different transformations into
transformation chains to achieve the desired effect in the most efficient way.
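
Chaining can be sketched as function composition, where each step works on one level of the data. The Python sketch below is only an illustration: the three steps, the XML layout and the field names are assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET
from functools import reduce


def parse_xml(body):
    """Data Representation level: XML document to name-value pairs."""
    return {child.tag: child.text for child in ET.fromstring(body)}


def merge_names(record):
    """Data Types level: combine two fields into a single name field."""
    out = dict(record)
    out["name"] = f'{out.pop("firstName")} {out.pop("lastName")}'
    return out


def render_json(record):
    """Data Representation level: name-value pairs to a JSON document."""
    return json.dumps(record, sort_keys=True)


def chain(*steps):
    """Compose transformation steps; the output of one step feeds the next."""
    return lambda message: reduce(lambda value, step: step(value), steps, message)


pipeline = chain(parse_xml, merge_names, render_json)
result = pipeline(
    "<customer><firstName>Ada</firstName><lastName>Lovelace</lastName>"
    "<city>London</city></customer>"
)
```

Because every step is independent, a step can be replaced or reordered without touching the others.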

Figure 3.5: Design Pattern Endpoint


(source: Enterprise Integration Patterns [7])

3.6 Message Endpoints


Up to this point the main messaging concepts concerning message transportation
have been covered. The topics described so far
include the way the data is
sent between the applications using the messaging system, what the basic
units of exchanged data are, how the messaging system copes with sending them
to the correct destination system, and finally, how the data is transformed
so that it fits the internal data model of the destination system.
The messaging system is in most cases a separate application responsible for
dealing with all the issues mentioned above, but not only with them. The systems
that are supposed to be integrated using the messaging application are also
separate business entities. What must be taken into account is that integrated
applications are, in most cases, complex systems, often several years old or more, and
usually older than the messaging system itself. Quite often, they do not have
the means to communicate with the messaging system: to send messages to
the message channels, to fetch incoming messages, to create messages and so
on. They also cannot be modified to gain those abilities, because the
modification might be too expensive, or impossible for technical or other
reasons. Thus, in order to make the messaging solution work, one more component
is needed.
The component that connects the application with the messaging system is
called a Message Endpoint (Figure 3.5). It is a custom-made component that
performs all the operations mentioned above and enables the integrated application
to cooperate effectively with the messaging system. It is responsible for creating
messages, sending them to the message channels, fetching incoming messages,
wrapping data from the application into the appropriate message format, and
unwrapping incoming messages to extract the data and pass it to the application.
The Message Endpoint hides the messaging system’s Application Programming
Interface (API) from the integrated applications, which are not aware
of all the operations being performed when they raise a request to send data to
another application. Because the endpoint is written to work with a particular
application and messaging system, it has to be rewritten if it is to cooperate
with a different application or messaging system than the one it has been designed
for. One instance of the Endpoint can either send messages to or receive messages from the
channel; it cannot do both. This implies that an application
can have several Message Endpoints attached to it.
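
The role of the endpoint can be sketched with one sending and one receiving endpoint that hide the channel and the message format from the applications. This is a minimal Python sketch; the class names, the JSON wire format and the in-memory channel are assumptions made for this example.

```python
import json
from collections import deque


class SendingEndpoint:
    """Wraps application data into a message and puts it on the channel."""

    def __init__(self, channel):
        self._channel = channel

    def send_order(self, item, quantity):
        payload = json.dumps({"item": item, "quantity": quantity})
        self._channel.append(payload.encode("utf-8"))


class ReceivingEndpoint:
    """Fetches a message, unwraps it and passes the data to the application."""

    def __init__(self, channel, handler):
        self._channel = channel
        self._handler = handler

    def poll(self):
        if not self._channel:
            return False                      # nothing to fetch
        data = json.loads(self._channel.popleft().decode("utf-8"))
        self._handler(data["item"], data["quantity"])
        return True


channel = deque()                             # stands in for a Message Channel
handled = []
sender = SendingEndpoint(channel)
receiver = ReceivingEndpoint(channel, lambda item, qty: handled.append((item, qty)))

sender.send_order("book", 2)                  # application A: a plain method call
first_poll = receiver.poll()                  # application B: receives plain values
second_poll = receiver.poll()
```

Neither application sees bytes, JSON or the channel itself; replacing the messaging system would only require new endpoint classes.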

Figure 3.6: Overview of a communication based on message design patterns


(source: Enterprise Integration Patterns [7])

Figure 3.6 summarises all of the terms explained so far. If the application on
the left side (Application A) wants to send a Message to the application on the
right side of the picture (Application B), the steps of that communication
look as follows:

1. Application A sends a Message using its Message Endpoint.

2. The Message is placed in the Message Channel.

3. The Message is directed to the Router, which decides where the message
should be delivered (let us assume that the message is delivered as
presented in the picture).

4. The Message is directed to the Translator, which changes the format of the
message’s content.

5. Finally, the Message is delivered to Application B, which fetches it using its
Message Endpoint.
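
The five steps above can be traced in a compact sketch that wires the components together. This is only an in-memory illustration in Python; the channel names, the message layout and the conversion performed by the translator are assumptions made for this example.

```python
from collections import deque

# Hypothetical channels, modelled as in-memory queues.
outbound = deque()     # channel Application A sends to
orders_in = deque()    # channel leading to the Translator and Application B
audit_in = deque()     # another destination the Router could choose


def endpoint_a_send(order_id, amount_cents):
    """Steps 1 and 2: Application A's endpoint wraps the data and sends it."""
    outbound.append({"type": "order",
                     "body": {"id": order_id, "amount_cents": amount_cents}})


def router_step():
    """Step 3: the Router picks the next channel without changing the message."""
    message = outbound.popleft()
    target = orders_in if message["type"] == "order" else audit_in
    target.append(message)


def translator_step():
    """Step 4: the Translator converts the body to Application B's format."""
    message = orders_in.popleft()
    body = message["body"]
    return {"type": message["type"],
            "body": {"orderId": str(body["id"]), "amount": body["amount_cents"] / 100}}


def endpoint_b_receive(message, received):
    """Step 5: Application B's endpoint unwraps the message for the application."""
    received.append((message["body"]["orderId"], message["body"]["amount"]))


received_by_b = []
endpoint_a_send(17, 1250)
router_step()
endpoint_b_receive(translator_step(), received_by_b)
```

Application A works with integer cents and numeric identifiers, Application B with string identifiers and decimal amounts, and neither knows about the other’s format.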

3.7 Synchronous and asynchronous communication


Integration solutions can operate in two ways. The term
"ways" is understood here as the overall mechanism by
which the system operates. There are two such mechanisms, synchronous
and asynchronous: two ways in which integrated systems can communicate
with each other using the messaging system. Those concepts have already been
mentioned earlier in the part covering the integration criteria (2.8); here they
are examined in more detail.
The reliability of communication is a very important issue in both the
synchronous and the asynchronous approach. Some communication errors are beyond
the scope of the messaging solution (e.g. physical media error, hardware failure, etc.),
but they might prevent the messaging system from delivering the message to the
receiver. Of course, the sender can resend the message after some period of time to
increase the probability that the receiver will finally receive it, but in the case
of some types of errors this will not assure a successful delivery. In the synchronous
approach this drawback can also lead to a loss of application performance and
processing speed, because an
application has to wait until it receives a response instead of performing its usual
activities. In the case of one application accepting requests from several clients,
those drops in performance and processing speed can become even greater and
lead to deadlocks.
The main advantage of the synchronous approach is its simplicity. The application
does not have to use additional resources to monitor the whole processing,
as is the case in the asynchronous approach.
This simplified approach is sufficient for systems that do not send
sophisticated requests that would require a lot of processing from the message
receiver. A short processing time of requests reduces the time the sender
application has to wait until it receives an answer and, as a result, does not affect
the overall performance of the senders in a noticeable way.
Asynchronous communication, in contrast to the synchronous one, is a much
more complicated concept. The main idea behind it is that the application, after
sending the request, does not wait for a response but continues its processing.
It requires a different approach to design and implementation. An application using
an asynchronous model cannot be designed as a sequence of method invocations.
It has to be designed in such a way that the remote functionality is invoked
without affecting the main application flow. This enforces a different kind
of application design than in the case of synchronous communication. A possible
scenario might look as follows:

1. The sender application (Application A) sends a message (containing a message
identifier) designated to the receiver application (Application B).

2. If the receipt of the message has been confirmed by the messaging system,
Application A stores the information about the sent message, along with the
identifier assigned to it, in the database and switches to other operations.

3. Application B is notified by the messaging system about a new message and
fetches it.

4. Application B, after processing the received message, sends a new message
with the response to the request of Application A (the response includes the
message identifier sent in the request).

5. Application A receives a new message with the response to its previously
sent request.

6. Application A looks up in the database the request identified by the
message identifier contained in the received message.

7. Application A, after gathering all required information, starts its
processing.
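
The scenario above can be illustrated with a minimal Python sketch. It is not
taken from the integration platform described in this thesis: the two
in-memory queues stand in for the messaging system, the dictionary `pending`
stands in for the database of Application A, and all function names are
assumptions made for the example.

```python
import queue
import uuid

# In-memory stand-ins for the channels of the messaging system (assumption).
requests = queue.Queue()   # Application A -> Application B
replies = queue.Queue()    # Application B -> Application A

# Stand-in for the database in which Application A keeps sent requests.
pending = {}


def send_request(payload):
    """Steps 1-2: send a message with an identifier and remember it."""
    message_id = str(uuid.uuid4())
    requests.put({"id": message_id, "body": payload})
    pending[message_id] = payload  # stored; A is now free to do other work
    return message_id


def application_b_step():
    """Steps 3-4: fetch one request, process it and send the response."""
    request = requests.get()
    result = request["body"].upper()  # placeholder for the real processing
    replies.put({"correlation_id": request["id"], "body": result})


def application_a_step():
    """Steps 5-7: receive a response and match it with the stored request."""
    reply = replies.get()
    original = pending.pop(reply["correlation_id"])
    return original, reply["body"]


first = send_request("order 17")
second = send_request("order 18")  # A keeps working without waiting
application_b_step()
application_b_step()
print(application_a_step())  # ('order 17', 'ORDER 17')
```

The important detail is the identifier copied from the request into the
response: it is the only link between the two messages, since Application A
did not block while waiting for the answer.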


This approach, despite being more difficult to design and implement, has a lot
of advantages over the synchronous one, as stated in the section
“Asynchronicity” of the chapter “Integration styles” (2.8). To name just a
few: reliability, efficiency, resistance to communication errors and
resistance to high loads.
Both of the presented approaches have their pros and cons. In some cases the
synchronous solution is more advisable: when processing speed is a priority
and the communication model is simple, with simple requests that can be
processed quickly. In other cases, when application reliability, error
resistance or a complex communication model has priority, the asynchronous
model is superior. Although the asynchronous approach requires a quite
different architecture of the application and consumes more time and
resources, it might prove to be more applicable and result in a higher level of
reliability and flexibility of the final solution.
Chapter 4

Design patterns in the application integration

Design pattern is a term that has been adopted from architecture into software
engineering [1]; it describes a well-known method of solving a commonly
occurring problem. In computer science a design pattern should not be perceived
as a ready-to-use solution, such as source code, but only as a template that can
be used in multiple situations. In this chapter the usage of design patterns
in application integration will be discussed and the term design pattern itself
will be explained in more detail.
Designing and implementing an application can be a very complex task.
Applications may vary in the technologies used, working environment, performed
tasks, complexity, and so on, but very often designers encounter the same
problems. Design patterns are a set of proven ways of solving those problems.
They do not contain a ready solution that can be put into an application;
rather, they are templates that can be used to solve the problem [4].
Design patterns also occur in messaging systems and application integration
solutions. Some of the design patterns described in this section derive from
the basic concepts of the messaging systems described earlier (3). Each of them
is an answer to a particular problem, for example: how to connect two
applications within the messaging system? By using the Message Channel.
As said before, design patterns are only general templates, not ready-to-use
solutions. The Message Channel pattern, for example, can be implemented
in different ways: as a Datatype Channel, a Point-to-Point Channel, a
Publish-Subscribe Channel and so on. Each of those channels performs a
different task, but all of them are based on the same Message Channel template.
As mentioned before, the Message Channel pattern can be applied in various
ways. The simplest usage of this pattern is the Point-to-Point Channel, which
connects two different systems directly. If the aim is to deliver the message
to more than one receiver at the same time, the Publish-Subscribe Channel can
be used. This channel has one input channel (Publisher) and many output
channels (Message Endpoints). Each output channel is connected to one system
(subscriber). When a message is published, it is replicated and sent to each
of the output channels, where it can be consumed by the respective subscriber.
Each copy can be consumed only once, so after being consumed by every
subscriber the message is removed from the channel and will not be processed
again.
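
The difference between the two channel types can be sketched in a few lines of
Python. This is only an illustration under simplifying assumptions: the
channels live in the memory of a single process, and the class and method
names are invented for the example rather than taken from any messaging
product.

```python
from collections import deque


class PointToPointChannel:
    """Each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Removing the message guarantees that it is not processed again.
        return self._messages.popleft() if self._messages else None


class PublishSubscribeChannel:
    """One input and one output channel per subscriber."""

    def __init__(self):
        self._outputs = {}

    def subscribe(self, name):
        self._outputs[name] = PointToPointChannel()
        return self._outputs[name]

    def publish(self, message):
        # The message is replicated to every output channel.
        for output in self._outputs.values():
            output.send(message)


news = PublishSubscribeChannel()
billing = news.subscribe("billing")
shipping = news.subscribe("shipping")
news.publish("order created")

print(billing.receive())   # order created
print(shipping.receive())  # order created
print(billing.receive())   # None: each copy is consumed only once
```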
Another pattern that can be used is the Message Router. The concept of the
Message Router has already been described, as it is one of the basic
concepts connected with systems integration (3.4). The Message Router pattern
is a response to the problem of delivering the message to the correct recipient.
As in the case of the Message Channel, different variants of Message Routers
can be built on the basis of one pattern. Those variants are as follows:

• Content-Based Router
The Content-Based Router (Figure 4.1) reads the message content and, based
on it and on the encoded routing rules, directs the message to the proper
recipient.

Figure 4.1: Design Pattern Content-Based Router


(source: Enterprise Integration Patterns [7])

• Message Filter
The Message Filter (Figure 4.2) works in a similar way to the Content-Based
Router. It reads the message content and checks if it matches the encoded
criteria. If it does, the message is sent further; if not, the message is
discarded.

Figure 4.2: Design Pattern Message Filter


(source: Enterprise Integration Patterns [7])

• Dynamic Router
The Dynamic Router (Figure 4.3) is a more flexible variant of the Message
Router. It allows the routing rules to be modified by sending control
messages to a designated port of the router. This makes it more flexible than
a router with fixed routing rules, because the rules can be changed
dynamically when such a need arises. It can be useful when a new system is
being connected to the messaging solution and all routers in the system have
to be updated with new routing rules.

Figure 4.3: Design Pattern Dynamic Router


(source: Enterprise Integration Patterns [7])

• Recipients List
The Recipients List (Figure 4.4) extends the functionality of the
Content-Based Router. Its effect is similar to that of the Publish-Subscribe
Channel: it inspects the incoming message, determines the list of recipients
based on the message content and then forwards the message to those
recipients. The list of recipients may therefore vary from message to message,
and it can also be specified dynamically.

Figure 4.4: Design Pattern Recipients List


(source: Enterprise Integration Patterns [7])

• Splitter
The Splitter (Figure 4.5) is used when an incoming message contains multiple
elements which cannot all be processed in the same way. In that case the
message is split into separate elements and each of them is sent
independently to an appropriate system to be processed. The Splitter produces
one message for each element contained in the incoming message (e.g. if
an incoming message contains order data with a list of ordered items, a new
message will be produced for each item from the list and published to
an appropriate channel).

Figure 4.5: Design Pattern Splitter


(source: Enterprise Integration Patterns [7])

• Aggregator
The Aggregator (Figure 4.6) works in the opposite way to the Splitter
described above. The Aggregator receives incoming messages and identifies
the ones that are correlated with each other. When the complete set of
correlated messages has been received, it aggregates them: it collects the
information from each of the correlated messages and publishes a single new
message containing all of the collected information.

Figure 4.6: Design Pattern Aggregator


(source: Enterprise Integration Patterns [7])

While designing an Aggregator the following things must be set:

1. correlation conditions (using these conditions the messages will be
classified as correlated with each other)
2. completeness condition (when the set of correlated messages is complete)
3. aggregation algorithm (how to process the information from the correlated
messages and publish it as one message).

It is worth mentioning that, unlike a Content-Based Router, an Aggregator is
stateful, i.e. it remembers each of the incoming messages until the complete
set of correlated messages has been received and processed. A simple
Content-Based Router is stateless: it processes the incoming message without
any regard to the messages processed earlier (it does not keep information
about previously processed messages).

• Routing Slip
The Routing Slip (Figure 4.7) makes it possible to determine the whole
processing path for every message. Each incoming message has a routing slip
attached to it, specifying the sequence of processing steps for this
particular message. Every processing component is wrapped in a special router
that reads the routing slip attached to the incoming message and forwards
the message to the next processing step from the routing slip. This way,
a whole processing chain can be composed and managed from one location.
Moreover, a Routing Slip for a new type of message can be defined if
necessary.

Figure 4.7: Design Pattern Routing Slip


(source: Enterprise Integration Patterns [7])

• Process Manager
The Process Manager (Figure 4.8) works in a similar way to the Routing
Slip, but more dynamically. It forwards the message to the first processing
unit and then, based on the processing results from this unit and on the
information about the previously executed steps, it determines the next
processing step. The processing path is therefore not fixed in advance, as in
the case of the Routing Slip, but is constructed dynamically by the Process
Manager.

Figure 4.8: Design Pattern Process Manager


(source: Enterprise Integration Patterns [7])

• Message Broker
The Message Broker (Figure 4.9) is a central component of the integration
solution. It connects all integrated systems and internally uses the design
patterns described before to route the messages effectively between the
connected systems. If each pair of integrated systems that need to interact
with each other were connected directly, the number of required channels
would grow to an unmanageable number. The Message Broker significantly
reduces the number of required channels and becomes the central place where
all message routing operations are performed.
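
The three stateless routing variants from the list above (Content-Based
Router, Message Filter and Recipients List) differ mainly in how many
destinations they select for a message. The following Python sketch shows this
difference; the rule format (a list of predicate and channel name pairs), the
function names and the sample order message are assumptions made for the
illustration.

```python
def content_based_router(message, rules, default="unroutable"):
    """Return the single channel chosen by the first matching rule."""
    for predicate, channel in rules:
        if predicate(message):
            return channel
    return default


def message_filter(message, criterion):
    """Pass the message on if it matches the criterion, else discard it."""
    return message if criterion(message) else None


def recipients_list(message, rules):
    """Return every channel whose rule matches, possibly several."""
    return [channel for predicate, channel in rules if predicate(message)]


# Hypothetical routing rules for an order message.
rules = [
    (lambda m: m["type"] == "order", "orders"),
    (lambda m: m["total"] > 1000, "fraud-check"),
]

order = {"type": "order", "total": 2500}
print(content_based_router(order, rules))                  # orders
print(recipients_list(order, rules))                       # ['orders', 'fraud-check']
print(message_filter(order, lambda m: m["total"] < 100))   # None
```

None of these functions keeps any information between calls, which is what
makes them stateless in the sense described for the Content-Based Router.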

As can be observed, there are a lot of routing components that can be
created on the basis of the Message Router pattern. Each of them responds to
a different kind of need and can be used to solve a particular problem. Those
components vary from the simplest ones, with fixed routing algorithms, to more
complicated ones that perform routing dynamically, based on the results of the
application processing, and dynamically build a processing path for an
incoming message.
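
As an example of a stateful routing component, the Splitter and the
Aggregator described earlier can be sketched together. The sketch assumes
that the Splitter stamps every produced message with the order identifier
(used as the correlation condition) and with the number of items (used as the
completeness condition); these field names are assumptions, not part of the
patterns themselves.

```python
def split(order):
    """Splitter: produce one message per item of the incoming order."""
    items = order["items"]
    return [
        {"order_id": order["order_id"], "count": len(items), "item": item}
        for item in items
    ]


class Aggregator:
    """Stateful: keeps partial sets until the completeness condition holds."""

    def __init__(self):
        self._partial = {}

    def add(self, message):
        # Correlation condition: messages share the same order_id.
        group = self._partial.setdefault(message["order_id"], [])
        group.append(message)
        # Completeness condition: all items of the order have arrived.
        if len(group) < message["count"]:
            return None
        del self._partial[message["order_id"]]
        # Aggregation algorithm: collect the items into a single message.
        return {
            "order_id": message["order_id"],
            "items": [m["item"] for m in group],
        }


order = {"order_id": 7, "items": ["book", "pen", "lamp"]}
aggregator = Aggregator()
results = [aggregator.add(part) for part in split(order)]
print(results)
# [None, None, {'order_id': 7, 'items': ['book', 'pen', 'lamp']}]
```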

Figure 4.9: Design Pattern Message Broker


(source: Enterprise Integration Patterns [7])

Another pattern that is commonly used in integration solutions is the
Message Translator. As mentioned before, the Message Translator pattern is
used to reformat the data in such a way that it fits the internal data
representation model of the other system. Such a need arises very often, as
the systems being integrated usually have different internal data
representation models. As with the previous patterns, the pattern itself is a
base for different variants of Message Translators that can be used in various
situations depending on the problem faced. The idea of the Message Translator
has already been described in the section devoted to the main concepts of
messaging systems (3.5). Now let us concentrate on the different variants of
translators based on the same Message Translator pattern:

• Envelope Wrapper
The Envelope Wrapper (Figure 4.10) wraps the sent data into an envelope
in such a way that it fits the message format used by the given messaging
system (it adds header and body sections, encryption, etc.). After the
message arrives at its destination it is unwrapped by the unwrapper, which
reverses any modifications made by the wrapper and passes the data on for
further processing in the form in which it was initially sent by the sender
application.

Figure 4.10: Design Pattern Envelope Wrapper


(source: Enterprise Integration Patterns [7])

• Content Enricher
The Content Enricher (Figure 4.11) is used when the destination system
requires more information than the sender can provide. The Content Enricher
adds to the message additional information fetched from an external data
source. After this step the message is forwarded to the next processing
component.

Figure 4.11: Design Pattern Content Enricher


(source: Enterprise Integration Patterns [7])

• Content Filter
The Content Filter (Figure 4.12) works in the opposite way to the Content
Enricher. When an incoming message contains complex information and
only a small part of that information is required by the message receiver,
a Content Filter removes the unneeded data from the message, leaving only
the data required by the receiver.

Figure 4.12: Design Pattern Content Filter


(source: Enterprise Integration Patterns [7])

• Normalizer
The Normalizer (Figure 4.13) is a combination of a Message Router and
multiple Message Translators. It is used when the integrated systems use
different message formats and each of those formats requires a different
type of translation in order to fit into the model used by the messaging
system. Information can arrive as an XML document, as a plain text file
containing comma-separated data fields, as an Excel file, and so on. When such
a message arrives at the Message Router, it is forwarded to the Message
Translator responsible for dealing with this particular data format. The
range of accepted incoming message formats can easily be widened by adding
a new routing rule to the Message Router and connecting the Router to an
additional Message Translator through a Message Channel. This way the
integration solution can be adapted to a changing business environment and
its functionality extended.
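
The structure of the Normalizer, a router in front of several translators,
can be sketched with the Python standard library. The customer record, the
two input formats and the very naive format detection are assumptions chosen
to keep the example short; a real solution would need a more robust way of
recognising the incoming format.

```python
import csv
import io
import xml.etree.ElementTree as ET


def from_xml(raw):
    """Translate <customer><name>..</name><city>..</city></customer>."""
    root = ET.fromstring(raw)
    return {"name": root.findtext("name"), "city": root.findtext("city")}


def from_csv(raw):
    """Translate a single comma-separated line: name,city."""
    name, city = next(csv.reader(io.StringIO(raw)))
    return {"name": name, "city": city}


# Routing rules: detected format -> translator. Supporting a new format
# means adding one translator function and one entry here.
TRANSLATORS = {"xml": from_xml, "csv": from_csv}


def detect_format(raw):
    """Very naive format detection, for illustration only."""
    return "xml" if raw.lstrip().startswith("<") else "csv"


def normalize(raw):
    """Route the raw message to the translator for its format."""
    return TRANSLATORS[detect_format(raw)](raw)


print(normalize("<customer><name>Anna</name><city>Warsaw</city></customer>"))
print(normalize("Jan,Krakow"))
```

Both calls produce a dictionary with the same keys, which plays the role of
the common format expected by the messaging system.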

Figure 4.13: Design Pattern Normalizer


(source: Enterprise Integration Patterns [7])

The above examples of integration patterns show how a single template can
become the source of various types of components designed to solve different
problems of the same nature, in this case connecting multiple computer
systems. Each of those patterns finds an application in designing an
integration solution. Using them helps to overcome some of the most commonly
encountered problems and makes the designed system more reliable, flexible and
easier to support.
Chapter 5

Enterprise Service Bus

Having covered the basic theory of integrating computer systems (styles of
integration, messaging based systems and the design patterns connected with
them), let us move forward to the practical usage of the above concepts.
This chapter describes the integration solution that is currently the most
popular and most frequently used in the process of integration: the Enterprise
Service Bus (ESB). First, a definition of this concept will be provided. Next,
Message Oriented Middleware (MOM), another integration solution that is
used as the basis of the ESB concept, will be described. After that, the
aims and capabilities of an ESB and finally the design patterns connected with
this technology will be presented.

5.1 Definition
The Enterprise Service Bus (ESB) is an integration solution that enables
integrating systems in a loosely coupled way. It makes heavy use of open
standards such as XML and Web Services. It is based on Message Oriented
Middleware, which provides reliable communication using messages. It
simplifies creating a computer systems architecture focused on providing
business services, i.e. services that have a meaning to the business, rather
than implementation services, i.e. services that have a meaning to the
developers.
Currently there is no formal, industry-agreed definition of an Enterprise
Service Bus. A lot of vendors provide products claiming they are ESB solutions,
but there is no precise definition of what such a product should contain. One
way of explaining what an Enterprise Service Bus is, is to focus on the
capabilities that it provides and the advantages of deploying it in a
company. This approach will be taken in this chapter.
According to the Gartner Group [14], an ESB consists of the following four
things:

• Message Oriented Middleware (MOM)
• Web Services
• Intelligent Routing based on content
• XML data transformation

It is worth keeping in mind that the term ESB does not necessarily have to
refer to a software product. It may also be considered as:

• a pattern
• an architectural component
• a hardware component (there are devices which have all the capabilities
required in order to be called an ESB solution)

One of the following sections, named “ESB components” (5.6), will describe
the meaning of an Enterprise Service Bus as an architectural component.

5.2 Message Oriented Middleware


The Message Oriented Middleware (MOM) is the basis of an Enterprise
Service Bus. The main task of a MOM is to provide:

• reliable transport
• efficient method of communication using messages
• end-to-end reliability

In a typical communication scenario based on Remote Procedure Call (Remote
Procedure Invocation), if one of the applications is not available the whole
request cannot be performed. The invoking application has to decide what to do
next: repeat the attempt in a couple of minutes or return an error stating that
the request cannot be performed. In the case of message based communication,
the Message Oriented Middleware is responsible for delivering the message to
the proper application. The sender application just puts the message into the
queue and leaves the responsibility of delivering it to the MOM [9]. If the
receiver application is not running, the message is left in the queue until
the receiver is up again and fetches it.
The left side of Figure 5.1 presents a typical RPC based communication
scenario, while the right side presents message based communication.
Application 1 communicates with Applications 2, 3 and 4. Let us assume that
Application 4 is currently unavailable. There might be dozens of reasons for
such a situation: the network connection might be broken, Application 4 may be
experiencing technical problems, or it may simply be down for maintenance.
Application 1 must get data from all three applications, but Application 4 is
unavailable. The decision which action should be taken, repeating the attempt
in a couple of minutes or returning an error, must be made by Application 1.

Figure 5.1: Overview of a communication based on Remote Procedure Call
and Message Oriented Middleware

Let us apply the same scenario to the right side of the figure. Application 5
communicates with Applications 6 and 7. Application 5 does not have to worry
about the availability of those applications, because it does not communicate
with them directly; it uses their message queues, which are always available.
If Application 6 or 7 is currently not working, the messages designated for it
will not be dropped, but will be stored in the queue until the application is
up again. Message Oriented Middleware guarantees that eventually every
message will reach its destination application. Application 5 does not have to
be concerned about it.
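
The store-and-forward behaviour described above can be sketched as follows.
The sketch keeps the messages in memory only, whereas a real Message Oriented
Middleware would typically persist them; the class `MessageQueue` and its
methods are invented for the illustration.

```python
from collections import deque


class MessageQueue:
    """Holds messages for one receiver until that receiver fetches them."""

    def __init__(self):
        self._stored = deque()

    def put(self, message):
        # Always succeeds, whether or not the receiver is running.
        self._stored.append(message)

    def drain(self):
        """Called by the receiver once it is up again."""
        while self._stored:
            yield self._stored.popleft()


queues = {"Application 6": MessageQueue(), "Application 7": MessageQueue()}

# Application 5 sends while Application 7 is down; nothing is lost.
queues["Application 6"].put("price update")
queues["Application 7"].put("price update")
queues["Application 7"].put("stock update")

# Later, Application 7 comes back and fetches everything waiting for it.
print(list(queues["Application 7"].drain()))
# ['price update', 'stock update']
```

The sender only ever talks to the queue, so it never needs to know whether
the receiver is currently available.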

5.3 Tightly coupled interfaces


To better illustrate the overwhelming number of connections in a tightly
coupled interfaces scenario, implemented for example by Remote Procedure
Invocation, let us consider the following situation. When every system provides
one interface and is connected with every other system, the total number of
connections can be computed using the formula [3] $n(n-1)/2$, where $n$ is the
number of systems. If there are 5 systems ($n = 5$), the total number of
connections is 10. That number of connections is still manageable, but if there
are 10 systems then there are 45 connections, and in the case of 100 systems
there are 4950 connections! Of course, not every system has to be connected
with every other, but it shows that the ability of this kind of software
infrastructure to grow is very limited.
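
The numbers quoted above can be reproduced with a short calculation. For
comparison the sketch also counts the connections needed when every system is
attached only once to a central component; that second function is an
assumption added for illustration and anticipates the bus architecture
discussed in the next section.

```python
def point_to_point_connections(n):
    """Connections needed when each of n systems is linked to every other."""
    return n * (n - 1) // 2


def bus_connections(n):
    """Connections needed when each system is linked only to a central bus."""
    return n


for n in (5, 10, 100):
    print(n, point_to_point_connections(n), bus_connections(n))
# 5 10 5
# 10 45 10
# 100 4950 100
```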

5.4 ESB aims


One of the aims of introducing an ESB is to decouple the client (the
application needing access to the service) from the service provider. The left
side of Figure 5.2 shows the typical model of interconnected systems.
Component1, in order to connect to Component4, has to know the following:

• Component4 IP address

• Component4 port number

• Component4 protocol and a client of that protocol (EJB)

• Component4 method name and its signature (types of arguments, type of
return value, throws declaration, etc.)

Figure 5.2: Tightly coupled system architecture compared to
an Enterprise Service Bus architecture

The same applies to all of the connections in the above figure. Before one
component is able to connect to another, it has to know a lot about the
other party and also assume that this information will not change.
The introduction of the ESB, presented on the right side of Figure 5.2,
relieves the client from the need to know who is providing the service, because
the ESB is responsible for creating a communication channel between the client
and the service providers. That way, the application does not need to contain
the integration code, because it is no longer responsible for creating
TCP/IP connections, reconnecting in case of a communication error, knowing
the connection information (URL, port number) of the service provider and so
on. Thus, introducing an ESB simplifies the design of the client. Furthermore,
in a tightly coupled system, any change of the connection data of one of the
service providers also requires a change in all of the clients that use this
service provider. This is no longer true in the case of an ESB, because it is
the ESB that is responsible for storing that information.
Let us imagine what would happen if the IP address of Component4 were changed
one day. In the tightly coupled scenario all of the applications connected
directly to Component4 (i.e. Component1 and Component5) would have to be
updated. In the ESB scenario only the configuration held by the Enterprise
Service Bus would have to be updated.
To sum it up, an ESB provides service location transparency, as the
client no longer has to know the exact location of the service provider; it is
the responsibility of the ESB.
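
Service location transparency can be illustrated with a small sketch in which
the client refers to a service only by a logical name. The service name, the
addresses and the `fake_transport` function, which replaces a real network
call, are all made up for the example.

```python
# Configuration held by the ESB: logical service name -> physical endpoint.
# The name, addresses and port are made up for illustration.
ENDPOINTS = {"employee-info": ("10.0.0.4", 1099)}


def esb_call(service, request, transport):
    """Resolve the logical name and forward the request to the provider."""
    host, port = ENDPOINTS[service]
    return transport(host, port, request)


def fake_transport(host, port, request):
    """Stand-in for a real network call; it only reports where it went."""
    return f"{request} handled by {host}:{port}"


# The client refers to the service only by its logical name.
print(esb_call("employee-info", "get employee 42", fake_transport))

# The provider moves to a new address: only the ESB configuration changes.
ENDPOINTS["employee-info"] = ("10.0.0.9", 1099)
print(esb_call("employee-info", "get employee 42", fake_transport))
```

The client code is identical before and after the address change, which is
exactly the property that the tightly coupled scenario lacks.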
An ESB also enables sharing of services. A service, once implemented, might
be used in many projects or by clients from various departments. For example,
a service providing information about an employee might be used by systems
from the HR, financial or IT department.
An ESB also enforces the separation of the business service from
the implementation service. Companies are run by business rules and they
should be seen from the point of view of the business services they provide.
Usually it is the developers who, by creating EJB remote interfaces, Web
Service interfaces and remote procedure call methods, define the way in which
the company is perceived. An Enterprise Service Bus makes it possible to
change that. Because it is able to transform a business request into an
invocation of particular implementation services, it enables the business,
not the developers, to define the way the company is seen by its business
partners and customers. The responsibility of the business is to know how to
run a company: what the company’s input and output are and what services to
provide. The responsibility of an ESB is to know what the IP address and TCP
port of the application server are, what protocol is used by a particular
system, what the signature of a particular method is, etc. If one day a
particular implementation service is replaced by a new one, the way the
company does business will not change, so no modifications are forced on the
other side (the business partner or customer); only the ESB will have to be
updated. This makes it possible to decouple the business model, i.e. the way
the company works, from the implementation.

5.5 ESB capabilities


Describing an ESB by its capabilities is a common approach in the existing
literature; another example of such a list can be found in the IBM Redbook
concerning the Enterprise Service Bus [12].

1. Message Transformation: ability to transform the structure and format of
a business request to those required by a service provider
An ESB must be able to change the message format to the format accepted
by the service provider.

2. Message Routing: ability to send a request to a particular service provider
based on some criteria
An ESB must be able to decide to which service providers a particular
message should be delivered.

3. Security: ability to protect an ESB from unauthorised access
An ESB has to ensure that only selected clients have access to particular
services.

4. Transaction Management: ability to provide a single unit of work for a
business service request by providing a framework for the coordination of
multiple resources across multiple services
There is no ESB product that provides transaction management, because
an ESB connects multiple different IT systems and it is impossible for it
to roll back a distributed transaction after it has been committed by one of
the systems.
Let us look at an example of the problem mentioned above. A possible scenario
might look as follows:

a) The ESB receives a message; according to a set of rules it knows that
it has to connect first with System A and then with System B.
b) The ESB communicates with System A.
c) System A sends some messages into a Message Channel and informs the ESB
that it has successfully completed the task.
d) The ESB communicates with System B, but System B is unable to
perform the request because of internal errors and returns an error.
e) At this point System A has successfully completed the transaction and
System B has failed. The ESB cannot roll back the operations (sending
messages into a Message Channel) performed by System A; it can only notify
System A that this transaction should be rolled back, and how to handle that
is delegated to System A.

To sum up, no ESB provides transaction management, because it is impossible
for an ESB to roll back operations performed by one of the integrated
systems. An ESB can only provide a framework for transaction management.

5. Message Enhancement: ability to modify and add information as required
by the service provider
It might be necessary for the ESB to add some information to the message
before it is delivered to the service provider. That information might come
from a database or some other system.

6. Protocol Transformation: ability to accept messages sent using different
protocols, e.g. IIOP, SOAP/HTTP, CORBA
This capability consists of two aspects:

• logical: an ESB must understand the protocol (its semantics and syntax)
• physical: an ESB must have a component suited for operating using
that protocol, e.g. an HTTP server, a CORBA client, etc.

7. Service Mapping: ability to translate a business service into the
corresponding service implementation and provide binding and location
information
It is a mapping performed by an ESB between an abstract business service
and an implementation service (IP address, port, name of the method, etc.).

8. Message Processing: ability to monitor the state of the received request
For the client sending a message to the ESB, the most important thing
is that a sent message is never lost. In order to achieve that, the
ESB has to monitor the state of the message: has it already been processed by
the service provider, was the processing successful, is the service provider
available, etc.

9. Process Choreography: ability to manage complex business processes that
require the coordination of multiple business services to fulfil a single
business service request
This functionality enables the client to perceive a business request as one
single request, while in fact its execution may trigger the execution of
multiple business services. It is usually implemented using the Business
Process Execution Language (BPEL), a language for business process
modelling [9].

10. Service Orchestration: ability to manage the coordination of multiple
implementation services
The difference between the previously described process choreography and
service orchestration is the type of service being managed: in the case of
service orchestration it is an implementation service, in the case of process
choreography it is a business service.
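
The scenario described under Transaction Management (System A succeeds,
System B fails) can be sketched as follows. The sketch assumes that each
integrated system exposes a `compensate` operation through which the ESB can
ask it to undo its work; this operation, the `System` class and the function
`esb_process` are hypothetical and only illustrate what "a framework for
transaction management" could mean in practice.

```python
class System:
    """A hypothetical integrated system with its own local transactions."""

    def __init__(self, name, fails=False):
        self.name = name
        self.fails = fails
        self.log = []

    def perform(self, request):
        if self.fails:
            raise RuntimeError(f"{self.name}: internal error")
        self.log.append(f"done {request}")  # committed locally

    def compensate(self, request):
        # How to undo the work is decided by the system, not by the ESB.
        self.log.append(f"undo requested for {request}")


def esb_process(request, systems):
    """Call the systems in order; on failure notify those already done."""
    completed = []
    for system in systems:
        try:
            system.perform(request)
            completed.append(system)
        except RuntimeError:
            for done in reversed(completed):
                done.compensate(request)
            return False
    return True


system_a = System("System A")
system_b = System("System B", fails=True)
print(esb_process("order 7", [system_a, system_b]))  # False
print(system_a.log)
# ['done order 7', 'undo requested for order 7']
```

Note that the ESB never touches the work done by System A directly; it can
only send the notification and rely on System A to react to it.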
One of the reasons for not having a formal definition of the ESB is that every
application has different requirements for the capabilities provided by an ESB.
For example, in some integration projects a transaction manager or a BPEL
engine might not be required, but there might be a need for message routing
and message transformation functionality. There are products on the market
which do not support transactions and BPEL, but do support message routing and
transformation. They are presented as ESB solutions and, most importantly,
they would fulfil the requirements of such an integration project.
As has been shown, the term ESB has a very broad meaning and is used by
software vendors to describe products providing different functionality.

5.6 ESB components


An ESB does not have to refer only to a software product, but can also refer to
an architecture component. This section will be devoted to this meaning of the
term ESB.
While designing an integration solution, an ESB should not be perceived from
the perspective of one particular software product and its capabilities, but
rather as a component whose functionality might be provided by multiple
products available on the market. This approach to integration does not tie
the company deploying an Enterprise Service Bus to one concrete solution, but
enables the company to have products from various vendors cooperating. One of
the conceptual models of such an ESB architecture is presented in Figure 5.3.
It consists of the following logical components:

1. Mediator
The Mediator is the most important component in an ESB. The crucial
functionality provided by this component is routing, communication
and protocol transformation. A product lacking these capabilities cannot be
considered an ESB solution. The Mediator is used as an entry point for
the ESB: messages sent to the ESB are received and processed by this
component. It might also be responsible for message transformation and
enhancement. In order to enable reliable and secure processing of requests
it must support security, error handling and transaction management.

2. Service registry
The Service registry is a component that provides service mapping
functionality.

3. Choreographer
The role of the Choreographer is to enable process choreography, that is, the
coordination of business processes. This component is actually a client of
an ESB. It knows the sequence of business services that must be called in
order to perform one sophisticated business request. If the Mediator decides
(according to its rules) that a particular request needs to be
choreographed, the request is forwarded to this component. The Choreographer,
after looking up its configuration, invokes the proper service providers by
sending messages to the Mediator, just like any other ESB client.

4. Rules engine
The Rules engine is an additional component, which may not be required
in some integration projects. This component enables rule-based routing.
Its functionality includes message routing, message transformation,
security and transaction management.
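
The cooperation of the Mediator, the Service registry and the Choreographer
can be illustrated with a minimal in-memory sketch. It is only an
illustration under stated assumptions: the class names, the String payloads
and the service names (checkStock, calculatePrice, placeOrder) are
hypothetical and do not come from any ESB product.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the logical components; all names are illustrative.
final class EsbComponentsSketch {

    /** Service registry: maps a logical service name to its provider. */
    static final class ServiceRegistry {
        private final Map<String, Function<String, String>> providers = new HashMap<>();

        void register(String serviceName, Function<String, String> provider) {
            providers.put(serviceName, provider);
        }

        Function<String, String> lookup(String serviceName) {
            Function<String, String> provider = providers.get(serviceName);
            if (provider == null) {
                throw new IllegalArgumentException("Unknown service: " + serviceName);
            }
            return provider;
        }
    }

    /** Choreographer: knows which business services make up a composite request. */
    static final class Choreographer {
        private final Map<String, List<String>> sequences = new HashMap<>();

        void define(String requestName, List<String> serviceNames) {
            sequences.put(requestName, serviceNames);
        }

        boolean handles(String requestName) {
            return sequences.containsKey(requestName);
        }

        /** Calls each service through the mediator, like any other ESB client. */
        String execute(String requestName, String payload, Mediator mediator) {
            String current = payload;
            for (String serviceName : sequences.get(requestName)) {
                current = mediator.send(serviceName, current);
            }
            return current;
        }
    }

    /** Mediator: entry point of the bus; routes to a provider or to the choreographer. */
    static final class Mediator {
        private final ServiceRegistry registry;
        private final Choreographer choreographer;

        Mediator(ServiceRegistry registry, Choreographer choreographer) {
            this.registry = registry;
            this.choreographer = choreographer;
        }

        String send(String name, String payload) {
            if (choreographer.handles(name)) {
                return choreographer.execute(name, payload, this);
            }
            return registry.lookup(name).apply(payload);
        }
    }

    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register("checkStock", p -> p + "|inStock");
        registry.register("calculatePrice", p -> p + "|priced");
        Choreographer choreographer = new Choreographer();
        choreographer.define("placeOrder", Arrays.asList("checkStock", "calculatePrice"));
        Mediator mediator = new Mediator(registry, choreographer);
        System.out.println(mediator.send("placeOrder", "order-1"));
    }
}
```

In this sketch the client sends one request (placeOrder) and is unaware that
two services are invoked on its behalf. Security, error handling and
transaction management are deliberately left out.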

5.7 Open source ESB products


Among dozens of ESB products available on the market there are two open source
solutions, each with different characteristics:

• The Mule is a Lightweight Messaging Middleware Framework. It introduces
the concept of Universal Messaging Objects (UMO), which are just Plain
Old Java Objects responsible for communicating with the service
providers and for performing transformation and routing of the messages.
They are deployed in a Mule container, a framework enabling the
communication between endpoints.

• The ServiceMix is a JBI-compliant ESB. It consists of the Normalized
Message Router and a set of components responsible for protocol
transformations and ESB capabilities (rules engine, etc.).

Figure 5.3: Conceptual overview of the Enterprise Service Bus components

5.8 ESB integration patterns


5.8.1 VETO pattern

Figure 5.4: ESB Design Pattern VETO

VETO (Figure 5.4) stands for Validate, Enrich, Transform, Operate. It is
a widely used integration pattern in ESB solutions. The VETO pattern [3]
ensures that data exchanged in an ESB will be consistent and valid. Each
component in the VETO pattern might be implemented as a separate service,
which might be configured and modified independently of any other component.

Validate
The aim of the validate step is to ensure that messages received by the service
provider have proper syntax and semantics. This step should be performed
independently, not inside the service provider, because that would limit the
re-usability of the validation and complicate any later modification of it.
Moreover, implementing validation as a separate component ensures that
every message that reaches the service is in a proper format, which
simplifies the design of the service provider and enables the Operate step
to focus on business logic. The simplest way of validating an incoming
message is to check whether the message is a well-formed XML document and
conforms to the XML schema or WSDL, but there are also other possibilities,
for example validation scripts.

Enrich
The aim of the enrich step is to add to the message content some additional
data that is needed by the service provider, for example, information about
the customer who has placed the order. That information might be fetched from
a database or might be the result of invoking another service.

Transform
The aim of the transform step is to change the message format to the one accepted
by the service provider. This step might transform the message into the internal
message format of the service provider, relieving the Operate step of the need
to perform this task and therefore increasing its efficiency.

Operate
The aim of the operate step is to invoke the target service or to interact in
some other way with the target application.
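
The four steps can be sketched as a chain of small, independent functions.
The example below is only a sketch under stated assumptions: the message is
modelled as a map, the field names (customerId, productId, customerName), the
semicolon-separated internal format and the in-memory customer table that
stands in for a database are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical VETO sketch; field names and formats are illustrative only.
final class VetoSketch {

    /** Validate: reject messages that lack the fields the service relies on. */
    static Map<String, String> validate(Map<String, String> msg) {
        if (!msg.containsKey("customerId") || !msg.containsKey("productId")) {
            throw new IllegalArgumentException("Invalid message: " + msg);
        }
        return msg;
    }

    /** Enrich: add data the provider needs; the map stands in for a database lookup. */
    static Map<String, String> enrich(Map<String, String> msg, Map<String, String> customers) {
        Map<String, String> enriched = new HashMap<>(msg);
        enriched.put("customerName", customers.getOrDefault(msg.get("customerId"), "unknown"));
        return enriched;
    }

    /** Transform: convert to the (assumed) internal format of the service provider. */
    static String transform(Map<String, String> msg) {
        return msg.get("productId") + ";" + msg.get("customerName");
    }

    /** Operate: invoke the target service with the prepared message. */
    static String operate(String internalMessage, Function<String, String> service) {
        return service.apply(internalMessage);
    }

    /** Runs the four steps in order. */
    static String process(Map<String, String> msg,
                          Map<String, String> customers,
                          Function<String, String> service) {
        return operate(transform(enrich(validate(msg), customers)), service);
    }

    public static void main(String[] args) {
        Map<String, String> msg = new HashMap<>();
        msg.put("customerId", "42");
        msg.put("productId", "teddy-bear");
        Map<String, String> customers = Collections.singletonMap("42", "Jan Kowalski");
        System.out.println(process(msg, customers, s -> "ORDERED:" + s));
    }
}
```

Because each step is a separate function, any of them could be replaced (for
example by an XML schema validator) without touching the others.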

5.8.2 VETOR pattern


The VETOR pattern [3] extends the VETO pattern with a new component, the
Router, placed right before the Operate step. The aim of this step is to
decide whether a message should be delivered to the service or not. The router
might be implemented as a part of the transform component or, preferably,
as a separate service.
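
A minimal sketch of the additional Router step is shown below. It assumes
that the routing decision can be expressed as a predicate over the message;
the method name and the String message type are hypothetical.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch of the Router step placed before Operate.
final class VetorRouterSketch {

    /** Delivers the message to the service only if the routing rule accepts it. */
    static Optional<String> routeAndOperate(String message,
                                            Predicate<String> rule,
                                            Function<String, String> service) {
        if (!rule.test(message)) {
            return Optional.empty(); // the message is not delivered to the service
        }
        return Optional.of(service.apply(message));
    }

    public static void main(String[] args) {
        Predicate<String> ordersOnly = m -> m.startsWith("order:");
        System.out.println(routeAndOperate("order:1", ordersOnly, m -> "handled " + m));
        System.out.println(routeAndOperate("ping", ordersOnly, m -> "handled " + m));
    }
}
```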

5.8.3 Two-step XRef pattern


In an ESB two types of transformations take place: structure and content
transformations. The aim of the structure transformation is to change the format
of a message, while the aim of the content transformation is to enrich a message
with some additional data, usually fetched from a database. This process is
usually performed in a single component, in which the output of one operation
is used as the input to the following one:

• XSLT transformation

• XPath query

• JDBC query

• SQL statement

The concept of the Two-step XRef pattern [3] (Figure 5.5) is to create two
separate components, each responsible for only one type of operation:

• XML parsing: XSLT transformation and XPath query

• Database lookup: JDBC query and SQL statement

This approach has a lot of advantages over the previous model:

• better code re-usability: components for XML parsing and database
lookup might be used in multiple different projects

• easier and quicker development: both components might be developed
simultaneously by different teams

• loose coupling: problems with the database do not affect the operation
of the XML parsing component
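
The separation can be sketched as follows. The XML parsing step uses the
standard javax.xml.xpath API, while the lookup step hides the data source
behind a small interface. This is only an illustration: the message layout
(/order/customerId), the CrossReference interface and the in-memory table
used instead of a JDBC query are assumptions, and the result is appended with
a simple string replacement rather than a real XSLT transformation.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical Two-step XRef sketch: XML parsing and lookup are separate components.
final class TwoStepXRefSketch {

    /** Step 1: XML parsing only; it knows nothing about the database. */
    static String extractCustomerId(String xml) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate("/order/customerId", new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot parse message", e);
        }
    }

    /** Step 2: lookup only; a real implementation could issue a JDBC query here. */
    interface CrossReference {
        String lookup(String key);
    }

    /** Combines both steps: extract the key, look it up, append the result. */
    static String enrich(String xml, CrossReference xref) {
        String customerName = xref.lookup(extractCustomerId(xml));
        return xml.replace("</order>", "<customerName>" + customerName + "</customerName></order>");
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("42", "Jan Kowalski");
        CrossReference xref = key -> table.getOrDefault(key, "unknown");
        System.out.println(enrich("<order><customerId>42</customerId></order>", xref));
    }
}
```

Since the lookup is an interface, the XML step can be developed and tested
without any database being available.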

Figure 5.5: Comparison of the internals of a typical transformer and
Two-step XRef transformer

5.8.4 Forward Cache Integration pattern


After introducing the ESB into a company, the need quite often arises for
a single application which gathers the information from all of the integrated
systems and presents it in a coherent form. Portal applications perform
such a role. They enable pulling data from multiple sources, like other systems
and databases, and presenting it in a unified way through web pages. They are
very useful, because the user no longer has to look for data in many systems,
which might provide various user interfaces (web pages, command line, etc.),
might require different credentials and so on. All the information is available in
one place.
Despite the advantages mentioned above, portal applications also introduce new
challenges. In order to present data on a web page the portal must be able to
obtain it within a couple of seconds. For a lot of integrated systems it might
not be possible to fulfil that requirement. There might be applications designed
as terminal applications, used by a single user, which are not prepared to
handle the throughput required by the portal application. Moreover, systems
accessed by the portal application must be available 24/7. Some of them,
designed as desktop programs run on a personal computer, might not fulfil this
requirement, as they might require periodic restarts and are not resistant to
hardware problems of the computer on which they are running. Because of
geographical separation, data from the integrated systems is available only when
there is a network connection between them and the portal system. When the
network connection is broken, the portal application will not be able to present
any data to the user. Thus, it becomes crucial to have a properly working network
infrastructure.
Therefore, it is very important to keep in mind that systems which are about
to be integrated will face a lot of new challenges and problems, which might not
have been considered by their authors.
One of the ESB services which might help with solving some of those problems
is the Cache Service. The task of this service is to store the results of service
invocations returned from service providers. This service, combined with the
Forward Cache Integration pattern [3], enables the portal application to
access data that has already been presented once directly from the Cache
Service, even if the system which supplied that data is temporarily unavailable.
There are two possible scenarios to implement this solution:

• using the publish-and-subscribe model: constantly inform the Cache
Service about the changes in data

• using message routing: duplicate every response from the integrated
system to the Cache Service

Publish-and-subscribe model
In this scenario every change of the data held by an integrated system causes
a message with a set of changes to be sent to a message topic. The Cache Service
is a subscriber of that topic. This solution is only suitable for small
software infrastructures with systems that do not change their data frequently.
If there were multiple integrated systems constantly changing their own data,
most of the traffic would be consumed by update messages, making the ESB
incapable of handling any regular messages.

Message routing
This scenario assumes the usage of one of the main ESB components, the router.
Every response, before getting back to the portal application, should also be
sent to the Cache Service. In that way, the Cache Service will have a copy of
every piece of information that has been presented by the portal application,
and if an integrated application becomes inaccessible, that information might
be supplied by the Cache Service.
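
The message routing scenario can be sketched as a router that copies every
response to a cache and falls back to the cached copy when the integrated
system cannot be reached. This is a minimal sketch: the CachingRouter class,
the use of a runtime exception to signal unavailability and the in-memory map
standing in for the Cache Service are assumptions made for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the Forward Cache Integration pattern (message routing variant).
final class ForwardCacheSketch {

    static final class CachingRouter {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Function<String, String> integratedSystem;

        CachingRouter(Function<String, String> integratedSystem) {
            this.integratedSystem = integratedSystem;
        }

        /** Returns the live response and copies it to the cache; on failure serves the cached copy. */
        Optional<String> request(String key) {
            try {
                String response = integratedSystem.apply(key);
                cache.put(key, response); // duplicate the response to the Cache Service
                return Optional.of(response);
            } catch (RuntimeException systemUnavailable) {
                return Optional.ofNullable(cache.get(key));
            }
        }
    }

    public static void main(String[] args) {
        boolean[] available = {true};
        CachingRouter router = new CachingRouter(key -> {
            if (!available[0]) {
                throw new IllegalStateException("system is down");
            }
            return "stock:" + key;
        });
        System.out.println(router.request("teddy")); // live response, now cached
        available[0] = false;
        System.out.println(router.request("teddy")); // served from the cache
        System.out.println(router.request("robot")); // never requested before: empty
    }
}
```

As the last call shows, the cache can only supply information that has
already passed through the router at least once.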
Chapter 6

Case study: Messaging systems work principles

Now that the basic concepts required to understand the application integration
topic have been covered, they will be brought together in the form of a case
study that shows how to use them when solving a business problem. The topic of
this case study is an imaginary integration task of an international company
producing toys. It has been created by the authors of this thesis in order to
illustrate the theoretical background presented so far from a more practical
point of view.
This company has multiple systems located at distant geographical locations.
The company headquarters are located in Europe, along with storage facilities.
The factories producing toys are located in Asia. The Logistics System is
divided into two parts: one responsible for delivering products from factories
to the storage facilities and a second one responsible for delivering ordered
products to the final customers. The company also has an Internet shop,
established in order to widen the potential number of customers and make its
products available worldwide.
The computer systems infrastructure is comprised of the following:

• Internet Shop: responsible for the interaction with the user and for placing
(and confirming) orders
• Orders Fulfilment System: responsible for fulfilling the orders
• Storage System: responsible for providing information about product
supplies
• Pricing System: manages the prices of all products available for purchase
• Loyalty System: responsible for storing information about discounts
for those customers who purchase most frequently and/or purchase large
quantities of products
• Payment Check System: responsible for checking the status of the
payments for ordered products
• Supply System: responsible for sending orders to the factory in case
the supplies of a given product run low
• Logistics System: responsible for delivering packages with ordered
products to the customers
• Postponed Orders Fulfilment System: responsible for fulfilling orders
that could not be finalised earlier because of a lack of products in
storage

The task at hand is to connect all those systems into one big business entity
by using the messaging system. The high-level design of the systems after the
integration is presented in Figure 6.1.
Of course, each of the systems has to have a number of endpoints attached,
so that it is capable of sending and receiving messages. Message Channels
between the systems must also be set up so that the communication can take place.
The connections between the systems must be determined in advance (there is no
point in connecting two systems using a Message Channel if no communication
between them will ever occur). During this process the possible locations of
additional components, such as Message Routers or Message Transformers, should
also be determined.
For every router a set of rules must be defined by which the router will determine
the destination of an incoming message. If there is a Message Transformer,
the rules for message transformation must also be set.
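
A set of routing rules can be sketched as an ordered list of conditions, each
paired with a destination channel. The sketch below is only an illustration:
the Message class, the message types and the channel names (storage.in,
pricing.in, orderFulfilment.in, invalid.in) are hypothetical and are not taken
from the case study diagrams.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical rule-based Message Router sketch.
final class MessageRouterSketch {

    static final class Message {
        final String type;
        final String body;

        Message(String type, String body) {
            this.type = type;
            this.body = body;
        }
    }

    static final class Router {
        private static final class Rule {
            final Predicate<Message> condition;
            final String channel;

            Rule(Predicate<Message> condition, String channel) {
                this.condition = condition;
                this.channel = channel;
            }
        }

        private final List<Rule> rules = new ArrayList<>();
        private final String defaultChannel;

        Router(String defaultChannel) {
            this.defaultChannel = defaultChannel;
        }

        Router addRule(Predicate<Message> condition, String channel) {
            rules.add(new Rule(condition, channel));
            return this;
        }

        /** Returns the channel of the first matching rule, or the default channel. */
        String route(Message message) {
            for (Rule rule : rules) {
                if (rule.condition.test(message)) {
                    return rule.channel;
                }
            }
            return defaultChannel;
        }
    }

    public static void main(String[] args) {
        Router internetShopRouter = new Router("invalid.in")
                .addRule(m -> m.type.equals("availabilityRequest"), "storage.in")
                .addRule(m -> m.type.equals("priceRequest"), "pricing.in")
                .addRule(m -> m.type.equals("order"), "orderFulfilment.in");
        System.out.println(internetShopRouter.route(new Message("priceRequest", "teddy-bear")));
    }
}
```

Keeping the rules outside the applications means that a new destination can
be added by changing the router configuration only.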
It must also be decided whether this solution will use a synchronous or an
asynchronous communication model. The system being the subject of this study
should be resistant to communication failures and as flexible as possible.
For example, we do not want the Supply System to stop working and receiving
requests from the Storage System if it does not get an acknowledgement of the
received order from the factory. In this case the most suitable model of
communication is the asynchronous one. The usage of this model also means
that much more effort must be put into the design and implementation of the
solution, but it will ensure that the final solution operates in the desired way.
A detailed description of the design and implementation of this case study,
covering all possible issues, could easily fill the whole volume of this thesis.
Because our goal here is only to give a brief taste of the integration task and
show the practical usage of the concepts described previously, we will give just
a brief description of the integration solution without focusing on the
technical details.
First, let us take a closer look at the simplest scenario: an order placed
through the Internet Shop with credit card payment, where there is no need to
order the products from the factory and wait for the fulfilment of the order,
because the products are available in the desired quantity in storage. This
scenario is presented on the sequence diagram in Figure 6.2.
The user visits the web site of the Internet Shop, logs in using his/her user
name and password, browses through the list of available products, selects the
one that he/she is interested in and places an order. After the order is
placed, the Internet Shop sends a request to the Storage System to check if the
selected products are currently available. As mentioned before, in our
scenario we assume that the ordered products are available. Otherwise, the
Storage System would send a request to order them to the Supply System. Then
the Supply System, once the amount of products needed reaches a specified
quantity, sends a request to the factory to have those goods produced and
delivered to the storage in Europe, and then forwards the order to the
Postponed Orders Fulfilment System.

Figure 6.1: Case study systems

Figure 6.2: Case study processing sequence


When the Storage System verifies that the requested products are available,
it sends back the response to the Internet Shop application. The Internet Shop
also sends a request to the Pricing System to check the prices of the products.
The Pricing System sends a request to the Loyalty System, which checks if the
user placing the order has any discounts. After receiving the response, the
Pricing System calculates the prices and sends the response back to the
Internet Shop.
Having gathered all the data, the Internet Shop application displays information
about the order along with the calculated prices. When the user confirms the
placement of the order, it is passed to the Order Fulfilment System, which
manages the fulfilment of placed orders. Depending on the selected payment
method, the system either sends a message to the Payment Check System to check
whether the payment has been made and the order can be processed further (in
the case of a money transfer or payment by credit card), or sends a request
directly to the Storage System (in the case of payment upon delivery) to prepare
the package containing the ordered product. The same request is sent to the
Storage System when a confirmation is received from the Payment Check System
for the payment methods mentioned above.
When the Order Fulfilment System receives a message from the Storage System
that the package is ready for shipment, it sends a message to the user (using
the e-mail address of the user) that the order has been completed and is ready
for shipment. A message is also sent to the Logistics System to add the
package to the list of packages to be picked up and delivered on the next day.
When the package is sent out to the client, a message is sent to the Order
Fulfilment System to notify it that the package has been sent. Upon receiving
this message the system sends a notification e-mail to the user informing that
the package has been sent and stating the approximate delivery time. Sending
this notification e-mail finishes the order processing by the system.
As can be seen, even in the case of the simplest scenario the interaction
between the involved systems is quite complex. There are many systems involved
in the process of exchanging different types of information. It is worth keeping
in mind that the data being sent is subject to many changes and much
preprocessing before it can be consumed by the next system in the processing
chain (different data formats, internal data models of the applications, and
so on).
Designing the integration solution for this system would require the usage of
all of the components explained in the previous chapters, i.e. Message Routers,
Message Translators, Message Endpoints. The Message Routers could be used
to determine the destination of a message in the case when one system can send
messages to different receivers. The Message Translators could be used to trans-
late data contained in those messages so that they would fit the internal data
model of the receiver system.
The schema of the systems integrated by the messaging system and incorporating
the elements mentioned above is presented in Figure 6.3.
Although the diagram in Figure 6.3 may look simple and straightforward, it is,
in fact, an example of a badly designed, tightly coupled software
infrastructure. One of the disadvantages of this solution is the unmanageable
number of message channels (depicted on the figure as arrows). The major
disadvantage, however, is the fact that an application, in order to communicate
with other applications, must know a lot of details about them, for example
message channel addresses and message formats. Moreover, every time a message
format changes in one application, all applications communicating with that
application also have to be updated. For example, if the Storage System changes
the format of the date field, applications such as the Logistics System and the
Order Fulfilment System also have to change the format of the messages that they
send to the Storage System.
A solution to the previously mentioned problems might be the usage of
integration patterns: the Message Router (3.4) and the Message Transformer (3.5).
Figure 6.4 presents a new architecture utilising those concepts. Despite the fact
that the number of message channels has increased, this solution enables greater
decoupling of the applications. The knowledge about the format of messages
accepted by the system is no longer hard-coded inside each application, but is
delegated to a new, intermediary component, the Message Transformer (depicted
on the figure as the letter T). Likewise, decisions about the routing of
messages are not taken by each application, but by the Message Router (depicted
on the figure as the letter R). This approach results in looser coupling:
the message format of each application might be changed independently of the
others.
Figure 6.3: Case study: Message Channel scenario

Figure 6.4: Case study: Message Channel with Router and Translator scenario

Each system shown in Figure 6.4 has been equipped with endpoints that enable
the communication between the system and the messaging system. The number of
endpoints corresponds to the number of channels from which the given system can
receive messages or to which it can send them. Message Routers are used to
direct the messages to their destinations based on given business rules (e.g.
the name of the destination system placed within the message header). The Router
connected to the Internet Shop channel decides whether to send a message to the
Storage System (checks the availability of the product), to the Pricing System
(gets the prices for a given product) or to the Orders Fulfilment System, which
passes it (a message containing information about a placed order) on for
further fulfilment. The Router connected to the Order Fulfilment System channel
routes messages either to the Storage System (a message containing a request to
create the package for shipment), to the Payment Check System (checks whether
the payment for the ordered products has been made) or to the Internet Shop
(notification messages about the various stages of order fulfilment).
Message Transformers are used to transform messages to the format readable by
the recipient system (e.g. messages sent from the Storage System to the
Internet Shop need to be transformed by the Transformer component in order to be
correctly read by the Internet Shop). Let us assume that the Order Fulfilment
System stores information in such a way that the receiver's name and surname are
combined in the field receiver and the address is stored in the field package
destination. Thus, the role of the transformer would be to extract the
receiver's name and surname from the incoming message and put them in the field
receiver, and then to extract the information about the destination address and
put it in the field package destination. Only after those transformations can
the message be sent further to the Order Fulfilment System. After those changes
the data will be read and interpreted correctly by the system and will not cause
any errors during its processing.
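
The transformation described above can be sketched as a single function. The
thesis does not specify the format of the incoming message, so the source
fields used here (firstName, surname, street, city) are purely hypothetical;
only the target fields receiver and package destination come from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Message Transformer sketch; the source field names are assumptions.
final class MessageTransformerSketch {

    /** Maps the assumed sender format onto the fields expected by the Order Fulfilment System. */
    static Map<String, String> toOrderFulfilmentFormat(Map<String, String> source) {
        Map<String, String> target = new LinkedHashMap<>();
        // name and surname are combined in the single field "receiver"
        target.put("receiver", source.get("firstName") + " " + source.get("surname"));
        // the address parts are combined in the single field "packageDestination"
        target.put("packageDestination", source.get("street") + ", " + source.get("city"));
        return target;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("firstName", "Jan");
        source.put("surname", "Kowalski");
        source.put("street", "Koszykowa 86");
        source.put("city", "Warsaw");
        System.out.println(toOrderFulfilmentFormat(source));
    }
}
```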
The problem of the unmanageable number of message channels is resolved by
the introduction of an Enterprise Service Bus (Figure 6.5). In this scenario,
each of the systems communicates only with the ESB, which is responsible for
message transformation and routing. This approach simplifies the development and
management of applications, because there is only one message channel between
the application and the ESB. Moreover, an Enterprise Service Bus provides a wide
range of adapters (components for accessing an ESB), which removes the need
for an application to know the details of communication with the ESB.

Figure 6.5: Case study: Enterprise Service Bus scenario
An Enterprise Service Bus incorporates the implementation of multiple
integration design patterns, such as the Message Channel (3.3), the Message
Router (3.4), the Message Transformer (3.5), the Message Endpoint (3.6), etc.
However, the internals of that implementation are hidden from the applications
using an ESB. This simplifies their design and makes the process of integration
easier.
The example described above should give a good overview of the usage and
practical application of the concepts presented in the previous chapters. Even
with the most basic components, such as a Message Router and a Message
Translator, a complex integration solution can be designed.
Chapter 7

Implementation

This chapter covers the details of the implementation of the ESB platform
named pESB. It was developed by the authors of this thesis as an integral part
of this master thesis.
First, the basic concept behind this product, its technology and its
architecture will be described. Next, its internal operation will be
presented, illustrated by a sequence diagram. Having covered the essential
information, a real life example, which will familiarise the reader with the
process of configuring the ESB, will be introduced along with source code
snippets. Finally, the problems that occurred during the implementation will be
presented.

7.1 The origin of the name


pESB is an acronym for Polish-Japanese Institute of Information Technology
Enterprise Service Bus. Because the authors of pESB belong to a group of busy
people that do not want to waste time typing long names, the name pESB will
be used throughout this chapter.

7.2 Concept
Being aware of our limited time and resources, we decided to create an
integration solution that would be, first and foremost, lightweight and easy to
use. We knew that we would not be able to create a tool that could compete
with integration products that have existed on the market for several years.
Thus, we decided to create an integration product that provides simple
functionality, but has an architecture that enables easy scalability and further
development.
The main concept of this approach to an ESB solution was to base the
implementation on standards and technologies already available on the market,
like Java Enterprise Edition (EJB, JMS), XML, Web Services, etc. This approach
is very different from that of other existing ESB solutions. pESB takes
advantage of the services provided by the application server, like security
management, transaction management, pool and resource management, etc. It does
not come with its own Message Queue implementation. In our opinion this should
be treated as an advantage, because we do not force users to use any particular
solution. Nowadays, every company thinking about integration solutions already
has an infrastructure that might be used. Furthermore, our solution does not
reinvent the wheel. The products providing those features have been available on
the market for a long time, and the authors of those products have certainly had
to overcome a lot of problems and solve a lot of issues that appeared during the
usage of their products by customers.
pESB was designed as a multi-tier application (Figure 7.1). The basis of this
solution is the Message Bus. Its main responsibility is to provide a reliable
method of communication between multiple points using data channels.
The next layer is the Application server, providing a lot of facilities which
an application running in a container can take advantage of. These facilities
are as follows:

• Transaction management: the container is responsible for starting
transactions, committing them, and rolling them back automatically in case of
an error
• Security management: authentication and authorisation are done by the
container
• Pool management: the container is also responsible for managing pools
of Enterprise Java Beans instances and pools of connections to the resources
(allocating new ones, removing unused ones)
• Resource management: the container provides an application with
connectivity to resources like databases or JMS queues
• Multi-threading: the container is responsible for starting threads and
monitoring their work

pESB, as an application running in the container, might take advantage of all
the facilities mentioned above.

7.3 Technology
pESB is an enterprise application using a set of state-of-the-art Java
technologies:

• Java 6: runtime Java source code compilation
• EJB 3.0: no XML descriptors, the whole configuration written in annotations

pESB has been implemented on the Glassfish V2 application server, but since
there are no XML descriptors specific to the chosen application server, porting
pESB to any other EJB3-compliant application server would not require much
effort.

Figure 7.1: Integration architecture layers

7.4 Classification
According to the classification presented in the Forrester Wave comprehensive
report about Enterprise Service Bus products available on the market [13],
customer requirements might be divided into two segments:

• "keep it simple"

• "I want it all now"

For the first group of customers the most important thing is that the solution
should be as simple as possible in order to enable low-cost integration. Also,
it should have a plug-in architecture to enable easy customisation to the
customer's needs. The products from the second group, on the other hand, should
feature a wide range of additional services like business process management,
process simulation, monitoring or optimisation. pESB will suit customers from
the first segment.

7.5 Architecture
The main focus in the development of pESB has been put not on the optimisation
of particular methods or functionalities, but on having an architecture which
enables scalability, distribution, reliability and security of the solution.
One of the ways to achieve that goal was to prefer asynchronous processing to
synchronous processing. The internal communication between components of the
pESB is done using asynchronous messages (Figure 7.2). The message processed by
one component is put into the queue of the other component. Every message is
persisted. The combination of those two factors provides scalability and
reliability.
Message queues allow multiple consumers to run on different computers. Thus,
the number of consumers may change dynamically: new consumers might be added
and removed on the fly. This feature provides scalability. pESB might be easily
adjusted to the growing requirements of processing a greater number of
messages, simply by adding new application servers.
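
The scaling effect of several consumers reading from one queue can be
sketched with plain Java concurrency utilities. This is not the pESB
implementation: an in-memory BlockingQueue stands in for a persistent JMS
queue, threads stand in for application servers, and the class name and the
stop marker are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: several consumers compete for messages from one queue.
final class CompetingConsumersSketch {

    private static final String STOP = "__STOP__"; // marker telling a consumer to finish

    /** Sends messageCount messages to consumerCount consumers; returns how many were processed. */
    static int processAll(int messageCount, int consumerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService consumers = Executors.newFixedThreadPool(consumerCount);
        try {
            for (int c = 0; c < consumerCount; c++) {
                consumers.execute(() -> {
                    try {
                        while (!STOP.equals(queue.take())) {
                            processed.incrementAndGet(); // stands in for real message processing
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            for (int i = 0; i < messageCount; i++) {
                queue.put("message-" + i);
            }
            for (int c = 0; c < consumerCount; c++) {
                queue.put(STOP); // one stop marker per consumer
            }
            consumers.shutdown();
            if (!consumers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for consumers", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("Processed: " + processAll(1000, 4));
    }
}
```

The producer does not know how many consumers exist, so their number can be
changed without modifying the sending side.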

Figure 7.2: pESB architecture

The overall process of communication is depicted in Figure 7.3.
The client communicates with the ESB using an Agent, which is a software
component providing an API to access the pESB. From the client's point of view
(the clients are depicted on the figure as System1 and System2) the whole
complexity of message based communication, asynchronous processing, XML
documents, etc. is hidden. Besides providing an interface for receiving and
fetching messages, pESB also provides a method of dynamic configuration
(interface Config) using a Java API.

Figure 7.3: pESB communication architecture

7.6 Processing sequence


In the presented processing sequence (Figure 7.4) there are three participants:

• a Sender system, named System1

• the pESB

• a Receiver system, named System2

System1 is communicating with System2. Both systems use agents to communicate
with the integration platform. The agent is a software component responsible
for all communication with the pESB. The details of that communication, such as
whether it is done using EJB or SOAP/HTTP, are hidden from the clients (System1
and System2). Moreover, to simplify the design of the System1 and System2
clients, they operate on Data Transfer Objects (DTO), not on XML documents.
These are the actions that happen when System1 sends a message to the pESB and
System2 is set as the receiver of that message:

Figure 7.4: pESB processing sequence

1. System1 invokes the send method of the Agent, passing as an argument the
Data Transfer Object that it would like to send to the ESB.

2. System1’s Agent receives the DTO and creates an XML message out of the
DTO content, then invokes the method from the remote interface of pESB
using EJB and passes that XML document as an argument.

3. The ReceiverBean in pESB receives that XML document and puts it into
an input queue and returns information back to the Agent about the status
of that operation.

4. After some time...

5. The Message-Driven Bean named TransformerBean in pESB fetches the message
from the queue, asks the Transform and routing engine to perform transformation
and routing of that message, then:

• If the message has already been processed by pESB and is about to be routed
to the destination system, sends it to the output queue.
• If not, sends it once again to the input queue.

6. After some time...

7. When System2 wants to fetch new messages, it invokes the receive method of
the Agent.

8. System2’s Agent invokes the method from the remote interface of pESB
using EJB.

9. The FetcherBean in pESB checks whether there are any new messages waiting in
the output queue for System2; if there is a new message, it fetches it and
returns it to System2’s Agent.

10. System2’s Agent creates a new Data Transfer Object using the content of the
XML document and returns that new object to System2.

It is worth mentioning that all the operations performed inside the Transform
Routing Unit are performed in a transaction. This means that if an error occurs
during processing, the message is not dropped.
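
The sequence above can be walked through in a few lines of code. The sketch
below is a single-process simulation under stated assumptions: the two queues
are plain in-memory collections, the DTO carries a single text field, the XML
is built by string concatenation without escaping, and the transformation is
an arbitrary upper-casing. None of these names or choices come from pESB.

```java
import java.util.ArrayDeque;
import java.util.Locale;
import java.util.Queue;

class ProcessingSequenceSketch {
    /** Hypothetical DTO carrying a single text payload. */
    static final class TextDto {
        final String text;
        TextDto(String text) { this.text = text; }
    }

    private static final String OPEN = "<message>";
    private static final String CLOSE = "</message>";

    // Stand-ins for the persistent input and output queues.
    private final Queue<String> inputQueue = new ArrayDeque<>();
    private final Queue<String> outputQueue = new ArrayDeque<>();

    /** Steps 1-3: the sender's Agent turns the DTO into XML and hands it over. */
    int send(TextDto dto) {
        inputQueue.add(OPEN + dto.text + CLOSE);
        return 0; // status reported back to the Agent
    }

    /** Step 5: one pass of the transformer; the message moves to the output queue. */
    void transformOnce() {
        String xml = inputQueue.poll();
        if (xml == null) {
            return; // nothing to process yet
        }
        outputQueue.add(OPEN + body(xml).toUpperCase(Locale.ROOT) + CLOSE);
    }

    /** Steps 7-10: the receiver's Agent pulls XML and rebuilds a DTO, or returns null. */
    TextDto receive() {
        String xml = outputQueue.poll();
        return xml == null ? null : new TextDto(body(xml));
    }

    private static String body(String xml) {
        return xml.substring(OPEN.length(), xml.length() - CLOSE.length());
    }
}
```

Neither System1 nor System2 ever sees the XML form: both work only with the
DTO, which is the simplification the Agent is meant to provide.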

7.7 Configuration
All configuration is done at runtime: no restart or reload is required after
changing the configuration, and no XML file editing is required. The whole
configuration can be changed using a Java API. Furthermore, even Java code
provided by the user is compiled on the fly, and the functionality it provides
is available right away.
Each component in the pESB, whether a router or a transformer, can be
configured in multiple ways. For example, the transformer can be configured
using either an XSLT file or Java code. It is up to the user which method of
configuration he/she chooses. In the case of Java code, the source code
provided by the user is compiled on the fly, during the process of configuring.
This approach has a number of advantages over dynamic execution of the code.
The compilation process can detect many errors that, in the case of dynamic
execution, would be found only at runtime. Moreover, the user can develop a
component in an IDE of his/her choice (Eclipse, NetBeans, IDEA, etc.), which
helps to avoid common programming mistakes such as misspelled variable or
method names, wrong types and so on.
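
One way to compile user-supplied source at configuration time is the standard
javax.tools API, sketched below. The text does not say which mechanism pESB
uses, so this is only an illustration of the idea. The class name, the
transform(String) convention and the use of a temporary directory (which is
not cleaned up here) are assumptions, and a JDK rather than a bare JRE is
required at runtime.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class OnTheFlyCompilerSketch {
    /**
     * Compiles source, which must declare a public top-level class named className
     * with a public static String transform(String) method, and invokes it on input.
     */
    static String compileAndRun(String className, String source, String input) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; a JDK is required");
        }
        try {
            Path dir = Files.createTempDirectory("compile-sketch");
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));

            // Errors in the user's code surface here, at configuration time.
            ByteArrayOutputStream diagnostics = new ByteArrayOutputStream();
            int status = compiler.run(null, null, diagnostics,
                    "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalArgumentException("Compilation failed:\n" + diagnostics);
            }

            try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                Method method = loader.loadClass(className).getMethod("transform", String.class);
                return (String) method.invoke(null, input);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A syntax error in the supplied source is reported as an exception when the
component is configured, not when the first message is processed.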
The configuration of pESB is performed using a Java API. We believe that this
solution provides the greatest flexibility, because it does not dictate how the
configuration data is stored. With this approach, it is possible to keep the
configuration in an XML file, an LDAP directory, a relational database, etc.
The only thing that must be done is to create an import tool, which reads the
data from the particular medium (XML, LDAP, database), parses it and invokes
the proper methods of the Java API. Moreover, this approach also makes it
possible to put some additional logic in the import tool itself, which is
executed before the configuration Java API is called. Furthermore, it is easier
to modify an import tool (XML or database) than a system (pESB).
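
A minimal import tool along these lines might look as follows. The sketch
assumes, purely for illustration, that the configuration is kept in Java
properties text under a key named systems, and that the configuration API is
reduced to a single hypothetical registerSystem(String) method returning 0 on
success.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class ImportToolSketch {
    /** Hypothetical, heavily reduced view of the configuration API. */
    interface ConfigApi {
        int registerSystem(String systemName);
    }

    /** Reads "systems=A,B,C" and registers each system; returns the names that were accepted. */
    static List<String> importSystems(String propertiesText, ConfigApi api) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalArgumentException("Unreadable configuration", e);
        }
        List<String> registered = new ArrayList<>();
        for (String raw : props.getProperty("systems", "").split(",")) {
            String name = raw.trim();
            // Extra logic lives in the import tool: blank names never reach the API.
            if (name.isEmpty()) {
                continue;
            }
            if (api.registerSystem(name) == 0) {
                registered.add(name);
            }
        }
        return registered;
    }
}
```

Switching the storage medium would mean replacing only the parsing part of
such a tool; the calls to the configuration API stay the same.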

7.7.1 Configuration model


The model of pESB configuration (Figure 7.5) incorporates two components:

• System

• Transform Routing Unit (TRU)


The System represents a system taking part in the communication. Every System
has an input, responsible for receiving messages from the application and
forwarding them to the pESB, and an output, responsible for receiving messages
from the pESB and forwarding them to the application.
The Transform Routing Unit represents a special unit responsible for
transformation and/or routing. Every TRU has an input and an output, each of
which can be either another TRU or a System. This gives great flexibility to
whoever is responsible for configuring pESB for particular integration needs: a
message from the output of one TRU can be sent to the input of another TRU.
Moreover, this unit can be considered an independent component, with an input,
an output and business logic responsible for transformation and routing, which
enables reuse of that component in different integration projects or by
multiple systems.

Figure 7.5: pESB configuration model

7.7.2 Configuration example


To better understand the process of configuring pESB, let us present an example
configuration. The example is based on the case study described in chapter 6,
"Case Study". Because the case study is complex and involves multiple systems,
only an excerpt of the whole integration solution will be presented in this
section.
The presented integration solution will only involve the following systems:

1. Internet Shop: responsible for the interaction with the user and for placing
(and confirming) orders

2. Orders Fulfilment System (Order System): responsible for fulfilling the
orders

3. Storage System: responsible for providing information about product supplies

One of the requirements of the proposed solution is that it must make an
application independent of other applications’ message formats. A message
format change in one application must not force a change in the message formats
of the applications communicating with it. For example, an Internet Shop
message format change should not force a change in the Orders Fulfilment System
message format.
Moreover, the business logic responsible for deciding where a particular
message should be forwarded must be separated from the applications. It should
be provided by a component that can be changed independently of any
application.
The above requirements can be fulfilled by the pESB using the following
configuration (Figure 7.6):

1. Systems (the input and output of the messages):

• InternetShop
• OrderSystem
• StorageSystem

2. Transform Routing Units (responsible for transformation of the message
format):

• InternetShop
• OrderSystem
• StorageSystem

3. Transform Routing Unit named Router (responsible for routing of the messages
based on business rules)

7.7.3 More on transformers and routers


Component configuration is designed as a plug-in architecture, which means that
the basic set of configuration methods can be extended with new ones.
Transformer and router are actually names of interfaces, so in order to create
a new method of configuration one simply creates a new class that implements
the interface. pESB comes with two implementations of the transformer interface
(Java code and an XSLT file) and one implementation of the router interface
(Java code). This lets the user decide which technology is used to configure a
particular component.
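
The plug-in idea can be sketched with an interface and one implementation. The
text names the transformer interface but does not list its methods, so the
MessageTransformer contract below is an assumption; the XSLT implementation
uses the standard JAXP classes shipped with the JDK and is not taken from pESB.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TransformerPluginSketch {
    /** Hypothetical plug-in contract: one XML message in, one XML (or text) message out. */
    interface MessageTransformer {
        String transform(String xmlMessage);
    }

    /** One possible plug-in: applies an XSLT stylesheet supplied as a string. */
    static final class XsltTransformer implements MessageTransformer {
        private final Templates templates;

        XsltTransformer(String xsltCode) {
            try {
                // Compiling once up front rejects a malformed stylesheet at configuration time.
                templates = TransformerFactory.newInstance()
                        .newTemplates(new StreamSource(new StringReader(xsltCode)));
            } catch (TransformerException e) {
                throw new IllegalArgumentException("Invalid XSLT", e);
            }
        }

        @Override
        public String transform(String xmlMessage) {
            try {
                StringWriter out = new StringWriter();
                templates.newTransformer().transform(
                        new StreamSource(new StringReader(xmlMessage)),
                        new StreamResult(out));
                return out.toString();
            } catch (TransformerException e) {
                throw new IllegalStateException("Transformation failed", e);
            }
        }
    }
}
```

A further configuration method would be one more class implementing the same
interface; the code that calls transform does not need to change.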

Figure 7.6: pESB configuration example

7.7.4 Performing the configuration


The process of configuring pESB consists of two steps:

• creating Transform Routing Units (with Routers and Transformers) and Systems

• creating connections between Systems and TRUs

The following snippet of code creates a Transform Routing Unit object with an
XSLT Transformer:
String xsltCode = "<?xml ...";
TransformerDTO transformerDTO = new TransformerDTO(TransformerDTO.TYPE_XSLT);
transformerDTO.setParameter(TransformerDTO.CODE, xsltCode);

// routerDTO is assumed to have been created beforehand (not shown here)
TransformRoutingUnitDTO truDTO =
    new TransformRoutingUnitDTO("OrderSystem", transformerDTO, routerDTO);

Once the Transform Routing Unit object has been created (variable truDTO), it
must be registered at pESB:
int result1 = bean.registerTransformRoutingUnit(truDTO);
// (bean is the EJB remote interface of the pESB configuration bean)

Variable result1 contains the status of the registration: 0 in case of success
and -1 in case of an error.
After creating the Transform Routing Units, the Systems must be created:
SystemDTO systemDTO = new SystemDTO("OrderSystem");
int result2 = bean.registerSystem(systemDTO);

The next step in the process of configuration is creating links between
Transform Routing Units and Systems. The following snippet creates the
appropriate connections:
int result3 = bean.setTransformRoutingUnitInput(truDTO, systemDTO);
int result4 = bean.setTransformRoutingUnitOutput(truDTO, systemDTO);

int result5 = bean.setSystemOutput(systemDTO, truDTO);
int result6 = bean.setSystemInput(systemDTO, truDTO);

It is worth mentioning that the links should be created on both sides, which in
the context of the above example means that the OrderSystem output and input
must be connected with the TRU, and the TRU input and output must be connected
with OrderSystem.
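
Because four calls are needed for every System and TRU pair, and each returns a
status code, a small helper can keep the two sides consistent. The sketch below
is an assumption about how such a helper could look: the ConfigBean interface
reuses the method names from the snippets above but simplifies the parameters
to plain names, and treating any non-zero status as a failure is our choice.

```java
class LinkingSketch {
    /** Hypothetical stand-in for the configuration bean, with parameters reduced to names. */
    interface ConfigBean {
        int setTransformRoutingUnitInput(String tru, String system);
        int setTransformRoutingUnitOutput(String tru, String system);
        int setSystemOutput(String system, String tru);
        int setSystemInput(String system, String tru);
    }

    /** Creates all four links and stops at the first call that does not return 0. */
    static void linkBothWays(ConfigBean bean, String tru, String system) {
        check("TRU input", bean.setTransformRoutingUnitInput(tru, system));
        check("TRU output", bean.setTransformRoutingUnitOutput(tru, system));
        check("System output", bean.setSystemOutput(system, tru));
        check("System input", bean.setSystemInput(system, tru));
    }

    private static void check(String step, int status) {
        if (status != 0) {
            throw new IllegalStateException(step + " link failed with status " + status);
        }
    }
}
```

With such a helper a one-sided link cannot go unnoticed, because a failed call
stops the configuration with an error instead of leaving an ignored result.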

7.8 Integration design patterns supported by pESB


This implementation of the Enterprise Service Bus utilises the following integra-
tion design patterns:

• Message (3.1): a data record used to exchange data between an application and
pESB, and also between internal components of pESB

• Message Channel (3.3): in order to provide reliability (the guarantee that no
message will ever be lost), all communication between internal components of
the pESB is performed using JMS Queues

• Message Router (3.4) and Message Translator (3.5): this functionality is
provided by a user-defined Router and Transformer, which are parts of
a Transform Routing Unit (7.7.1). Because pESB does not put any restrictions on
the operations that may be performed on the received message inside those
components, many variants of those two integration design patterns can be
created, such as the following:

– Fixed Router, Content-Based Router, Context-Based Router (3.4)
– Message Filter, Recipients List (4)
– Content Enricher, Content Filter (4)

Despite the simplicity of this integration solution, it is also possible to
create other, more complex integration design patterns, such as the following:

• Normalizer (4): this design pattern has actually been used to solve the case
study integration problem (7.7.2). The solution consists of multiple Transform
Routing Units performing the role of Message Translators and one central
Transform Routing Unit performing the role of a Message Router

• Message Broker (4): pESB itself might be perceived as a Message Broker
pattern. pESB is a central component of the system where all message routing
operations are performed. It also reduces the number of message channels,
because an application needs only one message channel, the one to the pESB, in
order to communicate with other systems.
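
To show how several of the routing variants listed above can share one
user-defined contract, here is a minimal sketch. The MessageRouter interface,
the destination names and the stockQuery element are hypothetical; the real
router interface of pESB is not specified in the text.

```java
import java.util.Optional;
import java.util.function.Function;

class RouterVariantsSketch {
    /** Hypothetical router contract: returns the destination name, or empty to drop the message. */
    interface MessageRouter extends Function<String, Optional<String>> { }

    /** Fixed Router: every message goes to the same destination. */
    static MessageRouter fixed(String destination) {
        return message -> Optional.of(destination);
    }

    /** Content-Based Router: the destination depends on what the message contains. */
    static MessageRouter contentBased() {
        return message -> Optional.of(
                message.contains("<stockQuery") ? "StorageSystem" : "OrderSystem");
    }

    /** Message Filter: forwards only messages containing the given fragment, drops the rest. */
    static MessageRouter filter(String requiredFragment, String destination) {
        return message -> message.contains(requiredFragment)
                ? Optional.of(destination)
                : Optional.<String>empty();
    }
}
```

All three variants implement the same contract, so swapping one for another
changes the routing behaviour without changing the surrounding unit.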

7.9 Problems encountered during the implementation
During the implementation phase we realised that it is a very complex task to
implement functionality that delivers messages in the push mode, mainly because
such a method requires a distributed transaction. After notifying the client
that there is a pending message, pESB would have to wait until the client
fetches that message. This would have to be done in a transaction, because the
client may be experiencing problems at that moment and may not be able to fetch
the message (or there might be a communication timeout or some other problem).
In such situations the message should be returned to the queue. Bearing in mind
that pESB is an application running in an EJB container, where transactions are
handled automatically by the transaction manager, this could be a challenging
task. Moreover, the notification mechanism would require pESB to connect to the
client. In order to facilitate this, the client would have to keep a port open
on which pESB could reach it. That in turn would put additional requirements on
the client, not only on its design but also on its configuration (e.g. it would
not be possible to use the client behind a firewall). The solution to the above
might seem to be initialising the connection to the client from the pESB side,
but that would complicate pESB and would require pESB to have a mechanism for
reinitialising the connection after it has been broken because of problems
experienced by the client. Again, pESB is an application running in an EJB
container, which hinders manual (performed by the application) management and
sharing of connections, because resource management is done by the application
server, not directly by the application. This makes that solution very
difficult to implement.
Despite the many advantages of the push mode approach, we have decided to
implement the pull mode.
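
The difference in complexity can be seen in a small sketch of the pull mode.
The client initiates every exchange, so no open port is needed on its side, and
a failed delivery can be undone locally by putting the message back. The code
below is an illustration under stated assumptions: an in-memory queue replaces
the persistent one, a catch block stands in for a transaction rollback, and all
names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

class PullModeSketch {
    private final Deque<String> outputQueue = new ArrayDeque<>();

    void enqueue(String message) {
        outputQueue.addLast(message);
    }

    int pending() {
        return outputQueue.size();
    }

    /**
     * One pull initiated by the client: takes the oldest message and passes it to the
     * client's handler. If the handler throws, the message goes back to the head of the
     * queue. Returns true only if a message was delivered successfully.
     */
    boolean pullOnce(Consumer<String> clientHandler) {
        String message = outputQueue.pollFirst();
        if (message == null) {
            return false; // nothing waiting; the client simply asks again later
        }
        try {
            clientHandler.accept(message);
            return true;
        } catch (RuntimeException e) {
            outputQueue.addFirst(message); // crude stand-in for a transaction rollback
            return false;
        }
    }
}
```

The price of this simplicity is that the client has to poll, so a new message
is noticed only at the next pull.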
Chapter 8

Summary

The integration problem has been well known since the 1970s. Since that time
many technologies and integration styles have been in use; some of them have
been described in chapter 2. The latest approach to the integration problem is
the Enterprise Service Bus, based on the concept of messages and asynchronous
communication. Our software project, an implementation of an ESB, tries to fill
the gap in the market for such products. The main focus in its development was
on using well-known, well-tested, existing products and technologies available
on the market, such as Java, a Java Enterprise Edition (Java EE) application
server acting as a container (hosting platform) for an Enterprise JavaBeans
(EJB) application, Java Message Service (JMS) and XML. We believe that those
technologies, combined with a carefully considered architecture, will enable
our product to be a reliable, efficient, secure and, most importantly in the
era of constantly growing IT systems, scalable integration solution.
The domain of this paper, integration, usually concerns large IT systems, quite
often systems that are crucial for the operation of a company, so introducing
a solution that is not of high quality is not an option. A high-quality
solution is one that is reliable, secure, easily scalable, distributed and
high-performing. Only such solutions can serve as an answer to the integration
problem; for the reasons mentioned at the beginning of this paragraph, there is
no place for poorly tested, buggy, unreliable software products. This, as is
well known in IT, is very difficult to achieve. To help with this challenging
task, integration design patterns may come in handy.
As has already been mentioned in the introduction to this thesis, due to time
and resource limitations our effort was directed so that the created
application would fit into an empty niche: a lightweight solution suitable for
solving the most common integration problems. For obvious reasons, it was not
possible to create, in such a short time, software that would provide
functionality comparable to the solutions delivered by large software vendors.
Nevertheless, aiming for this niche made it possible to create fully functional
software and, at the same time, to gain knowledge about the topic of
application integration. The main assumptions that were made for the
application have been outlined in the first chapter of this paper. Now that it
is finished and working, we can say that those assumptions have been fulfilled.
Ease of use and simplicity were our main goals, and they have been achieved. We
not only managed to describe the topic that we decided to undertake from the
theoretical point of view, but we also used this theoretical knowledge while
creating our application. We also managed to apply the design patterns
presented in this paper in a practical way, in order to obtain the desired
results.
Due to the shortage of time, our application lacks a graphical user interface
that would simplify its usage. Creating such a user interface would
significantly speed up the process of designing an integration solution. This
feature is one possible direction for the further development of this
application. Another is to enrich its functionality by adding a larger number
of design patterns for the user to choose from. While adding those new
patterns, the overall simplicity of the solution should be kept in mind, as it
is one of the most important features of this application and must not be lost
during its further development.
The application that has been created during our work can be seen as a base
system, which can be further extended in various ways to improve its
functionality and ease of use. Sample ways in which it can be extended have
already been pointed out in the previous paragraph. Those are of course only
the extensions that, in our opinion, would contribute most significantly to the
development of our system; other directions are also possible. By gradually
developing it and adding new features, a very powerful integration tool can be
created, which can be used to solve not only the basic integration problems but
also more sophisticated ones. Of course, it will not be able to compete with
solutions provided by the large IT companies, but over time it can become an
interesting alternative to complicated and expensive solutions.
Our work has shown that the knowledge of integration design patterns is
essential, not only for the designers and developers of integration solutions,
but also for the people involved in creating IT products. Nowadays, even the
simplest applications, created to simplify everyday routines, may one day be
integrated into a large software infrastructure. Awareness of the problems
covered in this thesis makes it possible to design and implement solutions
flexible enough to one day become a part of some other, larger system.
Bibliography

[1] Christopher Alexander. A Pattern Language: Towns, Buildings, Construction.
Oxford University Press, 1977. [cited at p. 43]

[2] Christoph Bussler. B2B Integration. Springer, 2003. [cited at p. 20]

[3] David Chappell. Enterprise Service Bus. O’Reilly Media, Inc., 2004.
[cited at p. 53, 59, 60, 61, 62]

[4] Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides. Design Patterns:
Elements of Reusable Object-Oriented Software. Addison-Wesley Professional,
1995. [cited at p. 43]

[5] Thomas Erl. Service-Oriented Architecture (SOA): Concepts, Technology, and
Design. Prentice Hall PTR, 2005. [cited at p. 21]

[6] Martin Fowler. Patterns of Enterprise Application Architecture.
Addison-Wesley Professional, 2002. [cited at p. 11]

[7] Gregor Hohpe, Bobby Woolf. Enterprise Integration Patterns: Designing,
Building, and Deploying Messaging Solutions. Addison-Wesley Professional, 2003.
[cited at p. 7, 17, 23, 24, 26, 27, 30, 32, 33, 35, 36, 38, 39, 44, 45, 46, 47,
48, 49, 50]

[8] William Grosso. Java RMI. O’Reilly Media, Inc., 2001. [cited at p. 11, 26]

[9] Jeremy Bolie, Michael Cardella, Stany Blanvalet and Matjaz Juric. BPEL
Cookbook: Best Practices for SOA-based integration and composite applications
development. Packt Publishing, 2006. [cited at p. 52, 57]

[10] Jim Waldo, Geoff Wyant, Ann Wollrath, Sam Kendall. A Note on Distributed
Computing. Sun Microsystems Laboratories, Inc., 1994. [cited at p. 27]

[11] Doug Kaye. Loosely Coupled: The Missing Pieces of Web Services. RDS Press,
2003. [cited at p. 9]

[12] Martin Keen, Amit Acharya, Susan Bishop, Alan Hopkins, Sven Milinski, Chris
Nott, Rick Robinson, Jonathan Adams, Paul Verschueren. Patterns: Implementing
an SOA Using an Enterprise Service Bus. IBM Corp., 2004. [cited at p. 55]

[13] Mike Gilpin, Ken Vollmer. The Forrester Wave: Enterprise Service Bus, Q4
2005. Forrester Research, Inc., 2005. [cited at p. 75]

[14] Roy W. Schulte. Predicts 2003: Enterprise Service Buses Emerge. Gartner,
Inc., 2002. [cited at p. 51]

[15] Scott McLean, James Naftel, Kim Williams. Microsoft .NET Remoting.
Microsoft Press, 2002. [cited at p. 11]

[16] Wing Lam, Venky Shankararaman. Enterprise Architecture and Integration:
Methods, Implementation and Technologies. IGI Global, 2007. [cited at p. 7]
Appendices

List of Figures

2.1 File Transfer style . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23


2.2 Shared Database style . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3 Remote Procedure Invocation style . . . . . . . . . . . . . . . . . . . 26
2.4 Messaging style . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.1 Design Pattern Message . . . . . . . . . . . . . . . . . . . . . . . . . 30


3.2 Design Pattern Channel . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 Design Pattern Router . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.4 Design Pattern Translator . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5 Design Pattern Endpoint . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.6 Overview of a communication based on message design patterns . . . 39

4.1 Design Pattern Content-Based Router . . . . . . . . . . . . . . . . . 44


4.2 Design Pattern Message Filter . . . . . . . . . . . . . . . . . . . . . . 44
4.3 Design Pattern Dynamic Router . . . . . . . . . . . . . . . . . . . . . 45
4.4 Design Pattern Recipients List . . . . . . . . . . . . . . . . . . . . . . 45
4.5 Design Pattern Splitter . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.6 Design Pattern Aggregator . . . . . . . . . . . . . . . . . . . . . . . . 46
4.7 Design Pattern Routing Slip . . . . . . . . . . . . . . . . . . . . . . . 46
4.8 Design Pattern Process Manager . . . . . . . . . . . . . . . . . . . . 47
4.9 Design Pattern Message Broker . . . . . . . . . . . . . . . . . . . . . 48
4.10 Design Pattern Envelope Wrapper . . . . . . . . . . . . . . . . . . . . 48
4.11 Design Pattern Content Enricher . . . . . . . . . . . . . . . . . . . . 49
4.12 Design Pattern Content Filter . . . . . . . . . . . . . . . . . . . . . . 49
4.13 Design Pattern Normalizer . . . . . . . . . . . . . . . . . . . . . . . . 50

5.1 Overview of a communication based on RPC and MOM . . . . . . . 53


5.2 Tightly coupled system architecture compared to ESB . . . . . . . . 54
5.3 Conceptual overview of an Enterprise Service Bus components . . . . 59
5.4 ESB Design Pattern VETO . . . . . . . . . . . . . . . . . . . . . . . 59
5.5 Comparison of a typical transformer and Two-step XRef . . . . . . . 61

6.1 Case study systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67


6.2 Case study processing sequence . . . . . . . . . . . . . . . . . . . . . 68
6.3 Case study: Message Channel scenario . . . . . . . . . . . . . . . . . 70


6.4 Case study: Message Channel with Router and Translator scenario . 71
6.5 Case study: Enterprise Service Bus scenario . . . . . . . . . . . . . . 72

7.1 Integration architecture layers . . . . . . . . . . . . . . . . . . . . . . 75


7.2 pESB architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
7.3 pESB communication architecture . . . . . . . . . . . . . . . . . . . . 77
7.4 pESB processing sequence . . . . . . . . . . . . . . . . . . . . . . . . 78
7.5 pESB configuration model . . . . . . . . . . . . . . . . . . . . . . . . 80
7.6 pESB configuration example . . . . . . . . . . . . . . . . . . . . . . . 82
List of Tables

2.1 Trade-offs between synchronous and asynchronous model . . . . . . . 22

3.1 Levels of data transformation . . . . . . . . . . . . . . . . . . . . . . 37

Index

Aggregator, 45
asynchronous communication, 20

big-endian system, 13

command message, 30
Content Enricher, 48
Content Filter, 49
Content-Based Router, 35, 44
Context-Based Router, 35
COTS, 7

data timeliness issue, 20
Datatype Channels, 33
Dead Letter Channel, 33
Dead Message Queue, 33
design pattern, 43
document message, 30
Dynamic Router, 44

Enterprise Service Bus, 51
Envelope Wrapper, 48
event message, 30

file transfer integration style, 22
Fixed Router, 34
Forward Cache Integration pattern, 62

integration style: Remote Procedure Invocation, 26
Invalid Message Channel, 33

legacy application, 7
loose coupling, 9

message body, 30
Message Broker, 47
message channel, 32
Message Endpoint, 38
Message Filter, 44
message header, 30
Message Oriented Middleware, 52
message properties, 30
Message Router, 34
message sequence, 30
Message Transformer, 36
Message Translator, 36
messaging integration style, 27

Normalizer, 49

Point-to-Point Channel, 33
Process Manager, 47
Publish-Subscribe Channel, 33

Recipients List, 45
response-reply message, 30
Routing Slip, 46

shared database integration style, 24
single point of failure, 25
small-endian system, 13
Splitter, 45
synchronous communication, 20

tight coupling, 10
Transaction Management, 56
Two-step XRef pattern, 61

VETO pattern, 59
VETOR pattern, 60
