
University of Twente

Literature Study
Business Process Management in
the cloud: Business Process as a
Service (BPaaS)
Author:
Evert Duipmans
Supervisor:
Dr. Luís Ferreira Pires
April 1, 2012
Contents
1 Introduction
1.1 Motivation
1.2 Objectives
1.3 Approach
1.4 Structure of the report
2 Business Process Management
2.1 BPM lifecycle
2.2 Design
2.2.1 Business Process Model and Notation (BPMN)
2.3 Implementation
2.3.1 Orchestration vs. Choreography
2.4 Enactment
2.5 Business Process Management System
2.6 Conclusion
3 Cloud Computing
3.1 General benefits and drawbacks
3.2 Service models
3.2.1 Infrastructure as a Service (IaaS)
3.2.2 Platform as a Service (PaaS)
3.2.3 Software as a Service (SaaS)
3.3 Cloud types
3.3.1 Public Cloud
3.3.2 Private Cloud
3.3.3 Hybrid Cloud
3.3.4 Community Cloud
3.4 Cloud providers
3.4.1 Amazon
3.4.2 Microsoft Windows Azure
3.4.3 Google App Engine
3.5 Conclusion
4 BPM and cloud computing
4.1 General benefits and drawbacks
4.2 Service Models
4.2.1 IaaS
4.2.2 PaaS
4.2.3 SaaS
4.3 Combining on-premise and cloud
4.3.1 Architecture
4.3.2 Optimal distribution
4.4 Social BPM
4.5 Environments
4.5.1 Product overview
4.5.2 Oracle Fusion
4.5.3 Appian
4.5.4 Fujitsu InterstageBPM.com
4.5.5 Barium Live
4.6 Conclusion
5 Research directions
5.1 BPEL engine deployment in the cloud
5.2 Social BPM
5.3 Process decomposition
6 Conclusion
Chapter 1
Introduction
This work investigates the possibility of integrating Business Process
Management (BPM) with cloud computing. Both subjects are described in more
detail, and existing research and tools that combine them are investigated.
This chapter gives the motivation behind the subject, the objectives and
approach of the research, and explains the structure of the report.
1.1 Motivation
BPM has gained a lot of popularity over the last two decades. By identifying
and managing business processes, companies gain insight into what they are
doing and which parts of a process can be automated or optimized. This may
lead to lower costs, better customer satisfaction or optimized processes for
creating new products at lower cost [30].
Business Process Management Systems (BPMS) are able to capture business
processes and keep track of running instances of these processes. A business
process can be described using workflows. Workflows consist of activities
which are either human-based, system-based or a combination of the two in
case of human-system interaction. A BPMS contains a workflow execution
engine which coordinates the execution of a business process step by step.
Each process instance is monitored by the BPMS and business users are able
to look into these process instances. The process instances give insight into
the progress of the processes and show if processes are completed success-
fully, or have failed. In case of failure, the process instance shows in which
part of the process the failure occurred. By monitoring, evaluating and
changing business processes, companies are able to optimize their processes.
To be able to control and monitor a business process, a BPMS needs to
communicate with the participants in the process. Introducing BPM into
a company might lead to integration issues, especially since legacy software
may not have standardized interfaces. Not all software that participates in
the business process offers clear interfaces for controlling or monitoring the
software.
Nowadays many software vendors offer their products as a service. The
software is hosted on a server and customers are able to log in to the software
using the Internet, mostly by using a browser. One of the big advantages
here is that companies can change their product relatively easily without
having to distribute the updates to all their customers. Instead, they only
have to adjust the software that is running on their own servers. Many
of these services are built using Service-Oriented Architecture (SOA)
principles. SOA [27] allows designing and developing software services in a
uniform way, so that reuse is enabled. This gives developers support for
integrating and composing services.
Offering software as a service also gives providers new challenges. In the
previous situation, where software was shipped to the user, the hardware on
which the software would run was provided and managed by the user. In the
new situation the software runs on the machines of the provider. This means
that the provider is responsible for the hardware and that maintenance has
to be performed by the provider. The provider can either invest in
servers and personnel to perform these tasks, or outsource the hosting
of the software to a third-party provider. Cloud
computing [9] is an example of a model where computing resources are
offered to the user as a service.
The NIST describes cloud computing as a model for shared configurable
computing resources that can be rapidly provisioned and released with
minimal management effort or service provider interaction [25]. Cloud
computing provides users with 3 service models:
Software as a Service (SaaS) SaaS is a model in which software is
offered as a service to the user. The software is hosted on a server and
users access the software by using a web browser.
Platform as a Service (PaaS) PaaS is the offering of a computing
platform as a service. Users are able to deploy their applications on such
a platform. The platform offers auxiliary functionality such as a web
server, databases, load balancing and more.
Infrastructure as a Service (IaaS) With IaaS, the cloud offers platform
virtualization to the customer. The user is offered a virtual machine
with some storage. Instead of buying servers and other network
equipment, users just rent these resources.
1.2 Objectives
The goal of this work is to investigate the possibility of combining BPM and
cloud computing. By moving BPMS software to the cloud, companies do
not have to buy and maintain expensive servers to manage and coordinate
their own business process. Instead, the business process can be uploaded
to a service in the cloud that performs and monitors the process.
This report gives an overview of problems and solutions of business process
management in the cloud, by looking at existing solutions and scientific
papers that discuss the topic. The ultimate goal of this work is to identify
research issues for a master assignment.
1.3 Approach
The following steps are taken in order to pursue the defined goals:
1. Study the literature about BPM and identify the phases and tools
within BPM that can benefit from cloud computing.
2. Study the literature about cloud computing, in order to identify the
benefits and drawbacks of cloud computing in general and also the
specific benefits and drawbacks of the 3 service models.
3. Study literature about the combination of both cloud computing and
BPM.
4. Discuss existing products where BPM is already combined with cloud
computing.
5. Devise possible opportunities for further research.
1.4 Structure of the report
Chapter 2 explains business process management by considering the BPM
lifecycle and investigating several tools and languages that are useful in
the management process. Chapter 3 elaborates on cloud computing. The
fundamentals of cloud computing are described and an overview of cloud
platforms is given. Chapter 4 investigates the combination of BPM and
cloud computing by discussing scientific literature about the subject and
by giving an overview of existing software. Chapter 5 identifies research
directions for a master assignment. Chapter 6 gives our conclusions.
Chapter 2
Business Process
Management
The goal of BPM is to identify the internal business processes of an
organization, capture these processes in process models, and manage and
optimize these processes by monitoring and reviewing them.
Business process management is based on the observation that each prod-
uct that a company provides to the market is the outcome of a number
of activities performed [30]. These activities can be performed by humans,
systems or a combination of both. By identifying and structuring these
activities in workflows, companies get insight into their business processes. By
monitoring and reviewing their processes, companies are able to identify the
problems within these processes and can come up with improvements.
This chapter discusses business process management by considering the
BPM lifecycle. Each of the phases of the lifecycle is investigated. The chap-
ter concludes by giving an overview of the BPM phases that are relevant for
this work.
2.1 BPM lifecycle
The BPM lifecycle is an iterative process in which all of the BPM aspects
are covered. The BPM lifecycle is shown in Figure 2.1 and consists of the
following phases:
Design
The design phase consists of identifying existing processes and
capturing the business processes in process models.
Implementation
In the implementation phase, the designed process is implemented in
an executable process language, which can be deployed in a BPMS.
Enactment
The enactment phase is the runtime phase of the lifecycle. The
business process is deployed and monitored by a BPMS.
Evaluation
In the evaluation phase the monitored information that is collected
by the BPMS is used to review the business process. The conclusions
drawn in the evaluation phase are input for the next iteration of the
lifecycle.
Figure 2.1: Schematic representation of the business process management
lifecycle [30] (a cycle of Design, Implementation, Enactment and Evaluation)
In the remainder of this chapter, the first three phases of the lifecycle are
explained in more detail. The fourth phase is not discussed further in
this report, since the activities involved in this phase are not relevant for
this work.
2.2 Design
In the design phase the business processes within a company are identified.
The goal of the design phase is to capture the processes in business process
models. These models are often defined using a graphical notation. In this
way, stakeholders are able to understand the process and refine the models
easily. The activities within a process are identified by surveying the already
existing business process, by considering the structure of the organization
and by identifying the technical resources within the company. BPMN [21] is
the most popular graphical language for capturing business process models
in the design phase.
When the business processes are captured within models, these models can
be simulated and validated. By validating and simulating the process, the
stakeholders get insight into the correctness and suitability of the business
process models.
2.2.1 Business Process Model and Notation (BPMN)
BPMN is a notation for defining business process models, developed by
the Business Process Management Initiative [31]. The first version of the
standard was released in 2004. Recently, in March 2011, BPMN 2.0 was
released. BPMN has been designed to be understandable for all the business
users that have to deal with process diagrams. This means that business
analysts need to be able to create the first sketches of the models, but
the diagrams also have to be understandable for technical developers who
have to implement the business processes represented in the diagrams and
employees who have to monitor the processes.
Language constructs
BPMN diagrams are modelled as flowcharts, consisting of activities that are
connected to each other to determine their relations. The most important
constructs available in the language are:
Event
Events affect the flow of a process. Three types of events are defined
in BPMN, namely Start, Intermediate and End events.
Activity
Activities can be described as work performed by a company. There
are two types of activities: tasks and sub-processes.
Gateway
Gateways are controls that are available for changing sequence flow.
Condition/decision paths are examples of situations that can be
modelled by gateways.
Sequence Flow
A sequence flow is used for ordering flow elements, and is used to show
the sequence of activities that are performed in the flowchart.
Message Flow
Message flows are flow elements that indicate the flow of messages
from one participant to another participant. Message flows are used
to model the communication between two separate flowcharts.
Association
Associations are available for associating data, text and other artefacts
with flow objects.
Pool
Pools can be used to represent the participants of a process. Activities
that are related can be grouped into a pool.
Lane
Lanes are partitions within a pool. They can be used for organizing
and categorizing activities.
Example
The following example is adapted from OMG's BPMN examples document
[20]. The BPMN diagram of the example is shown in Figure 2.2. The
example shows a business-to-business collaboration between a pizza vendor
and a customer. Both participants are modelled using a pool. Inside these
pools, the internal business process of the participants is shown. The pizza
vendor has 3 employees: a clerk, the pizza chef and the delivery boy. Each
of these employees is represented within the pizza vendor pool by a lane.
The business process of the customer starts when a customer is hungry and
wants to eat a pizza. The customer has to perform two activities: a pizza
needs to be selected, and it needs to be ordered by sending a message to the
pizza vendor. The next step for the customer is to wait for the pizza. This
has been modelled by an event-based gateway (diamond-shaped), which
indicates that two possible events can occur. The first possibility is that
the customer waits for 60 minutes and still has no pizza. In this case, the
customer contacts the pizza vendor and asks when the pizza will be delivered.
After the call, the customer starts waiting again. In the second case, the
pizza is delivered within 60 minutes. In this situation, the customer pays for
the pizza and eats the pizza. After these activities the customer's hunger is
satisfied and the business process terminates.
The business process of the vendor starts when an "Order received" event
is received by the clerk. The gateway after the event introduces parallelism
to the process. A "Bake the pizza" order is given to the pizza chef, while the
clerk himself waits for an incoming "Where is my pizza?" event. As soon
as the pizza chef finishes baking the pizza, he orders the delivery boy
to deliver the pizza. The delivery boy delivers the pizza to the customer and
waits for the payment. When the money for the pizza is received, the delivery
boy gives the customer a receipt and the process terminates.
Figure 2.2: BPMN diagram of a pizza vendor and a customer [20]
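The customer's side of the collaboration, and in particular the event-based gateway, can be sketched in a few lines of Python. This is only an illustrative simulation under simplifying assumptions and is not part of BPMN or of the OMG example: the function name `customer_process` and the parameter `delivery_after_minutes` are hypothetical, a delivery at exactly 60 minutes is assumed to count as on time, and the time spent on the phone call is ignored.

```python
def customer_process(delivery_after_minutes, timeout=60):
    """Return the activities the customer performs, in order.

    delivery_after_minutes: (assumed input) minutes between ordering and delivery.
    timeout: the 60-minute timer attached to the event-based gateway.
    """
    trace = ["Select a pizza", "Order a pizza"]
    waited = 0
    # Event-based gateway: whichever event happens first decides the path.
    while delivery_after_minutes - waited > timeout:
        # The timer fires before the pizza arrives: ask, then wait again.
        waited += timeout
        trace.append("Ask for the pizza")
    # The "pizza received" message arrives before the (next) timer fires.
    trace += ["Pay the pizza", "Eat the pizza"]
    return trace


print(customer_process(30))
print(customer_process(150))
```

With a delivery after 30 minutes the customer never calls the vendor; with a delivery after 150 minutes the timer fires twice, so the customer asks for the pizza twice before paying and eating.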
2.3 Implementation
After the business process models are validated and simulated, they have
to be implemented. The implementation of these models can be done in
two ways. One can choose to create work lists with well-defined tasks,
which can then be assigned to workers within the company. This is often
the case when there is no automation within the business process execution.
The disadvantage of working with work lists is that the process execution is
hard to monitor. There is no central system in which process instances are
monitored; monitoring has to be done by each employee within the company
who is involved in the process.
In a lot of situations information systems participate in a business process,
in which case a business process management system (BPMS) can be used.
A BPMS is able to use business process models and create instances of
these models for each process initiation. The advantage of using a BPMS
is that the system gives insight into the whole process. The system is able
to monitor each instance of a business process and gives an overview of the
activities that are performed, the time the process takes and its completion
or failure.
Business Process Management Systems need executable business models.
The models defined in the design phase are often too abstract to be directly
executed. Therefore, they need to be implemented in an executable
business process language, such as BPEL [1]. In addition, collaborations between
business processes can be implemented by using a choreography language,
such as CDL [24]. Below, the difference between orchestrations and
choreographies is explained in more detail.
2.3.1 Orchestration vs. Choreography
An orchestration describes how services can interact with each other at the
message level, including the business logic and execution order of the
interactions, from the perspective and under the control of a single endpoint [27].
BPEL [1] is an XML-based language that can be used to describe
orchestrations. The language gives its users the freedom to describe business processes
in two ways: executable or abstract. Abstract processes serve a descriptive
role, whereas executable processes are meant to be executed by an execution
engine. An example of orchestration can be found in the pizza collaboration
example. The example has two pools that represent separate business
processes. The activities performed within a pool can be described as a process
in an orchestration language.
Choreography is typically associated with the public message exchanges,
rules of interaction, and agreements that occur between multiple business
process endpoints, rather than a specific business process that is executed by
a single party [27]. CDL [24] can be used for describing choreographies using
an XML-based format. CDL allows its users to describe how peer-to-peer
participants communicate within the choreography. The communication
between the two pools in the pizza collaboration example is an example of a
choreography, where two participants with two different business processes
collaborate with each other.
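The distinction can be illustrated with a small Python sketch based on the pizza collaboration. It is not BPEL or CDL, and all names (`CHOREOGRAPHY`, `conforms`, `vendor_orchestration`) are hypothetical. The choreography is modelled as the agreed public message exchange between the two pools, using the message names from the diagram and leaving out the optional "where is my pizza?" call; the orchestration is modelled as one endpoint, the vendor, controlling the order of its own activities.

```python
# Choreography: the public message exchange agreed between the two pools,
# written as (sender, receiver, message). No single party is in control.
CHOREOGRAPHY = [
    ("customer", "vendor", "pizza order"),
    ("vendor", "customer", "pizza"),
    ("customer", "vendor", "money"),
    ("vendor", "customer", "receipt"),
]


def conforms(observed):
    """True if the observed exchanges follow the agreed order exactly."""
    return list(observed) == CHOREOGRAPHY


# Orchestration: a single endpoint (the vendor) decides which of its own
# activities run, and in which order.
def vendor_orchestration(order):
    log = []
    log.append(f"clerk: order received ({order})")
    log.append("pizza chef: bake the pizza")
    log.append("delivery boy: deliver the pizza")
    log.append("delivery boy: receive payment")
    return log


print(conforms(CHOREOGRAPHY))
print(vendor_orchestration("margherita"))
```

The choreography only constrains what is visible between the participants, so the vendor could reorganize its internal steps without breaking it, as long as the messages are still exchanged in the agreed order.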
2.4 Enactment
Once the business process models have been implemented in the implementation
phase, the enactment phase can be started. In this phase the system is used
at runtime, so that each initiation of the process is monitored and
coordinated by the BPMS. For each initiation of a process, a process instance is
created. The BPMS keeps track of the progress within each of the process
instances. The most important tool within the enactment phase is the
monitoring tool, since it gives an overview of the running and finished process
instances. By keeping track of these instances, problems that occur in a
process instance can be easily detected.
2.5 Business Process Management System
Several vendors of Business Process Management software solutions offer
complete suites for modelling, managing and monitoring business processes.
Inside these systems there is a process execution environment, which is
responsible for the enactment phase of the BPM lifecycle [30]. An abstract
schema of a typical BPMS is shown in Figure 2.3.
Figure 2.3: Schematic representation of a business process management
system [30]
The components shown in Figure 2.3 provide the following functionality:
The Business Process Modelling component consists of tools for
creating business process models. It often consists of graphical tools for
developing the models.
The Business Process Environment is the main component that triggers
the instantiation of process models.
The Business Process Model Repository is a storage facility for storing
process models as created by the modelling component.
The Process Engine keeps track of the running instances of process
models. It communicates with service providers in order to execute
activities or receive status updates.
Service Providers are the information systems or humans that
communicate with the process engine. These entities perform the actual
activities and report to the process engine.
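How these components cooperate at runtime can be sketched as follows. This is a minimal illustration under strong assumptions, not the architecture of any particular BPMS: the class names, the representation of a process model as an ordered list of activity names, and the convention that a service provider is a callable returning `True` on success are all hypothetical, and only strictly sequential processes are handled. The `start` method plays the role of the Business Process Environment triggering an instantiation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ProcessInstance:
    """One running or finished instance of a process model."""
    model_name: str
    completed: List[str] = field(default_factory=list)
    status: str = "running"          # "running", "completed" or "failed"
    failed_at: Optional[str] = None  # activity in which the failure occurred


class ProcessEngine:
    def __init__(self, repository: Dict[str, List[str]],
                 providers: Dict[str, Callable[[], bool]]):
        self.repository = repository  # model repository: name -> activities
        self.providers = providers    # service providers: activity -> callable
        self.instances: List[ProcessInstance] = []

    def start(self, model_name: str) -> ProcessInstance:
        """Instantiate a model and execute its activities step by step."""
        instance = ProcessInstance(model_name)
        self.instances.append(instance)
        for activity in self.repository[model_name]:
            succeeded = self.providers[activity]()  # provider reports back
            if not succeeded:
                instance.status = "failed"
                instance.failed_at = activity
                return instance
            instance.completed.append(activity)
        instance.status = "completed"
        return instance


repository = {"pizza order": ["bake the pizza", "deliver the pizza",
                              "receive payment"]}
providers = {
    "bake the pizza": lambda: True,
    "deliver the pizza": lambda: False,  # simulate a failed delivery
    "receive payment": lambda: True,
}
engine = ProcessEngine(repository, providers)
instance = engine.start("pizza order")
print(instance.status, instance.failed_at)
```

Because every instance is kept in `engine.instances`, a monitoring tool could list which instances completed, which failed, and in which activity the failure occurred.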
2.6 Conclusion
In this chapter we have introduced BPM by considering the phases of the
BPM lifecycle. In this section we identify which of the introduced phases,
tools and languages are relevant for this work.
The use of standardized languages such as BPMN and BPEL is interesting
for this work, since both design and enactment tools can be placed in the
cloud and work with these languages.
In addition, the structure of a BPMS is relevant. The different components
within a BPMS and the relation between these components might change
when a BPMS is moved to the cloud.
Chapter 3
Cloud Computing
Cloud computing is one of the trending topics in Computer Science. Many
influential market players, such as Microsoft, Google and Amazon, offer cloud
computing solutions. The goal of this chapter is to introduce cloud
computing at both a conceptual and a more concrete level. First, the
general benefits and drawbacks of cloud computing are explained briefly.
The three common service models are introduced next and for each of these
service models the specific benefits and drawbacks are identified. After that,
four different cloud types are discussed. At the end of the chapter, three
popular cloud platforms are introduced in terms of their purposes and structure.
3.1 General benefits and drawbacks
Cloud computing is a model for enabling ubiquitous, convenient, on-demand
network access to a shared pool of configurable computing resources that
can be rapidly provisioned and released with minimal management effort or
service provider interaction [25].
The idea of cloud computing is that users are offered computing resources
in a pay-per-use manner, and that these resources are perceived as being
unlimited. The cloud provider does not require any up-front commitments from
the user and it offers the user elasticity to scale up or down quickly according
to the user's needs.
Cloud computing gives organizations several benefits:
Elasticity
Instead of having to buy additional machines, computing resources
can be reserved and released as needed. This means that there is no
under- or over-provisioning of hardware by the cloud user.
Pay-per-use
Cloud users are only billed for the resources they use. If a cloud user
needs 20 computers once a week for some computation of one hour,
it is only billed for these computing hours. After that the computers
can be released and can be used by other cloud users.
No hardware maintenance
The computing resources are maintained by the cloud provider. This
means that operational issues such as data redundancy and hardware
maintenance are handled by the cloud provider instead of the cloud
user.
Availability
Clouds are accessible over the Internet. This gives cloud users the
flexibility to access their resources over the Internet. Cloud users are able
to use software or data that is stored in the cloud not only inside their
organization but anywhere they have Internet access.
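The pay-per-use example above (20 computers for one hour, once a week) can be worked out with a short calculation. The comparison with owning the same 20 machines is an added illustration, and the function name `weekly_machine_hours` is hypothetical; no prices are assumed, only machine-hours.

```python
HOURS_PER_WEEK = 7 * 24


def weekly_machine_hours(machines, hours_per_run, runs_per_week):
    """Machine-hours actually used (and therefore billed) per week."""
    return machines * hours_per_run * runs_per_week


# Figures from the example: 20 computers, one hour, once a week.
billed = weekly_machine_hours(machines=20, hours_per_run=1, runs_per_week=1)

# Owning the same 20 machines means they are available all week.
owned_capacity = 20 * HOURS_PER_WEEK
utilization = billed / owned_capacity

print(billed, owned_capacity, round(utilization * 100, 2))
```

The cloud user is billed for 20 machine-hours per week, whereas 20 owned machines would provide 3360 machine-hours of capacity per week, of which only about 0.6% would be used for this computation.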
There are also drawbacks and threats in using cloud computing:
Security
Data is stored inside the cloud and accessible through the Internet. In
several situations cloud users deal with confidential information that
should be kept inside the cloud user's organisation. In these situations
cloud computing might not be a good solution, although there are
solutions with cloud computing in which data is stored inside the cloud
user's organisation but applications are hosted in the cloud. There are
also technical solutions for making data unintelligible for unauthorized
people, for example, by using encryption algorithms.
Availability
Clouds are accessible through the Internet. This gives cloud users the
freedom to work with the services anywhere they have an Internet
connection. The downside is that when the Internet connection fails,
for example, on the side of the cloud provider, cloud users are not able
to access their services any more. This might lead to business failures,
especially when the services are part of a business process.
Data transfer bottlenecks
Users that use software systems might need to transfer large amounts
of data in order to use the system. Data should be transported not
only from the user to the system, but also to multiple systems in order
to cooperate inside a company. Cloud computing providers bill not only
the computation and storage services; data transportation is also
measured and billed. For companies that deal with a lot of data,
cloud computing may be expensive because of the data transportation
costs. Another problem can be the time it takes to transfer data to
the cloud. For example, a company needs to upload a huge amount of
data in order to perform a complex computation. The data transfer
may take more time than the computation itself. In these situations
it might be faster and cheaper to perform the computation on the
premises of the cloud user.
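The data transfer trade-off can be made concrete with a rough estimate. The sketch below is only an illustration: the function names are hypothetical, all numbers (1 TB of data, a 100 Mbit/s uplink, 1 hour of computation in the cloud versus 10 hours on-premise) are assumed values rather than figures from the literature, and protocol overhead and transfer costs are ignored.

```python
def transfer_hours(data_bytes, bandwidth_bits_per_second):
    """Idealized upload time in hours: size in bits divided by bandwidth."""
    return data_bytes * 8 / bandwidth_bits_per_second / 3600


def cloud_is_faster(data_bytes, bandwidth_bits_per_second,
                    cloud_compute_hours, local_compute_hours):
    """Compare upload plus cloud computation against local computation."""
    cloud_total = (transfer_hours(data_bytes, bandwidth_bits_per_second)
                   + cloud_compute_hours)
    return cloud_total < local_compute_hours


ONE_TB = 10**12          # assumed data set size in bytes
UPLINK = 100 * 10**6     # assumed uplink of 100 Mbit/s

print(round(transfer_hours(ONE_TB, UPLINK), 1))
print(cloud_is_faster(ONE_TB, UPLINK,
                      cloud_compute_hours=1, local_compute_hours=10))
```

Under these assumptions the upload alone takes roughly 22 hours, so the computation finishes sooner on-premise even though the cloud computes ten times faster. With a much smaller data set the balance shifts back in favour of the cloud.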
3.2 Service models
The National Institute of Standards and Technology (NIST) defines cloud
computing by means of three service models: Software-as-a-Service (SaaS),
Platform-as-a-Service (PaaS) and Infrastructure-as-a-Service (IaaS) [25]. The
three service models are closely related and can be seen as a layered
architecture, as shown in Figure 3.1. Each service model is explained below.
For each of the models, the specific benefits and drawbacks are given for
both the user and the provider of the service models.











Software as a Service (examples: Facebook, YouTube, Gmail)
    Application layer (applications, web services)
Platform as a Service (examples: Windows Azure, Google App Engine, Force.com)
    Platform layer (software frameworks, storage providers)
Infrastructure as a Service (examples: Amazon EC2, GoGrid)
    Infrastructure layer (computation, storage)
    Hardware layer (CPU, memory, disk, bandwidth)
Figure 3.1: An overview of the layers in cloud computing based on [32]
3.2.1 Infrastructure as a Service (IaaS)
Infrastructure as a Service is the lowest layer in the cloud computing stack.
As shown in Figure 3.1, IaaS combines two layers: the hardware layer and
the infrastructure layer. IaaS users are interested in using hardware resources
such as CPU power or disk storage. Instead of offering these services
directly to the user, IaaS providers provide users with a virtualization
platform. Customers need to install and configure a virtual machine, which
runs on the hardware of the cloud provider. In this model, cloud users are
responsible for their virtual machine and cloud providers are responsible for
the actual hardware. Issues such as data replication and hardware main-
tenance are addressed by the cloud provider, while the management of the
virtual machine is performed by the cloud user.
Benefits of IaaS for cloud users are:
Scalable infrastructure
The biggest advantage of IaaS is the elasticity of the service. Instead
of having to buy servers, software and data centre capabilities, users
rent these resources in a pay-per-use manner. In situations where the
workload of computer resources fluctuates, IaaS might be a good solution.
For example, consider a movie company that uses servers for
rendering 3D effects. The company has a small data centre on-premise
which is used for the rendering, once a week. The rendering of one
scene takes 50 hours when calculated on 1 machine. By scaling up to
50 machines, the rendering of the scene would take 1 hour. Scaling
up the internal network of the company might be an expensive oper-
ation considering the installation and maintenance of the machines,
especially when the servers are only used for rendering once a week.
Instead of buying these machines, the company might consider renting
them and paying only for the weekly rendering operation.
Portability
Since IaaS works with virtual machine images, porting an on-premise
system to the cloud or porting a virtual machine from one cloud to
another can be easy. This, however, depends on the virtual machine
image format that is used by the cloud provider.
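The rendering example given under the scalability benefit can be worked out numerically. The sketch below assumes perfect parallel speed-up and a hypothetical price of 0.10 per machine-hour; neither assumption comes from a real price list.

```python
def render_plan(machine_hours, machines, price_per_machine_hour):
    """Return (wall-clock hours, rental cost), assuming perfect parallel speed-up."""
    wall_clock = machine_hours / machines
    cost = machine_hours * price_per_machine_hour
    return wall_clock, cost


# The scene from the example needs 50 machine-hours in total.
# The price per machine-hour is a hypothetical value.
for machines in (1, 50):
    hours, cost = render_plan(50, machines, price_per_machine_hour=0.10)
    print(machines, "machine(s):", hours, "hours, cost", round(cost, 2))
```

Under these assumptions the rental cost is the same for 1 and for 50 machines, since the total number of machine-hours does not change; only the waiting time drops from 50 hours to 1 hour.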
Drawbacks of IaaS for cloud users are:
Virtual machine management
Although cloud users do not have to manage the rented computer
hardware, cloud users are still responsible for the installation and
configuration of their virtual machine. A cloud user still needs experts
inside its company for the management of these virtual servers.
Manual scalability
IaaS does not offer automated scalability to applications. Users are able
to run virtual machines and might boot several instances of virtual
machines in order to scale up to their needs. Collaboration between
the virtual machines has to be coordinated and programmed by the
cloud user.
Benefits for IaaS cloud providers are:
Focus on hardware
Cloud providers are mainly focused on hardware related issues. Every-
thing that is software related, such as database management, threading
and caching needs to be performed by the cloud user.
Exploiting internal structure
Several providers are able to offer cloud computing services as an extension
to their core business. For example, the Amazon infrastructure
stack was built for hosting Amazon's services. By offering this infrastructure
as a service, Amazon is able to exploit its infrastructure and offer
a new service to its customers at low cost.
Drawbacks for IaaS cloud providers are:
Under- and overprovisioning
Cloud providers have to offer their resources as if they are unlimited to
the cloud user. This means that a cloud provider needs to own enough
resources to fulfil the needs of a cloud user. These needs,
however, may vary over time. Underprovisioning of a data centre
means that a cloud user might not be able to obtain the resources
it asks for, since the cloud provider does not have enough machines
available. Overprovisioning is extremely expensive, since servers are
bought and maintained, but are not used.
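The two provisioning problems can be made concrete by comparing a demand curve with a fixed data centre capacity. The fragment below is a sketch under assumed numbers: the hourly demand values and the two capacities are hypothetical.

```python
def provisioning_gap(demand, capacity):
    """Return (unmet, idle) machine-hours for hourly demand against a fixed capacity."""
    unmet = sum(max(d - capacity, 0) for d in demand)  # requests that cannot be served
    idle = sum(max(capacity - d, 0) for d in demand)   # machines paid for but unused
    return unmet, idle


# Hypothetical number of machines requested in six consecutive hours.
hourly_demand = [20, 35, 80, 120, 60, 25]

for capacity in (50, 120):
    unmet, idle = provisioning_gap(hourly_demand, capacity)
    print("capacity", capacity, "unmet", unmet, "idle", idle)
```

With a capacity of 50 machines part of the demand cannot be served (underprovisioning), while a capacity of 120 machines serves every request but leaves many machine-hours unused (overprovisioning).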
3.2.2 Platform as a Service (PaaS)
Platform as a Service is a service model in which users are offered a platform
on which they can develop and deploy their applications. The platform offers
support for using resources from the underlying infrastructure. Platforms
are mostly built for a certain domain, e.g., development of web applications,
and are programming language-dependent.
There are several cloud platforms available nowadays. Microsoft offers the
Windows Azure platform, which can be used for developing (web) applications
and services based on the .NET framework. Google's App Engine is
a platform for the development and deployment of Python or Java-based
(web) applications.
Benefits of PaaS for cloud users are:
Development platform
PaaS offers cloud users a platform on which they can manage and deploy
their applications. Instead of having to manage important issues
such as scalability, load balancing and data management, cloud users
handle these issues by using services offered by the platform.
No hardware and server management needed
Customers can deploy applications relatively easily on the platform,
since no network administrators are necessary for installing and main-
taining servers or virtual machines.
Drawbacks of PaaS for cloud users are:
Forced to solutions in the cloud
PaaS offers combinations of services. For example, Windows Azure
provides users with a .NET environment. The platform offers support
for databases in the form of SQL Azure. Application developers can
choose to use a different database, but have to perform difficult
operations to install these services on the platform, or have to have the
database hosted by a third party. PaaS users are more or less forced to use
the solutions that are offered by the cloud provider in order to get the
full benefits from the platform.
Benefits for PaaS cloud providers are:
Focus on infrastructure and platform
The software that runs on the platform is managed by the cloud user
and the cloud provider is responsible for the infrastructure and the
platform.
Drawbacks for PaaS cloud providers are:
Platform development
The platform that is offered by the cloud provider is a piece of software.
Complex software is needed to offer services such as automatic
scalability and data replication. Faults in the platform can lead to failure
of customer applications, so the platform has to be fault tolerant
and stable.
3.2.3 Software as a Service (SaaS)
With Software as a Service, cloud providers offer an application that is
deployed on a cloud platform. Users of the application access the application
through the Internet, often using a browser. One of the benefits of SaaS is
that cloud providers are able to manage their software from inside their
company. Software is not installed on the computers of the cloud users, but
instead runs on the servers of the cloud provider. When a fault is detected
in the software, this can be easily fixed by repairing the software on the
server, instead of having to distribute an update to all the users.
There are several examples of Software as a Service. For example, Google
offers several web applications, such as Gmail and Google Docs. Both
applications are offered as an online service. Another example is SalesForce.com,
which offers online CRM solutions as a service.
Benefits of SaaS for cloud users are:
Pay-per-use
Instead of having to purchase a license for each user of an application,
organizations are billed based on the usage of the software. A couple
of years ago software was often purchased on a pay-per-license basis.
Network administrators had to install applications on the workstations
of a cloud user's company and for each application instance the cloud
user had to pay for a license, even if the user of a particular workstation
did not use the application. With pay-per-use, cloud users pay only
for the users and the usage time of the application.
Updates
Applications in the cloud are managed by a cloud provider. The cloud
provider is able to perform updates to the software directly in the
cloud. Updates do not have to be distributed to the cloud users, who
always work with the most recent version since they access the
application in the cloud.
Drawbacks of SaaS for cloud users are:
Data lock-in
Data lock-in is one of the typical problems of SaaS. In case cloud
users decide to work with another application, offered by a different
provider, it might be hard to move the data to this other application.
Not every application provider stores data in a standardized way and
interfaces for retrieving all the data from an application may not be
available.
Benefits for SaaS cloud providers are:
Maintenance
Maintenance can be directly performed in the cloud application itself.
Updates do not have to be distributed to the cloud users but are
directly applied upon the software in the cloud.
Drawbacks for SaaS cloud providers are:
Infrastructure needed
In traditional software deployment, software is shipped to the user.
The hardware on which the application is installed is managed by the
user. With cloud computing, the software runs on servers of the cloud
provider. This means that cloud providers have to perform infrastruc-
ture maintenance, or they have to rent infrastructure or a platform for
hosting their applications.
Responsibility
Applications that run in the cloud are managed by the SaaS provider.
When the application in the cloud is not accessible or not working any
more because of erroneous updates or changes in the software, cloud
users are not able to work with the software any more. It is a big
responsibility for cloud providers to make sure the software is kept up
and running.
3.3 Cloud types
The cloud types identified in [23][25] are discussed below.
3.3.1 Public Cloud
A public cloud is provisioned for open use by the general public. Cloud
users access the cloud through the Internet. Public clouds are widely available
nowadays; for example, companies such as Microsoft, Google and Amazon
offer public cloud computing services. The biggest benefit of public clouds is
that the management of the servers is provided by the third-party provider.
Users just pay for the usage of the cloud, and issues such as scalability and
replication are handled by the cloud provider.
3.3.2 Private Cloud
Private clouds are for exclusive use of a single organization. Private clouds
can be hosted inside or outside the cloud user's organization and managed
by the cloud user's organization itself or by a third-party provider. This
form of cloud computing can be used when cloud users have to deal with
strict security concerns, where data has to be hosted inside the cloud user's
organization itself.
3.3.3 Hybrid Cloud
Hybrid clouds are created by combining a private and a public cloud. With
hybrid clouds, organizations can choose to store their critical data inside
the company using a private cloud, while the less critical data and services
can be stored in the public cloud. The hybrid cloud approach benefits from
the advantages of both public and private clouds. Scalability is maintained,
since the public cloud is used for offering the services, while data security is
maintained by storing critical data in the private cloud.
3.3.4 Community Cloud
A community cloud is available for a specific community. Several companies
that deal with the same concerns may decide to host their services together,
in order to collaborate. Community clouds can be managed by one or more
organizations within the community, but the cloud may alternatively be
hosted by a third-party provider.
3.4 Cloud providers
Several companies offer cloud services to customers. Below we give an
overview of the three most important players in the cloud market and their
offerings.
3.4.1 Amazon
Amazon was one of the first companies to offer public cloud solutions to
organizations. At first, Amazon focused on offering IaaS solutions, but
throughout the years several new services were introduced, which made the Amazon
Web Services platform richer and more than just an IaaS solution. Below
we briefly discuss some of these services.
Amazon EC2
Amazon Elastic Compute Cloud (EC2) [2] is the computational service of
Amazon. EC2 offers computational resources to its users, such as CPU
power and memory. In order to use EC2, a customer has to install a custom
virtual machine in the form of an Amazon Machine Image (AMI) in an
Amazon Virtual Server. Amazon offers pre-installed images with
additional software such as LAMP and Hadoop [29][13]. It is also possible
for a customer to upload their own virtual machine image. Virtual machines
can be instantiated by the user and the user is free to boot up multiple
instances of a virtual machine to increase the computational power. New
instances can be launched within 2 minutes. The billing of EC2 services is
based upon the number of instantiated virtual machines per hour.
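Billing per instantiated virtual machine per hour can be sketched as follows. The fragment assumes that every started hour of an instance is charged as a full hour and uses a hypothetical price; both are assumptions for illustration and are not taken from an Amazon price list.

```python
import math


def instance_hours(run_minutes):
    """Billable hours over all instances, rounding every started hour up (assumption)."""
    return sum(math.ceil(minutes / 60) for minutes in run_minutes)


def ec2_bill(run_minutes, price_per_instance_hour):
    """Total charge for a list of per-instance running times in minutes."""
    return instance_hours(run_minutes) * price_per_instance_hour


# Three hypothetical instances that ran for 30, 61 and 120 minutes.
print(instance_hours([30, 61, 120]))  # 1 + 2 + 2 = 5 instance-hours
print(ec2_bill([30, 61, 120], price_per_instance_hour=0.10))
```

The sketch shows why booting many short-lived instances can be relatively expensive under such a billing scheme: an instance that runs for 61 minutes is charged for two hours.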
Amazon S3
Amazon S3 stands for Simple Storage Service [4]. S3 offers a storage service
to customers in which they can store objects of up to 5 TB. To organize the
storage, data needs to be stored in buckets with a unique name. Each user
gets 100 buckets. The number of objects inside a bucket is unlimited; the
only restriction within a bucket is that each object needs to have a unique
name. Each object that is uploaded to the service is directly replicated
multiple times. Objects can either be private or public. In this way users
can share files over the Internet by making the corresponding object public.
Amazon offers several interfaces for accessing objects. Currently supported
interfaces are REST, SOAP and BitTorrent.
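The bucket and object model described above can be mimicked with a small in-memory data structure. The class below is a toy model written for this illustration; it is not the S3 API, and the class and method names are hypothetical. Replication and the object size limit are left out.

```python
class ToyObjectStore:
    """In-memory stand-in for a bucket/object store; not the real S3 interface."""

    MAX_BUCKETS_PER_USER = 100  # limit quoted in the text

    def __init__(self):
        self._buckets = {}  # bucket name -> {object name -> (data, is_public)}
        self._owners = {}   # bucket name -> owning user

    def create_bucket(self, user, name):
        if name in self._buckets:
            raise ValueError("bucket names must be unique")
        owned = sum(1 for owner in self._owners.values() if owner == user)
        if owned >= self.MAX_BUCKETS_PER_USER:
            raise ValueError("bucket limit reached")
        self._buckets[name] = {}
        self._owners[name] = user

    def put(self, bucket, key, data, public=False):
        # Storing under an existing name replaces the object, so names stay unique.
        self._buckets[bucket][key] = (data, public)

    def get(self, bucket, key, user=None):
        data, is_public = self._buckets[bucket][key]
        if not is_public and user != self._owners[bucket]:
            raise PermissionError("object is private")
        return data


store = ToyObjectStore()
store.create_bucket("alice", "holiday-pictures")
store.put("holiday-pictures", "beach.jpg", b"...", public=True)
print(store.get("holiday-pictures", "beach.jpg"))  # readable without credentials
```

Making an object public is what allows sharing over the Internet in this model: anonymous reads succeed only for public objects, while private objects can be read by the bucket owner alone.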
Amazon SQS
Amazon SQS stands for Amazon Simple Queue Service [3]. The goal of
the queue service is to provide a mechanism for exchanging messages be-
tween computers. Developers are able to move data from one component
to another by adding data to the queue of a component, instead of directly
communicating with the component.
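The decoupling that a queue provides can be shown with the queue module from the Python standard library. This is a local sketch of the messaging pattern only; it does not use the SQS interface, and the component and message names are hypothetical.

```python
import queue

# Stands in for the message queue that belongs to the consuming component.
orders = queue.Queue()


def producer(order_ids):
    """The sending component only adds messages; it never calls the consumer."""
    for order_id in order_ids:
        orders.put({"order_id": order_id})


def consumer():
    """The receiving component drains its queue whenever it is ready."""
    handled = []
    while True:
        try:
            message = orders.get_nowait()
        except queue.Empty:
            return handled
        handled.append(message["order_id"])


producer([1, 2, 3])
print(consumer())
```

Because the producer and the consumer share only the queue, either component can be replaced, restarted or scaled without the other one being aware of it.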
Amazon SimpleDB
SimpleDB [5] is a flexible and scalable non-relational data store. SimpleDB
offers developers a datastore in which they can store and query structured
data by using web service requests. Data is stored in domains. A domain is
a collection of similar data items, comparable to a relational table, although
SimpleDB does not offer relational storage. Inside a domain, data is stored
in collections of key-value pairs. A key-value pair can contain multiple values
and has a maximum of 1024 bytes per value. An entity in a domain can
have at most 256 key-value pairs. Data stored inside the datastore is
automatically indexed. This leads to fast query performance when querying
the datastore.
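The domain model and its limits can be sketched with nested dictionaries. The class below is a toy model and not the SimpleDB interface: the names are hypothetical, and counting every stored value of a key as one key-value pair towards the limit of 256 is an assumption of this sketch.

```python
MAX_VALUE_BYTES = 1024     # per-value limit quoted in the text
MAX_PAIRS_PER_ITEM = 256   # per-entity limit quoted in the text


class ToyDomain:
    """Schemaless collection of items; each key may hold several values."""

    def __init__(self):
        self.items = {}  # item name -> {key -> set of values}

    def put(self, item_name, key, value):
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            raise ValueError("value exceeds 1024 bytes")
        attributes = self.items.setdefault(item_name, {})
        pairs = sum(len(values) for values in attributes.values())
        is_new = value not in attributes.get(key, set())
        if is_new and pairs >= MAX_PAIRS_PER_ITEM:
            raise ValueError("item already holds 256 key-value pairs")
        attributes.setdefault(key, set()).add(value)

    def select(self, key, value):
        """Names of all items that have the given value stored under the key."""
        return sorted(name for name, attributes in self.items.items()
                      if value in attributes.get(key, set()))


products = ToyDomain()
products.put("item1", "color", "red")
products.put("item1", "color", "blue")   # a second value for the same key
products.put("item2", "color", "red")
print(products.select("color", "red"))
```

No schema is declared anywhere: an item simply receives whichever keys are written to it, which is the main difference with a relational table.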
3.4.2 Microsoft Windows Azure
Microsoft's solution for cloud computing is called Windows Azure [11][12].
Windows Azure is a public cloud platform that consists of a group of cloud
technologies, each providing services to application developers. The plat-
form consists of four parts: Windows Azure, SQL Azure, Windows Azure
AppFabric and Windows Azure Marketplace. These parts are briefly
explained below. An overview of the platform is shown in Figure 3.2.
Windows Azure
The main goal of the Windows Azure part of the Windows Azure platform
is to run applications and store data in the cloud. As shown in Figure 3.2,
the Windows Azure block consists of 5 parts.
Figure 3.2: Overview of Microsoft Windows Azure platform
The Compute part presents the computing service of the cloud platform.
The service is able to run .NET applications using a Windows Server foun-
dation. Applications that are built on the compute service are structured
by using one or more roles. There are 3 possible roles inside the compute
service: web roles, worker roles and VM roles. A web role can be used
for running web applications. Each web role instance is provided with a
preconfigured IIS installation on which the applications can be deployed.
Worker roles are intended for processing operations in the background.
Often web roles delegate tasks to worker roles in order to execute complex
operations. VM roles are the most custom roles. Users need to upload a
Windows Server 2008 image as a custom VM role. This solution is often
useful when an on-premise service is moved to the cloud.
The Storage part contains services for storing data. Three storage services
are provided: blobs, tables and queues. Blobs are binary large objects and
are stored in containers. Blobs are useful for storing unstructured data,
such as images. In case of structured data, tables can be used. Tables
are not to be confused with relational tables, as used in relational database
systems, such as SQL Azure. Tables are represented as a set of entities
with properties. There is no defined schema for a table. Developers can
query data from tables by using a simple query language defined by OData.
Queues are not used for the storage of data, but for passing data from one
role to another.
The Fabric controller manages the machines that are available within a data
centre. The goal of the controller is to divide the jobs and roles over the
available machines and to manage data storage and data replication on these
machines.
Connect is an integration service for the integration of on-premise services
with the Azure platform. For example, a company wants to move an ex-
isting application to the cloud but decides to keep their database system
on-premise. Connect helps organizations to deal with these situations.
The Content Delivery Network service is used for improving performance
on data access. The service keeps track of often used data and caches it at
sites closer to the clients that use it.
SQL Azure
SQL Azure is Microsoft's cloud-based database solution and is based on
Microsoft SQL Server. It consists not only of a database solution, but
also offers services for reporting and data synchronization. SQL Azure
Reporting provides users with data reports, based on the stored data in the
database. Created reports can be published in a reporting portal, in which
users can access the reports. The Data synchronization service provides a
mechanism for synchronization of multiple databases, for both on-premise and
off-premise databases. The synchronization service is based on the Microsoft
Sync Framework.
Windows Azure AppFabric
The AppFabric can be seen as a layer above the Windows Azure part. App-
Fabric adds support for application access, access control and caching.
The first component inside the AppFabric is a service bus. The service bus
offers clients a registry in which they can publish their services. Other clients
are able to find and use services by querying the registry.
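The publish and lookup behaviour of such a registry can be sketched as a simple mapping from service names to endpoints. The class below is a toy model; its names and methods are hypothetical and do not correspond to the AppFabric interface, and the endpoint addresses are invented for the example.

```python
class ServiceRegistry:
    """Toy registry in which services are published and later looked up by name."""

    def __init__(self):
        self._services = {}  # service name -> endpoint address

    def publish(self, name, endpoint):
        self._services[name] = endpoint

    def lookup(self, name):
        """Return the endpoint of a published service, or None if it is unknown."""
        return self._services.get(name)

    def find(self, prefix):
        """Return the names of all published services that start with the prefix."""
        return sorted(name for name in self._services if name.startswith(prefix))


registry = ServiceRegistry()
registry.publish("billing/invoices", "https://example.org/invoices")
registry.publish("billing/payments", "https://example.org/payments")
print(registry.find("billing/"))
print(registry.lookup("billing/invoices"))
```

A client only needs to know the name of a service; the location of the service can change as long as the provider republishes the new endpoint.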
Often, applications make use of digital identities. Examples of these identi-
ties are Active Directory, Facebook, Google accounts or Windows Live ID.
AppFabric offers support for several different digital identities in such a way
that users can use them relatively easily within their applications.
AppFabric also offers a caching service for caching frequently accessed
information. The service is used for reducing the number of database queries,
which leads to better performance of the overall platform.
Windows Azure Marketplace
In the marketplace, Microsoft offers existing cloud applications and cloud
data. The marketplace has been split up into two parts: DataMarket and
AppMarket. The DataMarket offers datasets to organizations. Organizations
can browse through the offered sets and purchase the data they need
for their applications. The AppMarket can be used by companies to share
their cloud applications, so that other organizations can use and integrate
them into their own applications.
3.4.3 Google App Engine
Google App Engine [19] is Google's cloud platform. The platform lets
customers run web applications on Google's infrastructure. The platform
currently supports three programming languages: Python, Java and Go.
The App Engine was developed with two goals in mind [16]: (1) to reduce
the time for deployment for web application developers. Traditionally, de-
velopers do not only spend a lot of time programming their application,
but deployment and maintenance of applications also takes its share. By
providing users with an already configured infrastructure and several
programming APIs, developers can reduce their deployment and maintenance
costs. (2) to provide developers with a system that scales automatically to
the needs of the application. Users are able to let their application scale
from a simple experimental project towards a fully mature software appli-
cation. Scaling is performed automatically by the App Engine, but within
definable boundaries.
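Automatic scaling within definable boundaries can be sketched as a function that derives the number of instances from the current load and clamps it between a lower and an upper bound. This is not the scheduling algorithm of App Engine; the capacity per instance and the boundaries are hypothetical parameters.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance,
                     min_instances, max_instances):
    """Number of instances for the load, kept within the configured boundaries."""
    wanted = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(wanted, max_instances))


# Hypothetical configuration: 50 requests/s per instance, between 1 and 10 instances.
for load in (0, 120, 5000):
    print(load, "requests/s ->", instances_needed(load, 50, 1, 10), "instance(s)")
```

The upper boundary protects the cloud user against unbounded costs during a load peak, at the price of requests that may have to wait or be rejected.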
Sandboxed applications
Web applications run in a sandboxed environment. Users are provided with
limited access to the underlying operating system. In this way Google is able
to distribute the web requests across multiple servers. This limited access
leads to several restrictions for the developer:
Applications are only able to communicate with other computers using
the provided URL fetch or email services.
External applications and computers can only access the web applica-
tion via HTTP/HTTPS requests on the standard ports.
Web applications cannot write to the file system. Instead Google
offers a datastore and a memory cache service.
An application can only execute code during a request. As soon as a
request comes in, the application is able to start processing data until
the response is sent back to the user. The maximum request-response
time is 60 seconds.
Datastore
Google offers developers a distributed data storage service called BigTable
[10]. BigTable is not a traditional relational database, since it is column-
based instead of row-based. The table itself is schemaless and consists of
entities of a certain kind and a set of properties.
The datastore can be queried by using Google's query language GQL [18],
which is an advanced query language optimized for retrieving entities from
the datastore.
Services
Google App Engine offers several services that can be used by cloud users'
applications, such as a mail service for sending mails to customers, a URL Fetch
service for accessing resources on the Internet or communicating with other
services, and a memory cache that provides the developer with high-performance
key-value caching.
Applications in the Google App Engine run only during the request-response
cycle. However, this can be circumvented by scheduling tasks using the Cron
service that is available on the App Engine platform.
Administration console
In order to manage the web application, Google offers an administration
console to developers [17]. The console is an online dashboard in which the
user can log on to monitor and configure the applications that are running
in the cloud. The console gives the user the opportunity to monitor the
performance of the application and also to view and edit the data that is
stored in the datastore.
3.5 Conclusion
Three popular cloud platforms have been introduced in this chapter. All
three platforms are public clouds, but the strategies of the platforms
differ. The services offered by Amazon are mainly on the IaaS level.
Microsoft's Windows Azure and Google App Engine are both based on the
PaaS service model. Google App Engine focuses on web applications and
offers an SDK for creating cloud-based web applications. Applications can
only execute code during a web request, which has a maximum execution
time of 60 seconds. Compute-intensive operations that need more time to
execute can only be executed by scheduling them as a task in the Cron
service.
Windows Azure is more than a web application platform. Unlike
Google App Engine, Azure can also host non-web applications and purely
computational applications. Windows Azure has support for custom VM
roles, in which cloud users can install and manage their own Windows-based
VM. By offering this service, Microsoft offers partial IaaS functionality.
From a database point of view, both Google and Amazon offer schemaless
column-based storage to the cloud user, whereas Microsoft offers a relational
row-based database solution.
Choosing between the presented cloud providers is difficult and depends on
the needs of a user. When a cloud user needs full control over the server
settings, an IaaS solution is necessary. When a cloud user wants to deploy
software on top of a platform, a platform can be chosen based upon the
programming language and database needs of the cloud user.
Chapter 4
BPM and cloud computing
In this chapter we investigate the state-of-the-art of the combination of BPM
software and cloud computing, by discussing relevant scientific literature and
some existing commercial products.
4.1 General benefits and drawbacks
Cloud-based BPM gives cloud users the opportunity to use cloud software
in a pay-per-use manner, instead of having to make upfront investments on
BPM software, hardware and maintenance [22]. Systems scale up and down
according to the cloud user's needs, which means that the user does not have
to worry about over-provisioning or under-provisioning.
BPM in the cloud also has several downsides. By putting a BPMS in the
cloud, cloud users might lose control over their sensitive data. The efficiency
and effectiveness of activities that are not computation-intensive may not
increase by placing these activities in the cloud; on the contrary, these
activities may become more expensive. For example, an activity that is not
computation-intensive might need to process a certain amount of data. The
transfer of the data to the cloud might take longer than executing the activity
on-premise. In addition, the costs of the activity may increase, since
data transfer is one of the billing factors of cloud computing.
4.2 Service Models
In [6], moving an existing application from on-premise to each of the service
models is considered. In the analysis, the application is supposed to be
orchestrated by a BPEL specication and executed by a BPEL execution
engine. The requirements and challenges that have to be faced when moving
an application to each of the service models are briey discussed below, based
on [6].
4.2.1 IaaS
When an application is moved to the IaaS service model, the cloud user is
responsible for the operating system, the middleware and the applications
running in the virtual machine, as shown in Figure 4.1. Installing BPM
software in an IaaS cloud solution is therefore comparable to installing BPM
software on-premise, since everything except the hardware is managed by
the cloud user. In addition, the cloud user has to take certain security
measures to secure the system from intruders. Possible security measures
are blocking ports, enforcing access control policies and keeping the software
and operating systems up to date.
[Figure 4.1 (diagram): in the IaaS model the customer is responsible for the applications (BPEL processes, process models, process instances), the middleware (BPEL engine, DBMS) and the operating system; the provider is responsible for the hardware.]
Figure 4.1: Responsibilities for cloud users and cloud providers when a
workflow-based application is moved to the IaaS service model [6]
4.2.2 PaaS
By placing a workflow-based application on the PaaS service model, the
responsibilities for both the cloud provider and cloud user change, as shown
in Figure 4.2. The execution engine is assumed to be part of the platform in
this case and is offered by the provider. Users need to upload their processes
in order to run them in the cloud. The engine can be used by multiple users,
since the platform is shared. The responsibility for data storage and data
management is no longer in the hands of the cloud user, which leads to several
security issues.
The following three requirements need to be realized in order to offer a
secure BPEL engine on PaaS:
Figure 4.2: Responsibilities for cloud users and cloud providers when a
workflow-based application is moved to the PaaS service model [6]. The
customer is responsible for the applications (BPEL processes, process
models and process instances); the provider is responsible for the
middleware (BPEL engine and DBMS), the operating system and the hardware.
- Process models should not be readable for intruders who get in
  possession of a process model description file;
- Process models cannot be altered by intruders;
- Process models cannot be deployed on other engines by intruders.
In order to achieve these requirements, process model descriptions have to
be encrypted and signed. Encryption ensures that the process models are
not readable for intruders. By signing, one can ensure that a file is only
valid for one particular execution engine and that porting to another
execution engine leads to failure.
Database storage may also be a problem. Often, BPEL engines deploy
processes by reading process model description files and creating relational
data in tables. Not only the process model itself, but also the process
instance information is stored in the database. The data in these databases
also has to be encrypted in order to be unreadable for intruders. The problem
of data encryption in databases is that it restricts query expressiveness
with respect to relational operators. For example, performing join
operations might be difficult when values in the database are encrypted [6].
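As an illustration of the signing requirement, we give a minimal sketch below. It is not the mechanism of [6]: it assumes a symmetric HMAC from the Python standard library as a stand-in for a public-key signature, and it omits encryption entirely. The names sign_model, accept_deployment and the engine identifiers are hypothetical.

```python
import hashlib
import hmac


def sign_model(model_xml: bytes, engine_id: str, engine_key: bytes) -> bytes:
    """Return a tag that binds a process model to one particular engine.

    Assumption: a shared secret key per engine (HMAC), used here only as a
    stand-in for a real public-key signature scheme.
    """
    message = engine_id.encode("utf-8") + b"\x00" + model_xml
    return hmac.new(engine_key, message, hashlib.sha256).digest()


def accept_deployment(model_xml: bytes, tag: bytes,
                      engine_id: str, engine_key: bytes) -> bool:
    """An engine accepts a model only if the tag matches its own key and id."""
    expected = sign_model(model_xml, engine_id, engine_key)
    return hmac.compare_digest(expected, tag)


# Hypothetical usage: a model signed for engine A.
model = b"<process name='booking'/>"
key_a = b"a" * 32
key_b = b"b" * 32
tag = sign_model(model, "engine-A", key_a)

print(accept_deployment(model, tag, "engine-A", key_a))         # intended engine
print(accept_deployment(model + b" ", tag, "engine-A", key_a))  # altered model
print(accept_deployment(model, tag, "engine-B", key_b))         # other engine
```

Only the first check succeeds: an altered model or a deployment on another engine produces a different tag and is rejected.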
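The join problem can be made concrete with a small sketch. It assumes an in-memory SQLite database and a toy randomized cipher built from SHA-256; the cipher is for illustration only and is not secure. The table names and the key are hypothetical.

```python
import hashlib
import os
import sqlite3

KEY = b"demo-key-not-for-production"  # hypothetical key


def toy_encrypt(value: str) -> bytes:
    """Toy randomized cipher (fresh nonce + SHA-256 keystream). NOT secure."""
    nonce = os.urandom(16)
    data = value.encode("utf-8")
    stream = hashlib.sha256(KEY + nonce).digest()
    # Values in this sketch are shorter than 32 bytes, so one block suffices.
    return nonce + bytes(b ^ s for b, s in zip(data, stream))


def toy_decrypt(blob: bytes) -> str:
    nonce, cipher = blob[:16], blob[16:]
    stream = hashlib.sha256(KEY + nonce).digest()
    return bytes(b ^ s for b, s in zip(cipher, stream)).decode("utf-8")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instance (id INTEGER, customer BLOB)")
conn.execute("CREATE TABLE booking (id INTEGER, customer BLOB)")
conn.execute("INSERT INTO instance VALUES (1, ?)", (toy_encrypt("alice"),))
conn.execute("INSERT INTO booking VALUES (7, ?)", (toy_encrypt("alice"),))

# Join inside the database: the two ciphertexts of "alice" differ.
joined_in_db = conn.execute(
    "SELECT i.id, b.id FROM instance i JOIN booking b "
    "ON i.customer = b.customer"
).fetchall()

# Join in the application, after decrypting both sides.
instances = [(i, toy_decrypt(c))
             for i, c in conn.execute("SELECT id, customer FROM instance")]
bookings = [(i, toy_decrypt(c))
            for i, c in conn.execute("SELECT id, customer FROM booking")]
joined_in_app = [(i, b) for i, ci in instances
                 for b, cb in bookings if ci == cb]
```

Because each value is encrypted with a fresh nonce, equal plaintexts yield different ciphertexts and the database-side join finds no match; the join only works after decryption outside the database.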
4.2.3 SaaS
By moving the application to the SaaS service model, the cloud provider is
now also responsible for the application itself. The application is no longer
an asset of the cloud user's enterprise, but is offered by the cloud provider,
as shown in Figure 4.3. The application can be offered to multiple cloud
users using a single-tenant or multi-tenant architecture. In a single-tenant
architecture a new BPEL engine is installed for each process model. In
a multi-tenant architecture, multiple cloud users and process models are
served by one BPEL engine. The data that is stored by the cloud provider
should be secured to prevent unintended access by the SaaS provider or
by other cloud users. The security measures, as explained in the previous
section, can be applied to resolve this issue.
Figure 4.3: Responsibilities for cloud users and cloud providers when a
workflow-based application is moved to the SaaS service model [6]. The
provider is responsible for all layers: the applications (BPEL processes,
process models and process instances), the middleware (BPEL engine and
DBMS), the operating system and the hardware.
In a multi-tenant architecture, multiple cloud users use the same BPEL
engine. The data used by one cloud user should not be accessible by other
cloud users. As a solution, one can choose to create a separate database for
each cloud user, or to add a column to each database table in which the
identifier that uniquely identifies the user is stored.
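The second option, a tenant identifier column in a shared table, can be sketched as follows. This is a minimal illustration under our own assumptions: an in-memory SQLite database, a hypothetical process_instance table, and a single access function through which all queries pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE process_instance ("
    "tenant_id TEXT NOT NULL, instance_id INTEGER, state TEXT)"
)
conn.executemany(
    "INSERT INTO process_instance VALUES (?, ?, ?)",
    [
        ("tenant-a", 1, "running"),
        ("tenant-a", 2, "completed"),
        ("tenant-b", 1, "running"),
    ],
)


def instances_for(tenant_id: str):
    """Return only the rows of one tenant; every query filters on tenant_id."""
    rows = conn.execute(
        "SELECT instance_id, state FROM process_instance "
        "WHERE tenant_id = ? ORDER BY instance_id",
        (tenant_id,),
    )
    return rows.fetchall()
```

Isolation in this design depends on every query going through such a filter, which is weaker than the isolation obtained with a separate database per cloud user.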
4.3 Combining on-premise and cloud
Privacy protection is one of the barriers for performing BPM in the cloud
environment. Not all users want to put their sensitive data in the cloud.
Another issue is efficiency. Compute-intensive activities can benefit from
the cloud because of the scalability of the cloud. Non-compute-intensive
activities, however, do not always benefit from cloud computing. The
performance of an activity that is running on-premise might be higher than in
the cloud because of data that needs to be transferred to the cloud first in
order to perform the activity. These activities can also make cloud
computing expensive, since data transfer is one of the billing factors of cloud
computing.
4.3.1 Architecture
In most BPM solutions nowadays, the process engine, the activities and
process data are placed on the same side: either on-premise or in the
cloud. The authors of [22] investigate the distribution possibilities of BPM
in the cloud by introducing a PAD model, in which the process engine,
the activities involved in a process and the data involved in a process are
separately distributed, as shown in Figure 4.4. In this figure, P stands for the
process enactment engine, which is responsible for activating and monitoring
all the activities, A stands for activities that need to be performed in a
business process, and D stands for the storage of data that is involved in
the business process. By making the distinction between the process engine,
the activities and the data, cloud users gain the flexibility to place activities
that are not computation-intensive and sensitive data at the user-end side
and all the other activities and non-sensitive data in the cloud.
The PAD model, introduced in [22], defines four possible distribution
patterns. The first pattern is the traditional BPM solution where everything is
distributed at the user-end. The second pattern is useful when a user already
has a BPM system on the user-end, but the computation-intensive activities
are placed in the cloud to increase their performance. The third pattern is
useful for users who do not have a BPM system yet, so that a cloud-based
BPM system can be utilised in a pay-per-use manner and non-computation-
intensive activities and sensitive data can be placed at the user-end. The
fourth pattern is the cloud-based BPM pattern in which all elements are
placed in the cloud.
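The four patterns can be summarised as a small classification over the placement of P, A and D. The sketch below is our own simplification and not part of [22]: it assumes that any mixed placement with the engine at the user-end counts as the second pattern, and any mixed placement with the engine in the cloud counts as the third. The names Side, Placement and pad_pattern are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Side(Enum):
    USER_END = "user-end"
    CLOUD = "cloud"


@dataclass(frozen=True)
class Placement:
    engine: Side            # P: where the process engine runs
    activities: frozenset   # A: sides on which activities are executed
    data: frozenset         # D: sides on which data is stored


def pad_pattern(p: Placement) -> int:
    """Map a placement to one of the four PAD patterns (simplified)."""
    sides = {p.engine} | p.activities | p.data
    if sides == {Side.USER_END}:
        return 1  # traditional BPM, everything at the user-end
    if sides == {Side.CLOUD}:
        return 4  # everything in the cloud
    # Mixed placement: the location of the engine decides the pattern.
    return 2 if p.engine is Side.USER_END else 3


both = frozenset({Side.USER_END, Side.CLOUD})
hybrid = Placement(engine=Side.CLOUD, activities=both, data=both)
print(pad_pattern(hybrid))
```

For the hybrid placement above, with the engine in the cloud and activities and data on both sides, the function returns the third pattern.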
Figure 4.4: Patterns for BPM placement [22]
Business processes consist of two types of flows, namely a control-flow and a
data-flow. The control-flow regulates the activities that are performed and
the sequence of these activities, while the data-flow determines how data
is transferred from one activity to another. BPM workflow engines have
to deal with both control-flows and data-flows. A data-flow might contain
sensitive data; therefore, when a BPM workflow engine is deployed in the
cloud, data-flows should be protected.
In the architecture proposed in [22], the cloud-side engine handles the
data-flow only by means of reference IDs instead of the actual data. When an
activity needs sensitive data, the transfer of the data to the activity is
handled under user surveillance through an encrypted tunnel. Sensitive data is
stored at the user-end and non-sensitive data is stored in the cloud. An
overview of the architecture proposed in [22] is shown in Figure 4.5.
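The use of reference IDs can be sketched as follows. This is a minimal illustration under our own assumptions, not the implementation of [22]: the classes LocalRepository and CloudEngine are hypothetical, and the encrypted tunnel and the authorization step are left out.

```python
import uuid


class LocalRepository:
    """User-end store: sensitive values never leave this object."""

    def __init__(self):
        self._items = {}

    def put(self, value) -> str:
        ref = str(uuid.uuid4())   # opaque reference ID
        self._items[ref] = value
        return ref

    def get(self, ref: str):
        return self._items[ref]


class CloudEngine:
    """Cloud-side engine: sees control data and reference IDs only."""

    def __init__(self):
        self.log = []

    def route(self, from_activity: str, to_activity: str, ref: str) -> str:
        # The engine records and forwards the reference, not the data.
        self.log.append((from_activity, to_activity, ref))
        return ref


repo = LocalRepository()
engine = CloudEngine()

ref = repo.put({"customer": "Alice", "passport": "X1234567"})
forwarded = engine.route("collect_details", "check_passport", ref)

# A user-end activity resolves the reference against the local repository.
details = repo.get(forwarded)
```

The log of the cloud-side engine contains activity names and the reference ID, but none of the values stored in the local repository.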
Figure 4.5: Architecture of a cloud-based BPM system combined with
user-end distribution [22]. The user-end hosts a portal, a user-end engine and
a local repository; the cloud side hosts the cloud-side engine with its
activity scheduler and a cloud repository.
4.3.2 Optimal distribution
The costs for using cloud computing are investigated in several articles [9,
14]. In [22], formulae are given for calculating the optimal distribution of
activities, when activities can be placed in the cloud or on-premise. The
calculation takes into account the time costs, monetary costs and privacy
risk costs. By using these formulae, cloud users can make an estimation of
the costs of deploying parts of their application on-premise and in the cloud.
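The formulae of [22] are not reproduced in this report. To indicate the kind of calculation involved, the sketch below assumes a simple weighted sum of the three cost factors and decides the placement of each activity independently; both the weighting and the independence of activities are our own assumptions, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    time: float          # time cost of running the activity
    money: float         # monetary cost (e.g. compute, data transfer)
    privacy_risk: float  # cost assigned to the privacy risk


def total(c: Cost, w_time: float = 1.0, w_money: float = 1.0,
          w_risk: float = 1.0) -> float:
    """Weighted sum of the three factors (an assumption, not from [22])."""
    return w_time * c.time + w_money * c.money + w_risk * c.privacy_risk


def place(activities: dict) -> dict:
    """Choose the cheaper side per activity: name -> (on-premise, cloud)."""
    return {
        name: "cloud" if total(cloud) < total(on_premise) else "on-premise"
        for name, (on_premise, cloud) in activities.items()
    }


# Hypothetical estimates for two activities.
estimates = {
    "render_report":  (Cost(10.0, 0.0, 0.0), Cost(2.0, 1.0, 0.0)),
    "check_passport": (Cost(1.0, 0.0, 0.0), Cost(1.0, 1.0, 5.0)),
}
placement = place(estimates)
```

With these estimates the compute-intensive activity is placed in the cloud, while the activity that handles sensitive data stays on-premise because of its privacy risk cost.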
4.4 Social BPM
Social software supports the interaction of human beings and the production
of artefacts by combining the input from independent contributors
without predetermining how to do this [28]. An example of social software
is Wikipedia, where independent contributors are able to adjust the
documents in the system. Business processes can benefit from social software,
since users involved in the business processes can easily share their
knowledge and information by using social software. Not only business processes,
but also BPM itself can benefit from social software. By offering BPM
software as social software, multiple users can work on the design, operation
and improvement of a business process simultaneously, and collaboration is
supported since knowledge is shared through the software.
One of the issues in BPM is the model-reality divide. Abstract process
models and executable process models are often separated. In the design
phase, processes are modelled by business designers. When the models are
passed on to the business implementers, the implementers might decide to
implement the processes differently than defined by the business designer,
simply because not all the details are provided in the design or certain
design decisions have not been made. This leads to inconsistency between
the designed process models and the executed process models. By involving
process implementers and process users in the design process, designers can
obtain more information from the implementers and process users, and both
parties can react to design decisions that are made by the designers.
Process improvement can also benefit from social software. During the
process evaluation phase, not all information inside the process might be taken
into account because of the information pass-on threshold [28]. The
information pass-on threshold means that users might not share all the information
with the designer. This might be the case when a user thinks it is too
much effort to document all the information, or when the user thinks that
the information is not useful for the designer. Social software can provide
the users with interfaces for sharing the information and possibilities for
discussing the information with a process designer.
Below we give an overview of possibilities that are provided by social software
for each of the BPM lifecycle phases, as introduced in Figure 2.1:
Design
In the design phase, not only process designers but also process users
are able to define requirements and comment on process ideas. By
collaborating with process users, process designers get better insight
into the process, and users agree with the end result, since they have
participated in the design of the process, instead of being forced to
commit to an imposed business process.
Implementation
In the implementation phase, social software may be used for sharing
issues with multiple implementers and process users. The deployment
schedule can also be broadcast using social software.
Enactment
In the enactment phase, social software can be used for optimizing
the communication between process users. Tasks can be spread by
using an online dashboard. Information can be passed on through the
dashboard, and tasks and comments can be exchanged using mobile
devices.
Evaluation
The evaluation phase can be improved by giving process users access
to this phase. Process users can share what they want to change in
the process and other users can comment and propose solutions for
these issues.
Tools that can be used for enabling BPM as social software are shared
modelling tools, blogs, wikis, fora, user rating systems and data exchange
services, as proposed in [28].
4.5 Environments
On the internet, about 15 BPM cloud providers can be found. The products
that are offered by the providers range from complete BPM suites to simple
modelling tools for business processes. In this section we give an overview
of available systems and we briefly discuss some of these products, to give
an idea of the variety of products available.
4.5.1 Product overview
Table 4.1 gives an overview of some of the BPM cloud products currently
available on the market. For each of the cloud products the cloud service
models and the cloud types were identified.
4.5.2 Oracle Fusion
Fusion is a middleware solution offered by Oracle [26]. With Fusion,
enterprises can create their own private cloud, based on the PaaS and SaaS
service models. The middleware has support for running SOA-based
applications. In addition, the middleware offers the Oracle BPEL Process Manager
in which business processes can be composed and deployed.

Product                          Service Model        Cloud Type
Oracle Fusion                    PaaS, SaaS           Private
Appian BPM Suite                 PaaS, SaaS           Private or Public
Fujitsu InterstageBPM.com        PaaS, SaaS           Private or Public
Barium Live                      SaaS                 Public
Pega Business Process Cloud      SaaS, limited PaaS   Public
PNM Soft Sequence Cloud BPM      SaaS                 Public
Elite BPM Cloud                  IaaS, PaaS, SaaS     Public
Billfish BPM                     PaaS, SaaS           Private or Public
Cloud Harbor BOP                 PaaS, SaaS           Private or Public
Cordys                           PaaS                 Private
Intalio BPMS                     IaaS, PaaS, SaaS     Private
Adeptia On Demand BPM            SaaS                 Public

Table 4.1: Overview of BPM cloud products
4.5.3 Appian
Appian BPM Suite [7] is a complete BPMS running in the cloud. The suite
consists of cloud, mobile and social BPM solutions [8]. Appian offers both
on-premise and cloud-based BPM, and gives users the freedom to port their
solutions between cloud and on-premise. The suite offers, amongst others,
a process engine, content sharing, discussion boards, SOA integration tools
and SharePoint integration.
Appian BPM Suite is considered to be a social BPM, since all process users
can participate in the design, enactment and optimization phase of the BPM
process. Users can discuss their findings in discussion boards and can share
content with each other by using the cloud. In addition, Appian offers
mobile applications to users. These applications notify users when a new
task comes in, or give users a reminder when a certain activity failed or
needs attention.
Appian offers several guarantees to its customers: 99.5% uptime,
SAS-70 Type II infrastructure audit reports, SAML and LDAP/AD
integration for secure authentication, SSL encryption for communication between
systems and compliance with national data privacy laws through local
hosting.
4.5.4 Fujitsu InterstageBPM.com
InterstageBPM.com is the BPM platform of Fujitsu [15]. The BPM platform
is available as a cloud service or as a standalone product which can be
deployed on-premise. Cloud users are offered a PaaS- and SaaS-based cloud
solution where all of the phases in the BPM lifecycle are fully supported
by the platform. The public variant of the cloud platform is hosted in the
datacenter of Fujitsu.
4.5.5 Barium Live
Barium Live is a public SaaS-based BPM solution with two tools:
1. An online design tool in which multiple users can collaborate. Multiple
users can work simultaneously on the same business process. The
design tool works with a versioning system. When a user changes a
process, the process can be checked in again and a new version of the
process is stored in the cloud. Users can browse through the different
versions of the process and can merge different versions of the process.
2. A process enactment tool. Process users can log on to the Barium
Live software and get an overview of the tasks they have to perform.
Barium does not integrate with information systems; instead, it can
only be used as a workflow engine for managing human-based tasks.
4.6 Conclusion
In this chapter, we reported on the current state of the art with respect to
BPM in the cloud. Reduction of upfront investments and scalability have
been identified as the benefits of BPM solutions in the cloud. Security of
data is the main reason for not placing sensitive data in the cloud. Placement
of activities that are non-computation-intensive in the cloud might also lead
to high costs when BPM is placed in the cloud. As a solution, an approach
has been presented in which data is securely stored on-premise and only
computation-intensive activities are placed in the cloud.
Offering BPM as social software has been identified as a promising approach,
in which improvement of communication and collection of knowledge have
been identified as its main benefits.
We have introduced four available commercial cloud-based BPM products.
In addition, we introduced a table with an overview of existing BPM cloud
solutions. Unfortunately, there is not a lot of information about how these
products handle security issues to offer a secure environment to the cloud
user.
There are currently no open-source public cloud-based BPM solutions
available. There are some companies that offer open-source BPM solutions, such
as Intalio BPMS, but these products can only be used as a private cloud
solution. Since private clouds are only available within the borders of an
organization, less strict security measures are necessary when compared to
deployment in a public cloud.
Chapter 5
Research directions
We identify research directions by further discussing the results of the liter-
ature study.
5.1 BPEL engine deployment in the cloud
The deployment of a BPEL engine in the cloud according to one of the three
service models is discussed in [6]. Several problems can occur when such an
engine is moved to the cloud. The research in [6] is mainly based on literature
and small experiments. Based on [6], it would be interesting to take an
open-source BPEL engine, deploy it onto one of the service models and extend
it in such a way that each of the identified security issues is solved, presenting
a solid solution to cloud users who want to benefit from secure workflow
engines in the cloud.
A possible approach might be to investigate the structure of an open-source
BPEL engine and identify the modules that need to be updated in order
to make it suitable for the cloud. Extensions need to be developed to solve
security issues. The next step would be to deploy the system to the cloud and
perform advanced security tests to demonstrate that the system is secure.
5.2 Social BPM
Social BPM is one of the trends in BPM development, as mentioned in [28].
By offering BPM as a social solution, not only process designers but also
process users can be involved in the whole BPM lifecycle. The benefits of
social BPM have been investigated in [28]. Social BPM might influence
the BPM lifecycle. For example, collection of data for each of the phases
might be optimized since social BPM offers tools for exchanging information.
These influences can be researched in more detail. In addition, concerns
such as authorisation, stability, versioning, auditing and security need to be
investigated, in order to guarantee consistency.
5.3 Process decomposition
A distribution model in which business process activities can be placed
on-premise or in the cloud has been proposed in [22]. Four patterns were
identified for the distribution of a process engine, activities and data.
However, we can generalise this distribution and identify a fifth pattern, in which
process engines, activities and data are deployed on both the cloud and the
end-user side. This solution has two potential benefits:
1. A process engine regulates both control flow and data flow. An activity
receives data from the process engine and, after the execution of the
activity, the data that is produced by the activity is passed back to the
process engine. Now consider that a sequence of activities is placed in
the cloud, while the process engine is deployed at the end-user side.
Each activity uses the output data of the previous activity as input
data. The data is not directly passed from activity to activity, but is
sent back to the process engine first. An example of this situation is
shown in Figure 5.1. Since data transfer is one of the billing factors
of cloud computing, these situations can become expensive when large
amounts of data are transferred from activity to activity. To avoid this
problem, a process engine can be added to the cloud, which regulates
the control flow and data flow of the activities placed in the cloud.
When a sequence of activities is placed in the cloud, data is then
regulated by the process engine in the cloud, which reduces the amount of
data that needs to be transferred between on-premise and the cloud.
2. When the cloud is unavailable, users can run the business process
completely on-premise until the cloud is available again.
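The data-transfer argument of the first benefit can be illustrated with a
small calculation. The sketch below is only an illustration under simplifying
assumptions that are not taken from [22]: the process is a plain sequence,
every message has the same size, the process starts and ends on-premise, and
only data crossing the boundary between on-premise and the cloud is counted.
The function names and the size_mb parameter are hypothetical.

```python
def transfer_with_onpremise_engine(locations, size_mb):
    """Data crossing the boundary when one on-premise engine drives
    every activity: each cloud activity receives its input from the
    engine and sends its output back to the engine."""
    return sum(2 * size_mb for loc in locations if loc == "cloud")


def transfer_with_two_engines(locations, size_mb):
    """Data crossing the boundary when each side has its own engine:
    data only crosses when consecutive activities are on different sides."""
    crossings = 0
    previous = "on-premise"  # assumption: the process starts on-premise
    for loc in locations:
        if loc != previous:
            crossings += 1
        previous = loc
    if previous != "on-premise":  # assumption: the result returns on-premise
        crossings += 1
    return crossings * size_mb


# Four consecutive cloud activities with messages of 100 MB each.
sequence = ["cloud", "cloud", "cloud", "cloud"]
single = transfer_with_onpremise_engine(sequence, 100)
double = transfer_with_two_engines(sequence, 100)
print(single, double)
```

Under these assumptions, the single on-premise engine moves data across the
boundary twice per cloud activity, while the two-engine variant moves it only
when the sequence enters and leaves the cloud.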
In order to run a business process on two process engines, the process has to
be split up into two individual processes. It would be convenient for BPM
users to take a business process and an activity distribution list, which can
be transformed automatically into two individual business processes, one
for the cloud and one for the end-user process engine. The communication
between both systems can be described by using a choreography description
language.
The transformation described above is schematically represented in Figure
5.2. In addition, the distribution list can be created automatically according
to optimal distribution formulae, presented in [22].
Figure 5.1: Example of data-intensive process problems with distributed
activities (an on-premise process engine exchanging input and output with
activities a1 to a4 in the cloud)
Monitoring of the business process is now more complicated, since the
business process has been split up into two new business processes. As a solution, a
business process monitoring tool can be developed for monitoring the original
business process, by combining the monitored details of both individual
processes.
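One way to picture such a monitoring tool is as a merge of the event logs of
both process engines into a single timeline for the original process. The
sketch below is a minimal illustration, not a design taken from the
literature: the event format (timestamp, activity, status), the function name
and the assumption that both engines have synchronised clocks and produce
logs sorted by timestamp are all hypothetical.

```python
import heapq


def merge_event_logs(onpremise_events, cloud_events):
    """Combine two per-engine event logs into one timeline.

    Each event is a (timestamp, activity, status) tuple, and each log
    is assumed to be sorted by timestamp already. The result tags every
    event with the side on which it was recorded.
    """
    tagged_onpremise = ((t, "on-premise", a, s) for t, a, s in onpremise_events)
    tagged_cloud = ((t, "cloud", a, s) for t, a, s in cloud_events)
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(tagged_onpremise, tagged_cloud))


onpremise_log = [(1, "a1", "completed"), (4, "a3", "completed")]
cloud_log = [(2, "a2", "started"), (3, "a2", "completed")]
timeline = merge_event_logs(onpremise_log, cloud_log)
for event in timeline:
    print(event)
```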
A possible approach to handle process decomposition is to identify the struc-
ture and the semantics of the processes. When the control dependencies and
data dependencies are known, the consequences of moving certain activities
to either the end-user side or the cloud side can be investigated.
When the consequences of distributing activities are known, a model trans-
formation can be created in which a business process and a list with markings
is used for creating two individual business processes, one for the cloud and
one for the end-user side. In addition, a choreography description can be
generated to describe the communication between both the business pro-
cesses.
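For the simplest case, a purely sequential process, such a transformation can
be sketched as follows. This is a hypothetical illustration only: the function
name, the step labels ("invoke", "send", "receive") and the representation of
the choreography as a list of hand-overs are assumptions, and branching,
loops and data dependencies are not handled.

```python
def split_process(activities, distribution):
    """Split a sequential process into an on-premise and a cloud process.

    activities   -- ordered list of activity names
    distribution -- dict mapping each activity to "on-premise" or "cloud"
    Returns (processes, messages): the step list per side and the ordered
    hand-overs between the sides (a stand-in for a choreography).
    """
    processes = {"on-premise": [], "cloud": []}
    messages = []
    previous_side = None
    previous_activity = None
    for activity in activities:
        side = distribution[activity]
        if previous_side is not None and side != previous_side:
            # Control and data move to the other engine: add a send step
            # on the old side and a matching receive step on the new side.
            processes[previous_side].append(("send", previous_activity))
            processes[side].append(("receive", previous_activity))
            messages.append((previous_side, side, previous_activity))
        processes[side].append(("invoke", activity))
        previous_side = side
        previous_activity = activity
    return processes, messages


# A hypothetical distribution list for six activities.
activities = ["a1", "a2", "a3", "a4", "a5", "a6"]
distribution = {
    "a1": "on-premise", "a2": "cloud", "a3": "on-premise",
    "a4": "cloud", "a5": "on-premise", "a6": "on-premise",
}
processes, messages = split_process(activities, distribution)
print(processes["cloud"])
print(messages)
```

Each hand-over in messages corresponds to one interaction that a choreography
description would have to specify between the two process engines.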
Figure 5.2: Schematic representation of separating business processes. A
transformation takes a process (activities a1 to a6) and a distribution list
and produces an end-user process (a1, a3, a5, a6) and a cloud process (a2, a4).
Each resulting process is an orchestration; the communication between them is
a choreography.
Distribution list
Activity   On-premise   Cloud
a1         X
a2                      X
a3         X
a4                      X
a5         X
a6         X
Chapter 6
Conclusion
In this report, we investigated the combination of BPM and cloud computing.
We discussed both cloud computing and BPM and gave an overview of
literature that discusses their combination.
BPM has been introduced by identifying the four phases of the BPM
lifecycle. In addition, three standard languages for designing and implementing
business processes have been briefly discussed. We explained the structure
of a BPMS and identified concepts within BPM that are relevant for this
work.
We explained cloud computing by giving an overview of the three service
models, and the specific benefits and drawbacks of each of these service
models. Four cloud types were identified, and products of three cloud providers,
namely Amazon, Google and Microsoft, have been introduced.
We discussed the most relevant combinations of BPM and cloud computing
found in the literature. The deployment of an orchestrated application onto
each of the cloud service models was discussed. The distribution of data and
activities within a process was also discussed by looking into four
deployment patterns. We also investigated the impact of offering BPM software
as social software and gave an overview of existing cloud products.
We identified three possible research directions. The first direction is the
deployment of a workflow engine in the cloud, where several security measures
need to be taken to offer a business process platform as a service. The
second direction is to investigate the impact of offering BPM as a social service.
In the third direction, we introduced a new distribution pattern in which
two process engines are applied, one in the cloud and one on-premise. A
business process can be annotated with a distribution schema and can then
be transformed automatically into a business process for the
cloud-based process engine and one for the on-premise process engine.
It was quite difficult to find scientific papers about the combination of cloud
computing and BPM. We looked into several journals and browsed through
proceedings of conferences about cloud computing and BPM, such as
CLOUD, BPM and CLOSER. Papers about the subject are scarce, but
several conference websites ask for submission of papers about this subject,
which indicates that the subject is still interesting for further research.
Bibliography
[1] A. Alves, A. Arkin, S. Askary, B. Bloch, F. Curbera, Y. Goland,
N. Kartha, Sterling, D. Konig, V. Mehta, S. Thatte, D. van der Rijn,
P. Yendluri, and A. Yiu. Web Services Business Process Execution
Language Version 2.0. OASIS Committee, 2007.
[2] Amazon. Amazon Elastic Compute Cloud (Amazon EC2). http://
aws.amazon.com/ec2/, Dec. 2011.
[3] Amazon. Amazon Simple Queue Service (Amazon SQS). http://aws.
amazon.com/sqs/, Dec. 2011.
[4] Amazon. Amazon Simple Storage Service (Amazon S3). http://aws.
amazon.com/s3/, Dec. 2011.
[5] Amazon. Amazon SimpleDB. http://aws.amazon.com/simpledb/,
Dec. 2011.
[6] T. Anstett, F. Leymann, R. Mietzner, and S. Strauch. Towards BPEL
in the cloud: Exploiting different delivery models for the execution of
business processes. In Proceedings of the 2009 Congress on Services - I,
pages 670-677, Washington, DC, USA, 2009. IEEE Computer Society.
[7] Appian. BPM in the Cloud - It's Time, It's Safe... It's Smart, Oct. 2011.
[8] Appian. Cloud, Mobile and Social BPM, Mar. 2011.
[9] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz,
A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia.
Above the clouds: A Berkeley view of cloud computing. Technical
Report UCB/EECS-2009-28, EECS Department, University of California,
Berkeley, Feb 2009.
[10] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach,
M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed
storage system for structured data. In Proceedings of the 7th Conference
on USENIX Symposium on Operating Systems Design and
Implementation - Volume 7, pages 205-218, 2006.
[11] D. Chappell. Introducing the Windows Azure Platform. http:
//www.davidchappell.com/writing/white_papers/Introducing_
the_Windows_Azure_Platform,_v1.4--Chappell.pdf, Oct. 2010.
[12] D. Chappell. Introducing Windows Azure. http://www.
davidchappell.com/writing/white_papers/Introducing_
Windows_Azure,_v1.3--Chappell.pdf, Oct. 2010.
[13] J. Dean and S. Ghemawat. MapReduce: Simplified data
processing on large clusters. In OSDI'04: Proceedings of the 6th Conference
on Symposium on Operating Systems Design and Implementation.
USENIX Association, 2004.
[14] E. Deelman, G. Singh, M. Livny, B. Berriman, and J. Good. The cost
of doing science on the cloud: the Montage example. In Proceedings
of the 2008 ACM/IEEE Conference on Supercomputing, SC '08, pages
50:1-50:12, Piscataway, NJ, USA, 2008. IEEE Press.
[15] Fujitsu. Interstage Cloud BPM, 2009.
[16] K. Gibbs. Google App Engine Campfire One transcript. http:
//code.google.com/intl/nl/appengine/articles/cf1-text.html,
Apr. 2008.
[17] Google. The administration console. http://code.google.com/intl/
nl/appengine/docs/adminconsole/index.html, Dec. 2011.
[18] Google. GQL reference. http://code.google.com/intl/nl/
appengine/docs/python/datastore/gqlreference.html, Dec. 2011.
[19] Google. What is Google App Engine? http://code.google.com/intl/
nl/appengine/docs/whatisgoogleappengine.html, Dec. 2011.
[20] O. M. Group. BPMN 2.0 by Example Version 1.0 (non-normative).
http://www.omg.org/spec/BPMN/2.0/examples/PDF, Jan. 2002.
[21] O. M. Group. Business Process Model and Notation (BPMN) Version
2.0. http://www.omg.org/spec/BPMN/2.0/PDF, Jan. 2011.
[22] Y.-B. Han, J.-Y. Sun, G.-L. Wang, and H.-F. Li. A cloud-based BPM
architecture with user-end distribution of non-compute-intensive
activities and sensitive data. J. Comput. Sci. Technol., 25(6):1157-1167,
2010.
[23] H. Jin, S. Ibrahim, T. Bell, W. Gao, D. Huang, and S. Wu. Cloud types
and services. In B. Furht and A. Escalante, editors, Handbook of Cloud
Computing, pages 335-355. Springer US, 2010.
[24] N. Kavantzas, D. Burdett, G. Ritzinger, T. Fletcher, Y. Lafon, and
C. Barreto. Web Services Choreography Description Language Version
1.0. World Wide Web Consortium, Candidate Recommendation
CR-ws-cdl-10-20051109, 2005.
[25] P. Mell and T. Grance. The NIST Definition of Cloud Computing.
National Institute of Standards and Technology, 53(6):50, 2009.
[26] Oracle. Platform-as-a-Service Private Cloud with Oracle Fusion Mid-
dleware, Oct. 2009.
[27] M. P. Papazoglou. Web Services - Principles and Technology. Prentice
Hall, 2008.
[28] R. Schmidt and S. Nurcan. BPM and social software. In Business
Process Management Workshops, pages 649-658, 2008.
[29] K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The Hadoop Dis-
tributed File System, 2010.
[30] M. Weske. Business Process Management: Concepts, Languages, Ar-
chitectures. Springer, 2007.
[31] S. White. Introduction to BPMN. http://www.bpmn.org/Documents/
Introduction%20to%20BPMN.pdf, 2004.
[32] Q. Zhang, L. Cheng, and R. Boutaba. Cloud computing:
state-of-the-art and research challenges. Journal of Internet Services and
Applications, 1:7-18, 2010. 10.1007/s13174-010-0007-6.