Supporting Quality of Service in Scientific Workflows
|Summary:||While workflow management systems have been utilized in enterprises to support
businesses for almost two decades, the use of workflows in scientific environments
was fairly uncommon until recently. Nowadays, scientists use workflow systems to
conduct scientific experiments, simulations, and distributed computations. However,
most scientific workflow management systems have not been built using existing
workflow technology; rather, they have been designed and developed from
scratch. Due to the lack of generality of early scientific workflow systems, many
domain-specific workflow systems have been developed. Generally speaking, those
domain-specific approaches lack common acceptance and tool support and offer
lower robustness compared to business workflow systems.
In this thesis, the use of the industry standard BPEL, a workflow language
for modeling business processes, is proposed for the modeling and the execution of
scientific workflows. Due to the widespread use of BPEL in enterprises, a number
of stable and mature software products exist. The language is expressive (Turing-complete)
and not restricted to specific applications. BPEL is well suited for the
modeling of scientific workflows, but existing implementations of the standard lack
important features that are necessary for the execution of scientific workflows.
This work presents components that extend an existing implementation of the
BPEL standard and eliminate the identified weaknesses. The components thus provide
the technical basis for the use of BPEL in academia. The particular focus is on
so-called non-functional (Quality of Service) requirements. These requirements include
scalability, reliability (fault tolerance), data security, and cost (of executing a
workflow). From a technical perspective, the workflow system must be able to interface
with the middleware systems that are commonly used by the scientific workflow
community to allow access to heterogeneous, distributed resources (especially Grid
and Cloud resources).
The major components cover exactly these requirements:
Cloud Resource Provisioner: Scalability of the workflow system is achieved by
automatically adding (Cloud) resources to the workflow system’s
resource pool when the workflow system is heavily loaded.
Fault Tolerance Module: High reliability is achieved via continuous monitoring
of workflow execution and corrective interventions, such as re-execution of a
failed workflow step or replacement of the faulty resource.
Cost Aware Data Flow Aware Scheduler: The majority of scientific workflow
systems only take the performance and utilization of resources for the execution
of workflow steps into account when making scheduling decisions. The
presented workflow system goes beyond that. By defining preference values
for the weighting of costs and the anticipated workflow execution time,
workflow users may influence the resource selection process. The developed multi-objective
scheduling algorithm respects the defined weighting and makes both
efficient and advantageous decisions using a heuristic approach.
Security Extensions: Because it supports various encryption, signature, and authentication
mechanisms (e.g., Grid Security Infrastructure), the workflow
system guarantees data security in the transfer of workflow data.
Furthermore, this work identifies the need to equip workflow developers with
workflow modeling tools that can be used intuitively. This dissertation presents
two modeling tools that support users with different needs. The first tool, DAVO
(Domain-Adaptable, Visual BPEL Orchestrator), operates at a low level of abstraction
and allows users with knowledge of BPEL to use the full extent of the language.
DAVO is a software tool that offers extensibility and customizability for different application
domains. These features are used in the implementation of the second tool,
SimpleBPEL Composer. SimpleBPEL is aimed at users with little or no background
in computer science and allows for quick and intuitive development of BPEL workflows based on predefined components.|
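The summary describes a scheduler in which users weight cost against anticipated execution time to influence resource selection. The following is a minimal sketch of that idea using a plain weighted sum over normalized values. It is not the heuristic algorithm developed in the thesis, which the summary does not specify; the names `Resource`, `select_resource`, and `cost_weight`, as well as the example figures, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical candidate resource for one workflow step."""
    name: str
    cost: float         # assumed estimate of the monetary cost of the step
    est_runtime: float  # assumed estimate of the execution time in seconds


def _normalize(values):
    """Scale values linearly to [0, 1]; identical inputs all map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_resource(resources, cost_weight):
    """Return the resource with the lowest weighted score.

    cost_weight lies in [0, 1]; the remaining weight (1 - cost_weight)
    is applied to the estimated runtime. Lower scores are better.
    """
    if not resources:
        raise ValueError("at least one candidate resource is required")
    if not 0.0 <= cost_weight <= 1.0:
        raise ValueError("cost_weight must lie in [0, 1]")

    costs = _normalize([r.cost for r in resources])
    times = _normalize([r.est_runtime for r in resources])
    scores = [
        cost_weight * c + (1.0 - cost_weight) * t
        for c, t in zip(costs, times)
    ]
    best = min(range(len(resources)), key=scores.__getitem__)
    return resources[best]


# Illustrative candidates; the numbers are made up for this sketch.
candidates = [
    Resource("cloud-large", cost=4.0, est_runtime=100.0),
    Resource("cloud-small", cost=1.0, est_runtime=400.0),
    Resource("grid-node", cost=2.0, est_runtime=250.0),
]

cheapest = select_resource(candidates, cost_weight=1.0)
fastest = select_resource(candidates, cost_weight=0.0)
balanced = select_resource(candidates, cost_weight=0.5)
```

Normalizing both objectives before weighting keeps the preference value meaningful even though cost and time are measured in different units. A scheduler that is also data-flow aware would presumably add data transfer estimates to the score, which this sketch omits.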