Dynamically Resilient Systems
Introduction
In the future, important social and economic services will be delivered
to users in a personalised way by pervasive, network-enabled
information systems. The architectures of such systems will be open to
the addition and modification of components, and will have to be able
to assimilate change in a predictable way without dysfunction. But how
resilient will they be when the environment, components or
infrastructure change, in a crisis (the “surge response”) or even
through normal evolution? Some technologies already exist to help us
build resilient systems, but they fail to provide a means of ensuring
resilience in this volatile environment because they require fixed
(static) levels of redundancy to deal with specific failures identified
during design. This collaboration will investigate and develop the
embryonic technologies needed to build systems which are free to evolve
dynamically, but remain predictably resilient.
Approach
System architects should be able to design and validate open, dynamic
component-based systems that achieve predictable dynamic resilience
through run-time architecture adaptation governed by resilience
policies and triggered by trustworthy metadata.
Fundamental advances are needed in several areas in order to realise
this vision. We are working on the following points:
Dynamic Resilience Mechanisms (design time): DRMs are architectural
patterns that use run-time information to maintain resilience through
adaptation, e.g. by dynamically composing a satisfactory service from
lower-specification components. DRMs are generic (not
application-specific) but are realised in application-specific designs
as resilience policies. In our scenario, the pattern allowing dynamic
selection and parallel composition of services to maintain availability
is an example of such a mechanism. So far, relatively few DRMs have
been identified, and only on an ad hoc basis; we have no systematic way
of describing and reasoning about such mechanisms during design.
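The dynamic selection and parallel composition pattern mentioned above can be illustrated with a minimal sketch. It assumes that component failures are independent, so that a parallel composition is available whenever at least one member is, and it uses a simple greedy selection. The function names, component names and availability figures are hypothetical and are not taken from the project.

```python
from math import prod


def parallel_availability(availabilities):
    """Availability of a parallel composition.

    Assumption: failures are independent, so the composition is
    unavailable only when every member is unavailable at once.
    """
    return 1.0 - prod(1.0 - a for a in availabilities)


def compose_for_target(candidates, target):
    """Greedily add components (highest availability first) until the
    parallel composition reaches the target availability.

    candidates: dict mapping a component name to its availability in [0, 1].
    Returns the list of chosen names, or None if the target is unreachable.
    """
    chosen = []
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    for name, _avail in ranked:
        chosen.append(name)
        if parallel_availability(candidates[n] for n in chosen) >= target:
            return chosen
    return None


# Hypothetical lower-specification components.
candidates = {"a": 0.9, "b": 0.8, "c": 0.5}
selection = compose_for_target(candidates, target=0.97)
```

Here no single component meets the 0.97 target, but the composition of "a" and "b" does, which is the sense in which a satisfactory service is composed from lower-specification components.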
Predictable Resilience by Policies (run-time): The architect must
be able to define application-specific resilience policies that
implement dynamic resilience mechanisms. Policies should be capable of
being analysed formally in advance of deployment in order to
confirm that they will achieve the resilience properties required by
the application. This in turn implies that the system architecture and
the resilience policy have to be expressed sufficiently formally to
give confidence in the outcome of analyses about whether a particular
adaptation is viable.
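As a rough illustration of analysing a policy before deployment, the sketch below exhaustively checks a very simple policy ("keep a parallel composition above a target availability") against every scenario in which a bounded number of components is lost. Exhaustive enumeration is only a stand-in for the formal analysis described above, and the independence assumption, function names and figures are all hypothetical.

```python
from itertools import combinations
from math import prod


def availability(avails):
    """Parallel-composition availability, assuming independent failures."""
    return 1.0 - prod(1.0 - a for a in avails)


def policy_survives(pool, target, max_failures):
    """Check, ahead of deployment, that the target availability is still met
    after the loss of any set of up to max_failures components.

    pool: dict mapping a component name to its availability in [0, 1].
    Returns (True, None) if every scenario passes, otherwise
    (False, lost) where lost is the first failing set of lost components.
    """
    names = list(pool)
    for k in range(max_failures + 1):
        for lost in combinations(names, k):
            remaining = [pool[n] for n in names if n not in lost]
            if availability(remaining) < target:
                return False, lost
    return True, None


pool = {"a": 0.9, "b": 0.9, "c": 0.9}
ok_single, _ = policy_survives(pool, target=0.98, max_failures=1)
ok_double, counterexample = policy_survives(pool, target=0.98, max_failures=2)
```

With these figures the policy tolerates any single loss but not a double loss, and the returned counterexample names the components whose loss makes the adaptation non-viable.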
Trustworthy Resilience Metadata: Resilience policies, executed at
runtime in an open and dynamic system, require appropriate metadata:
information about the running system (its components, infrastructure
and environment). For predictable dynamic resilience, we require
metadata conveying functional information (e.g. pre/postconditions,
represented by logical formulae or informal descriptions) and
non-functional information (e.g. availability, represented by
structured values) relevant to resilience. Metadata is continuously
updated.
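One possible shape for such a metadata record is sketched below. The field names, the use of plain strings for pre/postconditions, and the freshness check are illustrative assumptions, not the representation used by the project.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ResilienceMetadata:
    """Hypothetical metadata record for one component."""
    component: str
    precondition: str       # functional: logical formula or informal text
    postcondition: str      # functional: logical formula or informal text
    availability: float     # non-functional: structured value in [0, 1]
    updated_at: float = field(default_factory=time.time)

    def update_availability(self, value, now=None):
        """Record a new availability measurement and its timestamp."""
        if not 0.0 <= value <= 1.0:
            raise ValueError("availability must lie in [0, 1]")
        self.availability = value
        self.updated_at = time.time() if now is None else now

    def is_fresh(self, max_age_seconds, now=None):
        """True if the record was updated within the last max_age_seconds."""
        now = time.time() if now is None else now
        return now - self.updated_at <= max_age_seconds


record = ResilienceMetadata(
    component="storage-1",
    precondition="request.size <= quota",
    postcondition="stored(request)",
    availability=0.95,
    updated_at=1000.0,
)
record.update_availability(0.90, now=1060.0)
```

Keeping a timestamp alongside each value lets a policy ignore stale metadata, which is one simple way of acting only on information that is still being updated.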
Service-Oriented Architecture supporting Reasoning and Adaptation Services:
Architectures supporting dynamic resilience must include computation,
reasoning and adaptation services that are strong enough to work over
the metadata needed to implement the adaptation policies. These must be
backed up with other services to perform component searches and enact
adaptation with minimal disruption as described by the resilience
policy.
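The interplay between a component-search service and an adaptation service might look roughly like the following sketch. The Registry class, the adapt function and all component names are hypothetical. "Minimal disruption" is approximated here by replacing only the failed binding and leaving every other binding untouched.

```python
class Registry:
    """Hypothetical component-search service backed by published metadata."""

    def __init__(self):
        self._entries = {}  # name -> (kind, availability)

    def publish(self, name, kind, availability):
        self._entries[name] = (kind, availability)

    def search(self, kind, min_availability):
        """Names of components of the given kind, best availability first."""
        matches = [
            (avail, name)
            for name, (k, avail) in self._entries.items()
            if k == kind and avail >= min_availability
        ]
        return [name for _avail, name in sorted(matches, key=lambda m: -m[0])]


def adapt(bindings, registry, failed, kind, min_availability):
    """Replace one failed binding with the best available alternative.

    Returns a new list of bound component names in which only the failed
    entry has changed, or None if no suitable replacement is published.
    """
    for candidate in registry.search(kind, min_availability):
        if candidate != failed and candidate not in bindings:
            return [candidate if b == failed else b for b in bindings]
    return None


registry = Registry()
registry.publish("s1", "storage", 0.90)
registry.publish("s2", "storage", 0.95)
registry.publish("s3", "storage", 0.70)
registry.publish("c1", "compute", 0.99)

new_bindings = adapt(["s1", "c1"], registry, failed="s1",
                     kind="storage", min_availability=0.8)
```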
Applications
Papers
- G. Di Marzo Serugendo, J. Fitzgerald, "Designing and Controlling
Trustworthy Self-Organising Systems", Perada Magazine, to appear,
April 2009.
- G. Di Marzo Serugendo, J. Fitzgerald, A. Romanovsky, N. Guelfi,
"MetaSelf - A Framework for Designing and Controlling Self-Adaptive and
Self-Organising Systems", BBKCS-08-08, Technical Report, School of
Computer Science and Information Systems, Birkbeck College, London, UK,
December 2008.
- G. Di Marzo Serugendo, J. Fitzgerald, A. Romanovsky, N. Guelfi,
"A Generic Framework for the Engineering of Self-Adaptive and
Self-Organising Systems", CS-TR-1018, Technical Report, School of
Computing Science, University of Newcastle, Newcastle, UK,
April 2007. Also currently submitted.
- G. Di Marzo Serugendo, J. Fitzgerald, A. Romanovsky, N. Guelfi,
"A Metadata-Based Architectural Model for Dynamically Resilient
Systems", Proceedings of the 2007 ACM Symposium on Applied Computing
(SAC), Seoul, Korea, March 11-15, 2007, pp. 566-573, ACM 2007.
- G. Di Marzo Serugendo, J. Fitzgerald, A. Romanovsky, N. Guelfi,
"Dependable Self-Organising Software Architectures - An Approach for
Self-Managing Systems", BBKCS-06-05, Technical Report, School of
Computer Science and Information Systems, Birkbeck College, London, UK,
May 2006.
Partners
- University of Newcastle
- University of Luxembourg
- Birkbeck College
G. Di Marzo Serugendo
Feb 2009