DeDiSys

Trade-offs between Improved Availability of the Distributed Systems and Service Integrity Constraints

 

Distributed systems are widespread today, ranging from applications present in daily life such as banking or health care applications to highly specialised distributed systems used in control engineering and air traffic control. The key element for achieving scalable and maintainable distributed software systems is dependability, because otherwise the complexity of distribution would leave the system uncontrollable.

Data in distributed systems is not stored at a single location, nor is data processing performed by only one computer. Such interconnected systems are far more susceptible to failures than non-distributed ones: if just one of the many computers fails, or if a single network link goes down, the system as a whole may become unavailable.

The most commonly used approach to improve availability is to replicate services and data to several locations in the network, so that at least one copy remains available in case of a failure.
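The effect of replication on availability can be illustrated with a short calculation. The following Python sketch is not part of DeDiSys; it assumes that every node is up with the same probability `a` and that nodes fail independently of each other, which real systems only approximate. The function names are chosen for illustration.

```python
def series_availability(a: float, n: int) -> float:
    """Availability of a system that works only if all n nodes are up.

    Assumption: each node is up with probability a, failures are independent.
    """
    return a ** n


def replicated_availability(a: float, n: int) -> float:
    """Availability of a service that works if at least one of n replicas is up.

    Assumption: each replica is up with probability a, failures are independent.
    """
    return 1.0 - (1.0 - a) ** n


if __name__ == "__main__":
    a = 0.99  # illustrative per-node availability, not a measured value
    for n in (1, 3, 10):
        print(n, series_availability(a, n), replicated_availability(a, n))
```

Under these assumptions, a system that depends on every node becomes less available as nodes are added, while a replicated service becomes more available with each additional copy.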

Different kinds of consistency are distinguished in distributed and replicated systems. Replica consistency defines the correctness of replicas, i.e. it is a measure of how much replicas of the same logical entity may differ from each other. Concurrency consistency is a correctness criterion for concurrent access to a particular data item (isolation). Constraint consistency defines the correctness of the system state with respect to a set of data integrity rules.

If consistency has to be ensured at all times (e.g. in banking applications), even in the presence of failures, the system becomes unavailable in degraded scenarios. However, there are applications (e.g. safety-critical systems) in which consistency can be relaxed in order to achieve higher availability.

The aim of DeDiSys is to investigate the optimum between these two extremes: can some of the data and service integrity constraints be (temporarily) relaxed to gain improved availability of the distributed system?
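The idea of temporarily relaxing integrity constraints can be sketched in a few lines of Python. This is a simplified illustration, not the DeDiSys middleware: the classes `Constraint` and `Replica`, the `relaxable` flag, the `degraded` switch and the `reconcile` method are all hypothetical names. The sketch also assumes that a replica can evaluate a constraint locally, whereas in a partitioned system the data needed for the check may itself be unreachable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass
class Constraint:
    name: str
    check: Callable[[State], bool]  # returns True if the state is consistent
    relaxable: bool                 # hypothetical flag: may be violated while degraded


@dataclass
class Replica:
    state: State
    constraints: List[Constraint]
    degraded: bool = False          # True while a node or link failure is present
    pending: List[str] = field(default_factory=list)  # constraints to re-check later

    def update(self, key: str, value: int) -> bool:
        """Apply an update unless it violates a constraint that must hold now."""
        candidate = {**self.state, key: value}
        deferred = []
        for c in self.constraints:
            if c.check(candidate):
                continue
            if self.degraded and c.relaxable:
                deferred.append(c.name)  # accept now, reconcile later
            else:
                return False             # reject: consistency takes priority
        self.state = candidate
        for name in deferred:
            if name not in self.pending:
                self.pending.append(name)
        return True

    def reconcile(self) -> List[str]:
        """After the failure is repaired, report deferred constraints still violated."""
        self.degraded = False
        violated = [c.name for c in self.constraints
                    if c.name in self.pending and not c.check(self.state)]
        self.pending.clear()
        return violated


no_overbooking = Constraint(
    "no_overbooking", lambda s: s["booked"] <= s["capacity"], relaxable=True)
replica = Replica(state={"booked": 9, "capacity": 10}, constraints=[no_overbooking])

rejected_when_healthy = replica.update("booked", 11)   # constraint enforced
replica.degraded = True
accepted_when_degraded = replica.update("booked", 11)  # constraint relaxed
still_violated = replica.reconcile()                   # inconsistencies to resolve
```

In healthy mode the overbooking update is rejected, so the system stays consistent but refuses work. In degraded mode the same update is accepted, and the violated constraint is reported during reconciliation, where an application-specific repair would have to take place.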

 

Goals of the project:

Investigation and metrics-based evaluation of the trade-off between availability and consistency.

System models for data-centric systems (middleware and distributed objects) and service-centric systems (Grid, peer-to-peer, Web Services).

Implementation of middleware extensions that facilitate:

– Replication of data and services in distributed systems.

– Flexible and controllable resilience to failures.

– Enforcement of data consistency constraints.

– Automated reconciliation of inconsistencies.

– Validation, comparison, assessment: Exploring the best combination of model and technology (EJB, CORBA, Microsoft COM/COM+/.NET, JXTA