Reliability and fault-tolerance by choreographic design

journal contribution
posted on 24.05.2018, 09:15 by Ian Cassar, Adrian Francalanza, Claudio Antares Mezzina, Emilio Tuosto
Distributed programs are hard to get right because they are required to be open, scalable, long-running, and tolerant to faults. In particular, recent approaches to distributed software based on (micro-)services, where different services are developed independently by disparate teams, exacerbate the problem. In fact, services are meant to be composed together and run in open contexts where unpredictable behaviours can emerge. This makes it necessary to adopt suitable strategies for monitoring the execution and to incorporate recovery and adaptation mechanisms so as to make distributed programs more flexible and robust. The typical approach currently adopted is to embed such mechanisms in the program logic, which makes them hard to extract, compare and debug. We propose an approach that employs formal abstractions for specifying failure recovery and adaptation strategies. Although implementation-agnostic, these abstractions would be amenable to algorithmic synthesis of code, monitoring and tests. We consider message-passing programs (à la Erlang, Go, or MPI) that are gaining momentum both in academia and industry. Our research agenda consists of (1) the definition of formal behavioural models encompassing failures, (2) the specification of the relevant properties of adaptation and recovery strategies, and (3) the automatic generation of monitoring, recovery, and adaptation logic in target languages of interest.

Funding

Partially supported by EU COST IC1405 (Reversible Computation - Extending Horizons of Computing). The research work disclosed in this publication is partially funded by the ENDEAVOUR Scholarships Scheme. "The scholarship may be part-financed by the European Union - European Social Fund."

History

Citation

Electronic Proceedings in Theoretical Computer Science, EPTCS, 2017, 254, pp. 69-80

Author affiliation

College of Science and Engineering, Department of Informatics

Source

Second International Workshop on Pre- and Post-Deployment Verification Techniques (PrePost 2017), Torino

Version

VoR (Version of Record)

Published in

Electronic Proceedings in Theoretical Computer Science

Publisher

Open Publishing Association

issn

2075-2180

Copyright date

2017

Available date

24/05/2018

Publisher version

https://arxiv.org/abs/1708.07233v1

Language

en
