Back to index
Distributed Snapshots: Determining Global States of Distributed Systems
K.M. Chandy (UT Austin) and Leslie Lamport (SRI)
One-line summary:
Elegant formal characterization of distributed computation, which
supports a proof that the proposed snapshot algorithm captures something
that can meaningfully be called "global state" of a DS. Also shows how
this algorithm can be used to detect stable properties, e.g. deadlock.
Overview/Main Points
Important definition/assumption (my paraphrase): the only ordering that
can be imposed on a distributed computation is that resulting from
treating send/receive of messages as events tht modify the state of a
single process or comm. channel. Any permutation of the events that
respects this partial ordering results in the system performing "the
same" distributed computation. Read the paper for the formal defs.
Global state detection algorithm (any process may initiate at will):
- p sends marker m along each channel
c that it communicates on (or multicast, etc.)
- When I receive m on channel c:
- If I have not recorded my state, then record state,
mark the state of c as the empty sequence, and
forward marker m.
- Otherwise, record state of c as sequence of messages
received on c since I recorded my state.
- Simple to prove that algorithm terminates in finite time, if
communication graph is strongly connected and channels have
finite latency.
Properties of the recorded global state:
- It does not necessarily correspond to any "real" global state of
the system (i.e. one that ever happened at a given global time)
- However, it does correspond to a possible global state
that is consistent with the ordering imposed by message sends and
receives.
- Formally: If the state recorded is S*, the sequence of
distributed computations done by the system is seq, and
the "true" initial and
final states of the system are Si and Sf, then:
- S* is reachable from Si, and Sf is reachable from S*;
- There exists a computation seq* which is a
permutation of seq,
- Either S*=Si or Si<S* in seq* (< means
"occurs earlier than"),
- Either Sf=S* or S*<Sf in seq*.
- Proof proceeds by distinguishing prerecording from
postrecording events in each process, and showing that
seq* can be constructed from seq without changing
the "ordering-ness" of subsequences of seq.
Stable property detection:
- Let y(S) be a function that returns
true iff global state S exhibits some stable property.
- Note that
just because y(S) is false at the beginning of the state-detection
algorithm, it may not be false at the termination of the
algorithm.
- Algorithm: Record some global state S*; then y(S*) will return
true iff the stable property holds, since
- S* is reachable from Si (shown in proof),
- Sf is reachable from S* (shown in proof),
- y(S) implies y(S') for all S' reachable from S
(definition of a stable property of a DS).
Relevance
A particular formal characterizaton of distributed computation, an
algorithm for recording global state, proof that the recorded state is
meaningful, and proof that stable property detection can be done this
way. Formal characterizations always make me feel better.
Flaws
Ordering criteria used as assumptions (i.e. based only on message
sends/receives) may not be appropriate for all DS's.
Back to index