Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

1. A new consensus protocol

Consider the following protocol for binary RandomizedConsensus using shared memory.

The shared data consists of 2n+1 registers a[1..2n+1], all initially set to ⊥.

Each process executes the following code:

preference ← input
while true do:
    read all 2n+1 registers
    if all registers have the same value v and v≠⊥:
        decide v
    if at least n+1 of the registers have the same value v and v≠⊥:
        preference ← v
    choose r uniformly at random and set a[r] ← preference

Does this algorithm solve consensus? Why or why not?

1.1. Solution

The algorithm does not solve consensus, because it violates agreement in some executions.

Here is such a bad execution. We use two processes p₀ and p₁ with preference 0 and 1 each, and a patsy process q₀ that reads only 0's

Process p₀ writes 0 to a[1..n].
Process q₀ reads 0 from a[1..n].
Process p₁ writes 1 to a[1..n].
Process p₀ writes 0 to a[n+1..2n].
Process q₀ reads 0 from a[n+1..2n] and stops.
Process p₀ writes 0 to a[1..n] interleaved with process p₁ writing 1 to a[n+1..2n], so that each time p₀ or p₁ reads all the registers, it sees exactly n 0's and n 1's.
Process p₀ and p₁ each read all the registers, see exactly n 0's and n 1's, and keep their preferences.
Process p₀ writes 0 to a[2n+1].
Process q₀ reads 0 from a[2n+1] and decides 0.
Process p₁ writes 1 to a[n+1]. We now have a majority of 1's in a[1..n+1], so process p₁ can be run to completion by itself and will eventually write 1's everywhere and decide 1.

This is not what was asked, but if we replace the collect of all the registers with an atomic snapshot (which effectively reads all the registers simultaneously), then the algorithm works, although it might not be especially fast. After an expected O(n log n) writes (see CouponCollector), every register has a non-⊥ value, and so at any time t there is some majority value v_t. But then there is a nonzero probability that any process with preference ¬v_t writes to a location that already contains ¬v_t (or to at most n distinct locations if none contain ¬v_t), in which case each of these processes sees the majority value v_t and adopts it. Once all processes have the same preference, after O(n log n) more writes on average, all registers are equal and all processes decide; this gives termination. For agreement, observe that if some process decides after seeing all registers holding v, then even if we let the other n-1 processes write ¬v to n-1 distinct registers, the majority value is still v, and so all processes will adopt it and eventually decide v as before.

2. Version control with quorums

The day after its Subversion server catches fire, a large software company decides to implement a new version control system with replicated data stores. Using this system, a user should be able to check out (read) a document and check in (write) an update if and only if no other updates have been checked in (by anybody) since the user's last checkout. They would like to implement this system in an asynchronous message-passing system with five servers, with the guarantee that the system will continue to work as long as no more than two of the servers fail (by crashing).

Describe a mechanism for implementing this system, or show that it can't be done without additional assumptions.

2.1. Solution

Can't do it, because of FLP. Here is a protocol for consensus:

Check out the document; if its value is â¥, check in input.
Check out the document again and return its value.

This satisfies termination and validity trivially, and satisfies agreement because the first process to successfully check in its input prevents any others from doing so, either because of the restriction on check-ins, or because they see a non-null value.

3. Paxos with failure detectors

Suppose we run the basic Paxos algorithm with a failure detector, with the rule that any acceptor will reject (responding with nack and not otherwise updating their state) all messages from a proposer unless the proposer has the smallest id among all proposers that the acceptor doesn't currently suspect. Suppose also that we modify the proposer's code so that it aborts a round and tries again with a higher round number if it receives nack from or suspects a majority of acceptors. Assuming some proposer and a majority of acceptors never crash, with which of the FailureDetectors <>S, S, <>P, and P will the algorithm always eventually terminate?

3.1. Solution

If we assume that there is at least one acceptor that is not a proposer, then the strong failure detectors <>S and S are both no good, because we might suspect all proposers forever. Even if every acceptor is also a proposer, for n > 2 we are still in trouble, because we might suspect a majority of acceptors forever. So if we are going to make this work, we will need to consider <>P or P.

For <>P, we have that eventually there is some time after which: (a) all crashed processes are suspected by all processes, and (b) all non-crashed processes are not suspected by any process. Let c be the lowest-id non-crashed proposer at this time. Then the acceptors will reject any proposals from proposers that aren't c, and so when c eventually chooses a large enough round number, it will complete the algorithm by amassing a majority of acceptors (since a majority of acceptors don't crash and are not suspected by c at this point). This works as long as c doesn't crash first; but if it does, then we can wait until it is suspected by all processes an move on to a new proposer c' that now has the lowest id. Iterating this process, we eventually reach a lowest-id proposer that never crashes, and this proposer finishes the protocol.

Since <>P works, P would work as well.

(This does not mean that we need âP to make Paxos work; it's possible to build Omega from <>S, but we have to be a little more clever about it.)