For more up-to-date notes see http://www.cs.yale.edu/homes/aspnes/classes/465/notes.pdf.
Flooding is about the simplest of all distributed algorithms. It's dumb and expensive, but easy to implement, and gives you both a broadcast mechanism and a way to build rooted spanning trees.
Here we give a fairly simple presentation of flooding roughly following Chapter 2 of AttiyaWelch. For a more formal presentation based on IOAutomata, see AsynchronousBroadcast.
1. Basic algorithm
Each process maintains a single bit of state, seen-message, initially false.
At time 0, root sets seen-message to true and sends M to all neighbors.
- Upon receiving M:
  - If seen-message = false:
    - seen-message ← true
    - send M to all neighbors
  - Else do nothing
- Claim
- Every process receives M after at most D time and at most |E| messages, where D is the diameter of the network and E is the set of (directed) edges in the network.
- Proof
- Message complexity
- Each process only sends M to its neighbors once, so each edge carries at most one copy of M.
- Time complexity
- By induction on d(root, v), we'll show that each v receives M for the first time no later than time d(root, v) ≤ D.
- Base case
- v = root, d(root, v) = 0, root receives message at time 0, OK.
- Induction step
Let d(root, v) = k > 0. Then v has a neighbor u such that d(root, u) = k-1. By the induction hypothesis, u receives M for the first time no later than time k-1. From the code, u then sends M to all of its neighbors, including v; M arrives at v no later than time (k-1)+1 = k.
Note that the time complexity proof also demonstrates correctness: every process receives M at least once.
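The claim can be checked with a small simulation. Here is a sketch in Python (the function name, helper structure, and example graph are mine, not from the notes) that runs the synchronous rounds and records when each process first sees M:

```python
def flood(neighbors, root):
    """Synchronous-round flooding simulation (illustrative sketch).

    neighbors: dict mapping each node to a list of its neighbors.
    Returns a dict mapping each node to the round in which it first received M.
    """
    first_seen = {root: 0}     # plays the role of seen-message
    frontier = [root]          # processes that saw M for the first time this round
    round_no = 0
    while frontier:
        round_no += 1
        next_frontier = []
        for u in frontier:
            for v in neighbors[u]:       # u sends M to all neighbors
                if v not in first_seen:  # seen-message was false
                    first_seen[v] = round_no
                    next_frontier.append(v)
        frontier = next_frontier
    return first_seen

# 2x2 grid with root 0: diameter D = 2
g = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
times = flood(g, 0)
print(times)  # {0: 0, 1: 1, 2: 1, 3: 2}: each v hears M by round d(root, v) <= D
```

Each process hears M no later than round d(root, v), matching the induction in the time-complexity proof.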
As written, this is a one-shot algorithm: you can't broadcast a second message even if you wanted to. The obvious fix is for each process to remember which messages it has seen and only forward the new ones (which costs memory) and/or to add a time-to-live (TTL) field on each message that drops by one each time it is forwarded (which may cost extra messages and possibly prevents complete broadcast if the initial TTL is too small). The latter method is what was used for searching in Gnutella, an early peer-to-peer system. An interesting property of Gnutella was that since the application of flooding was to search for huge (multiple MiB) files using tiny (~100 byte) query messages, the actual bit complexity of the flooding algorithm was not especially large relative to the bit complexity of sending any file that was found.
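To illustrate the TTL idea, here is a hypothetical Python sketch (my own simplification, not Gnutella's actual wire protocol): each copy of M carries a remaining TTL, a copy with TTL 0 is dropped instead of forwarded, and a too-small initial TTL leaves part of the network unreached.

```python
def ttl_flood(neighbors, root, ttl):
    """Flooding with a time-to-live field (illustrative sketch).

    Each process remembers that it has seen M; a copy arriving with
    remaining TTL 0 is not forwarded further.
    """
    reached = {root}
    frontier = [(root, ttl)]       # (process, remaining TTL on its copy of M)
    while frontier:
        nxt = []
        for u, t in frontier:
            if t == 0:
                continue           # TTL exhausted: drop instead of forwarding
            for v in neighbors[u]:
                if v not in reached:
                    reached.add(v)
                    nxt.append((v, t - 1))
        frontier = nxt
    return reached

# path 0-1-2-3: an initial TTL of 2 stops short of node 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(ttl_flood(path, 0, 2))  # {0, 1, 2}; node 3 is never reached
print(ttl_flood(path, 0, 3))  # {0, 1, 2, 3}
```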
We can optimize the algorithm slightly by not sending M back to the node it came from; this will slightly reduce the message complexity in many cases but makes the proof a sentence or two longer. (It's all a question of what you want to optimize.)
2. Adding parent pointers
To build a spanning tree, have each process remember who it first received M from:
Each process maintains a single pointer parent, initially ⊥.
At time 0, root sets parent to root and sends M to all neighbors.
- Upon receiving M from u:
  - If parent = ⊥:
    - parent ← u
    - send M to all neighbors
  - Else do nothing
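The parent-pointer variant can be sketched as an event-driven simulation (the helper below and its randomized delivery order are my assumptions for illustration, not part of the notes). Whatever order the in-transit messages are delivered in, the parent pointers end up forming a rooted spanning tree:

```python
import random

def flood_with_parents(neighbors, root, seed=1):
    """Event-driven sketch of the parent-pointer variant.

    In-transit copies of M are delivered in an arbitrary (here: random)
    order, so the result is *a* rooted spanning tree, not necessarily a
    shortest-path one.
    """
    rng = random.Random(seed)
    parent = {root: root}          # root points to itself; all others start at ⊥
    in_transit = [(root, v) for v in neighbors[root]]
    while in_transit:
        i = rng.randrange(len(in_transit))
        u, v = in_transit.pop(i)   # deliver one in-transit copy of M
        if v not in parent:        # parent = ⊥: adopt u and forward
            parent[v] = u
            in_transit.extend((v, w) for w in neighbors[v])
        # else: do nothing
    return parent

g = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
tree = flood_with_parents(g, 0)
assert set(tree) == set(g)                                 # every process got a parent
assert sum(p != v for v, p in tree.items()) == len(g) - 1  # |V|-1 non-self edges: a tree
```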
We can easily prove that this algorithm has the same termination properties as the previous one by observing that if we map parent to seen-message by the rule ⊥ → false, anything else → true, then we have the same algorithm. We would like one additional property, which is that when the algorithm quiesces (has no outstanding messages), the set of parent pointers form a rooted spanning tree. For this we use induction on time:
- Claim
- At any time during the execution of the modified flooding protocol, the following invariant holds:
1. If u.parent ≠ ⊥, then u.parent.parent ≠ ⊥ and following parent pointers gives a path from u to root.
2. If there is a message M in transit from u to v, then u.parent ≠ ⊥.
- Proof
- We have to show that any event preserves the invariant.
- Delivery event
M used to be in u.outbuf, now it's in v.inbuf, but it's still in transit and u.parent is still not ⊥. (1)
- Computation event
Let v receive M from u. There are two cases. If v.parent is already non-⊥, the only state change is that M is no longer in transit, so the invariant no longer needs u.parent ≠ ⊥ on account of this copy of M, and nothing else is affected. If v.parent = ⊥, then
v.parent is set to u. This triggers the first case of the invariant. From the induction hypothesis we have that u.parent ≠ ⊥ (second case, since M was in transit from u) and that following parent pointers gives a path from u to the root. Then v.parent.parent = u.parent ≠ ⊥, and prepending v to u's path gives the required path from v to the root.
Message M is sent to all of v's neighbors. Because M is now in transit from v, we need v.parent ≠ ⊥; but we just set it to u, so we are happy.
At the end of the algorithm, every parent pointer is non-⊥ (by the correspondence with seen-message, every process eventually receives M and sets its pointer), so the invariant shows that every process has a path to the root, i.e., that the graph represented by the parent pointers is connected. Since this graph has exactly |V|-1 edges (if we don't count the self-loop at the root), it's a tree.
Though we get a spanning tree at the end, we may not get a very good spanning tree. For example, suppose our friend the adversary picks some Hamiltonian path through the network and delivers messages along this path very quickly while delaying all other messages for the full allowed 1 time unit. Then the resulting spanning tree will have depth |V|-1, which might be much worse than D. If we want the shallowest possible spanning tree, we need to do something more sophisticated: see DistributedBreadthFirstSearch. However, we may be happy with the tree we get from simple flooding: if the message delay on each link is consistent, then it's not hard to prove that we in fact get a shortest-path tree. As a special case, flooding always produces a BFS tree in the synchronous model.
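The adversary's schedule can be reproduced in a small simulation (a hypothetical sketch; the `pick` callback plays the adversary and chooses which in-transit message is delivered next). On a complete graph, which has diameter 1, delivering messages along a Hamiltonian path first yields a spanning tree of depth |V|-1:

```python
def flood_schedule(neighbors, root, pick):
    """Parent-pointer flooding where pick() chooses the next message
    to deliver from the in-transit list (adversarial scheduling sketch)."""
    parent = {root: root}
    in_transit = [(root, v) for v in neighbors[root]]
    while in_transit:
        u, v = in_transit.pop(pick(in_transit))
        if v not in parent:
            parent[v] = u
            in_transit.extend((v, w) for w in neighbors[v])
    return parent

# complete graph on 5 nodes: diameter D = 1
n = 5
K = {u: [v for v in range(n) if v != u] for u in range(n)}

def adversary(msgs):
    # deliver messages along the Hamiltonian path 0-1-2-3-4 first
    for i, (u, v) in enumerate(msgs):
        if v == u + 1:
            return i
    return 0

tree = flood_schedule(K, 0, adversary)
d, v = 0, n - 1                  # measure the depth of the last node on the path
while tree[v] != v:
    v, d = tree[v], d + 1
print(d)  # 4 = |V| - 1, even though D = 1
```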
Note also that while the algorithm works in a directed graph, the parent pointers may not be very useful if links aren't two-way.
3. Termination
See AttiyaWelch Chapter 2 for further modifications that allow the processes to detect termination. In one sense each process can terminate as soon as it is done sending M to all of its neighbors, but this still requires some mechanism for clearing out the inbuf; by adding acknowledgments as described in AttiyaWelch, we can terminate with the assurance that no further messages will be received.
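One way to realize the acknowledgment idea is sketched below (my own simplification in Python, not AttiyaWelch's exact protocol): each process acks its parent once every copy of M it sent has been answered, either by an ack from a new child or by a rejection from a process that had already seen M, so the root's last incoming ack certifies that no messages remain in transit.

```python
from collections import deque

def flood_terminate(neighbors, root):
    """Flooding with acknowledgments (illustrative sketch).

    Messages are pairs ('M', u, v) or ('ack', u, v), meaning a message
    from u in transit to v; delivery order here is FIFO for simplicity.
    """
    parent = {root: root}
    pending = {root: set(neighbors[root])}     # neighbors still owing a reply
    msgs = deque(('M', root, v) for v in neighbors[root])
    terminated = False
    while msgs:
        kind, u, v = msgs.popleft()
        if kind == 'M':
            if v not in parent:
                parent[v] = u
                pending[v] = set(neighbors[v]) - {u}
                msgs.extend(('M', v, w) for w in pending[v])
                if not pending[v]:             # leaf: ack parent immediately
                    msgs.append(('ack', v, u))
            else:
                msgs.append(('ack', v, u))     # already saw M: reject
        else:                                  # an ack from u arrives at v
            pending[v].discard(u)
            if not pending[v]:                 # all of v's sends are answered
                if v == root:
                    terminated = True          # root learns the broadcast is done
                else:
                    msgs.append(('ack', v, parent[v]))
    return terminated, parent

done, tree = flood_terminate({0: [1], 1: [0, 2], 2: [1]}, 0)
print(done, tree)  # True {0: 0, 1: 0, 2: 1}
```

When `flood_terminate` returns with `terminated` true, the message queue is empty, so the root can safely announce completion.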
(1) This sort of extraneous special case is why I personally don't like the AttiyaWelch split between outbuf and inbuf, even though it makes defining the synchronous model easier.

CategoryDistributedComputingNotes