For more up-to-date notes see http://www.cs.yale.edu/homes/aspnes/classes/465/notes.pdf.
1. Old notes on synchronous BFS algorithms
These are based on LynchBook Section 4.2.
1.1. Easy case: network is undirected
Here for every edge uv there is a reverse edge vu. So we can send acknowledgments.
State of the algorithm qi consists of
parent pointer qi.parent, initially nil.
- list of children, initially empty
- active, initially false
Initiator has same algorithm as other nodes, and same state except that its parent pointer starts out holding its own ID and its active field starts out true.
Send function:
- if q.active
if q.parent <> myID
- send child(myID) to q.parent
- send recruit(myID)
Update function:
- if q.parent = nil and there is at least one recruit message in the buffer
- pick some msg recruit(x) q.parent := x q.active := true
- q.active := nil
- Running time is max(d(root, u)) + 1, since we need to wait for the wave of recruit messages to get all the way out and the last set of child responses to come back. Total messages is |E| + |V| - 1 = O(|E|).
- Proof of correctness: show by induction that after r rounds, first r levels of tree have correct parent pointers, first r-1 levels have correct child lists, r-th level is the only one active. Pretty much a standard "and then this happens" argument.
1.1.1. Application: broadcast
If we rip out the child messages and parent pointers, we get a protocol for broadcast. Formally, the states look like (msg, state) where state is one of (waiting, active, done). Initial state of initiator is (msg, active), everybody else is (null, waiting).
Send function is
- if q.active
- send q.msg to everybody
Update function is
- if q.waiting and we received msg
- q.msg := msg q.state := active
- q.state := done
To prove correctness, adapt BFS proof. Note that since we aren't sending back child acknowledgments, we don't need to assume an undirected network.
If we are broadcasting many messages, it may be more efficient to build a BFS tree first and then use a modified broadcast algorithm that only sends to children in the BFS tree (this gets the number of messages down from |E|+|V|-1 to just |V|-1). Note that we can also start a new broadcast each round without overloading the network, since the wave of messages carrying broadcast k+1 is always one step behind broadcast k.
1.2. Hard case: directed network
For a directed network, we can't send child messages back to the parent. So instead we use broadcast on every child message to spam it to the entire system (with an extra To field to identify the intended recipient). This requires substantial expansion in the state space and the message length, since we have to track every message we've ever seen to know whether to forward a new copy. However, the time cost doesn't go up by much: since both the initial propagation of the recruit messages and the subsequent child broadcasts take rounds linear in the diameter, the total is just 2*diameter in the worst case.
1.2.1. Termination testing
One problem with a simple modification of the BFS algorithm using broadcast child messages is that a node can't tell if it has any unacknowledged children out there it hasn't heard about yet. So an additional modification is to have each node broadcast a no-child message for each rejected potential parent. A node can then declare itself done precisely when it's received either child or no-child from every neighbor it sent recruit to.
This doesn't quite allow the initiator to detect when when the tree is done being built, since it may be that its own children can acknowledge it fairly quickly. But this can be handled by adding a neighbor count to the child messages (since the initiator gets spammed with all child messages just like anybody else); when the total number of child or no-child messages received equals the total neighbor count (including the initiator's own neighbor count), the initiator can conclude that all recruit messages have been processed and acknowledged.
1.2.2. Application: leader election
We deferred the question of doing LeaderElection in a directed graph. Here's a simple algorithm: treat each node as the initiator of a separate BFS protocol, where a node only participates in protocols initiated by nodes with IDs ≥ the max ID it has seen. The only node that finishes its BFS protocol wins. Time is O(diameter), message complexity is O(E*diameter). Message size is about the same as BFS---a given node is effectively only participating in one BFS protocol in each round, so there is not much blow-up from the parallel protocols. (It's still bad---concentrating cascades of child and no-child messages could easily yield messages of size O(E log V) assuming node IDs are O(log V).)