Problem: Not All Networks are Directly Connected
3.2 Switched Ethernet
3.2.3 Spanning Tree Algorithm
The preceding strategy works just fine until the network has a loop in it, in which case it fails in a horrible way: frames potentially get forwarded forever. This is easy to see in the example depicted in Figure 3.10, where switches S1, S4, and S6 form a loop.
Figure 3.10.: Switched Ethernet with loops.
Note that we are now making the shift from calling each forwarding device a bridge (connecting segments that might reach multiple other devices) to instead calling them L2 switches (connecting point-to-point links that reach just one other device). To keep the example manageable, we include just three hosts.
In practice, switches typically have 16, 24, or 48 ports, meaning they are able to connect to that many hosts (and other switches).
In our example switched network, suppose that a packet enters switch S4 from Host C and that the destination address is one not yet in any switch’s forwarding table. S4 sends a copy of the packet out its two other ports: to switches S1 and S6. Switch S6 forwards the packet on to S1 (and meanwhile, S1 forwards the packet on to S6), both of which in turn forward their packets back to S4. Switch S4 still doesn’t have this destination in its table, so it forwards the packet out its two other ports. There is nothing to stop this cycle from repeating endlessly, with packets looping in both directions among S1, S4, and S6.
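The looping behavior just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it models just the three switches in the loop, ignores host ports, assumes no switch ever learns the destination, and advances all frames in lockstep. The names `flood_step`, `links`, and `in_flight` are hypothetical.

```python
def flood_step(links, in_flight):
    """Advance every frame one hop: each receiving switch floods the frame
    out all of its switch-facing ports except the one it arrived on."""
    next_frames = []
    for sender, receiver in in_flight:
        for neighbor in links[receiver]:
            if neighbor != sender:
                next_frames.append((receiver, neighbor))
    return next_frames

# Only the loop S1-S4-S6 is modelled (an assumption for this sketch).
links = {"S1": ["S4", "S6"], "S4": ["S1", "S6"], "S6": ["S1", "S4"]}

# S4 floods the frame from Host C toward S1 and S6.
in_flight = [("S4", "S1"), ("S4", "S6")]
history = []
for step in range(7):
    history.append(sorted(in_flight))
    in_flight = flood_step(links, in_flight)

for step, frames in enumerate(history):
    print(step, frames)
```

In this sketch the two copies never die out: one circulates in each direction, and after three hops both are back at S4 being flooded again.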
Why would a switched Ethernet (or extended LAN) come to have a loop in it? One possibility is that the network is managed by more than one administrator, for example, because it spans multiple departments in an organization. In such a setting, it is possible that no single person knows the entire configuration of the network, meaning that a switch that closes a loop might be added without anyone knowing. A second, more likely scenario is that loops are built into the network on purpose—to provide redundancy in case of failure.
After all, a network with no loops needs only one link failure to become split into two separate partitions.
Whatever the cause, switches must be able to correctly handle loops. This problem is addressed by having the switches run a distributed spanning tree algorithm. If you think of the network as being represented by a graph that possibly has loops (cycles), then a spanning tree is a subgraph of this graph that covers (spans) all the vertices but contains no cycles. That is, a spanning tree keeps all of the vertices of the original graph but throws out some of the edges. For example, Figure 3.11 shows a cyclic graph on the left and one of possibly many spanning trees on the right.
Figure 3.11.: Example of (a) a cyclic graph; (b) a corresponding spanning tree.
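To make the graph definition concrete, the following sketch extracts one spanning tree from a small cyclic graph using breadth-first search. The four-vertex graph is a made-up example, not the graph of Figure 3.11, and breadth-first search is just one convenient way to pick a spanning tree; it is not the distributed algorithm the switches run.

```python
from collections import deque

def spanning_tree(adjacency, start):
    """Return the edges of one spanning tree, found by breadth-first search."""
    visited = {start}
    queue = deque([start])
    tree_edges = []
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in visited:      # first time we reach v: keep this edge
                visited.add(v)
                tree_edges.append((u, v))
                queue.append(v)
    return tree_edges

# Hypothetical cyclic graph with 4 vertices and 5 edges.
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
tree = spanning_tree(adjacency, "A")
print(tree)
```

The result keeps all four vertices but only three of the five edges, which is exactly what removing the cycles requires: a tree on n vertices has n - 1 edges.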
The idea of a spanning tree is simple enough: It’s a subset of the actual network topology that has no loops and that reaches all the devices in the network. The hard part is how all of the switches coordinate their decisions to arrive at a single view of the spanning tree. After all, one topology is typically able to be covered by multiple spanning trees. The answer lies in the spanning tree protocol, which we’ll describe now.
The spanning tree algorithm, which was developed by Radia Perlman, then at the Digital Equipment Corporation, is a protocol used by a set of switches to agree upon a spanning tree for a particular network. (The IEEE 802.1 specification is based on this algorithm.) In practice, this means that each switch decides the ports over which it is and is not willing to forward frames. In a sense, it is by removing ports from the topology that the network is reduced to an acyclic tree. It is even possible that an entire switch will not participate in forwarding frames, which seems kind of strange at first glance. The algorithm is dynamic, however, meaning that the switches are always prepared to reconfigure themselves into a new spanning tree should some switch fail, and so those unused ports and switches provide the redundant capacity needed to recover from failures.
120 Chapter 3. Internetworking
Computer Networks: A Systems Approach, Release Version 6.1
The main idea of the spanning tree is for the switches to select the ports over which they will forward frames.
The algorithm selects ports as follows. Each switch has a unique identifier; for our purposes, we use the labels S1, S2, S3, and so on. The algorithm first elects the switch with the smallest ID as the root of the spanning tree; exactly how this election takes place is described below. The root switch always forwards frames out over all of its ports. Next, each switch computes the shortest path to the root and notes which of its ports is on this path. This port is also selected as the switch’s preferred path to the root. Finally, to account for the possibility that there could be another switch connected to its ports, the switches sharing a link elect a single designated switch that will be responsible for forwarding frames toward the root. Each designated switch is the one that is closest to the root. If two or more switches are equally close to the root, then the switches’ identifiers are used to break ties, and the smallest ID wins. Of course, each switch might be connected to more than one other switch, so it participates in the election of a designated switch for each such port. In effect, this means that each switch decides if it is the designated switch relative to each of its ports. The switch forwards frames over those ports for which it is the designated switch.
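The selection rules above can be sketched as a centralized computation, which is useful for checking one's understanding even though the real switches never see the whole topology. The link list below is an assumption: it is chosen to be consistent with the relationships the text mentions (S1, S4, and S6 form a loop; S3 connects to S2 and S5; S5 connects to S7), but it is not guaranteed to match Figure 3.10. Switch IDs are plain integers, and `select_ports` is a hypothetical helper name.

```python
from collections import deque

def select_ports(links):
    """Apply the port-selection rules with global knowledge of the topology."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    root = min(neighbors)                      # smallest ID becomes the root

    dist = {root: 0}                           # hop count to the root (BFS)
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    # Preferred path to the root: the neighbor closest to the root,
    # with ties broken by the smaller ID.
    root_port = {s: min(neighbors[s], key=lambda n: (dist[n], n))
                 for s in neighbors if s != root}

    # Designated switch per link: the endpoint closer to the root, ties by ID.
    designated = {frozenset(link): min(link, key=lambda s: (dist[s], s))
                  for link in links}

    # A point-to-point link carries data only if it is some switch's root port.
    active = {frozenset((s, n)) for s, n in root_port.items()}
    blocked = sorted(tuple(sorted(l)) for l in designated if l not in active)
    return root, dist, root_port, designated, blocked

# Assumed topology (switch numbers only), not necessarily that of Figure 3.10.
LINKS = [(1, 2), (1, 4), (1, 5), (1, 6), (1, 7), (2, 3), (3, 5), (4, 6), (5, 7)]
root, dist, root_port, designated, blocked = select_ports(LINKS)
print("root:", root)
print("blocked links:", blocked)
```

Under this assumed topology, the three links that would close a loop (S3 to S5, S4 to S6, and S5 to S7) end up outside the tree, leaving six links for seven switches.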
Figure 3.12.: Spanning tree with some ports not selected.
Figure 3.12 shows the spanning tree that corresponds to the network shown in Figure 3.10. In this example, S1 is the root, since it has the smallest ID. Notice that S3 and S5 are connected to each other, but S5 is the designated switch since it is closer to the root. Similarly, S5 and S7 are connected to each other, but in this case S5 is the designated switch since it has the smaller ID; both are an equal distance from S1.
While it is possible for a human to look at the network given in Figure 3.10 and to compute the spanning tree given in Figure 3.12 according to the rules given above, the switches do not have the luxury of being able to see the topology of the entire network, let alone peek inside other switches to see their ID. Instead, they have to exchange configuration messages with each other and then decide whether or not they are the root or a designated switch based on these messages.
Specifically, the configuration messages contain three pieces of information:
1. The ID for the switch that is sending the message.
2. The ID for what the sending switch believes to be the root switch.
3. The distance, measured in hops, from the sending switch to the root switch.
Each switch records the current best configuration message it has seen on each of its ports (“best” is defined below), including both messages it has received from other switches and messages that it has itself transmitted.
Initially, each switch thinks it is the root, and so it sends a configuration message out on each of its ports identifying itself as the root and giving a distance to the root of 0. Upon receiving a configuration message over a particular port, the switch checks to see if that new message is better than the current best configuration message recorded for that port. The new configuration message is considered better than the currently recorded information if any of the following is true:
• It identifies a root with a smaller ID.
• It identifies a root with an equal ID but with a shorter distance.
• The root ID and distance are equal, but the sending switch has a smaller ID.
If the new message is better than the currently recorded information, the switch discards the old information and saves the new information. However, it first adds 1 to the distance-to-root field since the switch is one hop farther away from the root than the switch that sent the message.
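The three-part "better" rule amounts to a lexicographic comparison on (root ID, distance, sender ID). A minimal sketch, assuming integer switch IDs and using the hypothetical names `ConfigMessage`, `is_better`, and `record`, relies on the fact that Python compares tuples field by field in exactly that order:

```python
from typing import NamedTuple

class ConfigMessage(NamedTuple):
    root: int      # ID of the switch the sender believes is the root
    distance: int  # hops from the sender to that root
    sender: int    # ID of the sending switch

def is_better(new: ConfigMessage, current: ConfigMessage) -> bool:
    """Smaller root wins; then shorter distance; then smaller sender ID."""
    return tuple(new) < tuple(current)

def record(new: ConfigMessage, current: ConfigMessage) -> ConfigMessage:
    """Keep the better message for a port, adding one hop when the new
    message replaces the old one (as the text describes)."""
    if is_better(new, current):
        return new._replace(distance=new.distance + 1)
    return current

own = ConfigMessage(root=3, distance=0, sender=3)      # S3 believes it is root
saved = record(ConfigMessage(root=2, distance=0, sender=2), own)
print(saved)
```

Field order matters here: putting `root` first and `sender` last is what makes plain tuple comparison match the three bullets.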
When a switch receives a configuration message indicating that it is not the root (that is, a message from a switch with a smaller ID), the switch stops generating configuration messages on its own and instead only forwards configuration messages from other switches, after first adding 1 to the distance field. Likewise, when a switch receives a configuration message that indicates it is not the designated switch for that port (that is, a message from a switch that is closer to the root or equally far from the root but with a smaller ID), the switch stops sending configuration messages over that port. Thus, when the system stabilizes, only the root switch is still generating configuration messages, and the other switches are forwarding these messages only over ports for which they are the designated switch. At this point, a spanning tree has been built, and all the switches are in agreement on which ports are in use for the spanning tree. Only those ports may be used for forwarding data packets.
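The message exchange can be approximated with a small round-based simulation. This is a sketch under several assumptions: the topology is a guess consistent with the text rather than a copy of Figure 3.10, switch IDs are integers, all switches act in synchronized rounds, and every switch keeps announcing its current view on every port (the suppression of messages on non-designated ports is not modelled, since it does not change the outcome here). The names `run_protocol`, `root_port`, and `is_designated` are hypothetical.

```python
def run_protocol(neighbors, rounds):
    """Each round, every switch announces (root, distance, self) to all of
    its neighbors and adopts the best announcement it hears, plus one hop."""
    view = {s: (s, 0) for s in neighbors}   # initially everyone claims root
    heard = {s: {} for s in neighbors}      # heard[s][n]: latest message from n
    for _ in range(rounds):
        sent = {s: (view[s][0], view[s][1], s) for s in neighbors}
        for s in neighbors:
            for n in neighbors[s]:
                heard[s][n] = sent[n]
            root, distance, _sender = min(heard[s].values())
            if (root, distance + 1) < view[s]:
                view[s] = (root, distance + 1)
    return view, heard

# Assumed topology (switch numbers only), not necessarily that of Figure 3.10.
LINKS = [(1, 2), (1, 4), (1, 5), (1, 6), (1, 7), (2, 3), (3, 5), (4, 6), (5, 7)]
neighbors = {}
for a, b in LINKS:
    neighbors.setdefault(a, []).append(b)
    neighbors.setdefault(b, []).append(a)

view, heard = run_protocol(neighbors, rounds=len(neighbors))

def root_port(s):
    """Neighbor from which s heard the best message."""
    return min(heard[s], key=heard[s].get)

def is_designated(s, n):
    """True if s's own announcement beats what it heard from n on that link."""
    return (view[s][0], view[s][1], s) < heard[s][n]

print(view)
print("S3 root port leads to S%d" % root_port(3))
```

With this topology every switch settles on switch 1 as root within a couple of rounds; running for as many rounds as there are switches is simply a safe upper bound.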
Let’s see how this works with an example. Consider what would happen in Figure 3.12 if the power had just been restored to a campus, so that all the switches boot at about the same time. All the switches would start off by claiming to be the root. We denote a configuration message from node X in which it claims to be distance d from root node Y as (Y,d,X). Focusing on the activity at S3, a sequence of events would unfold as follows:
1. S3 receives (S2, 0, S2).
2. Since 2 < 3, S3 accepts S2 as root.
3. S3 adds one to the distance advertised by S2 (0) and thus sends (S2, 1, S3) toward S5.
4. Meanwhile, S2 accepts S1 as root because it has the lower ID, and it sends (S1, 1, S2) toward S3.
5. S5 accepts S1 as root and sends (S1, 1, S5) toward S3.
6. S3 accepts S1 as root, and it notes that both S2 and S5 are closer to the root than it is, but S2 has the smaller ID, so it remains on S3’s path to the root.
This leaves S3 with active ports as shown in Figure 3.12. Note that Hosts A and B are not able to communicate over the shortest path (via S5) because frames have to “flow up the tree and back down,” but that’s the price you pay to avoid loops.
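The sequence of events at S3 can be replayed in code by feeding it the three messages from the walkthrough, written in the same (Y,d,X) notation. This is a sketch with integer switch IDs; the function name `replay_at_switch` and the port labels `"to S2"` and `"to S5"` are hypothetical.

```python
def replay_at_switch(own_id, arrivals):
    """Process (port, message) pairs in order; messages are (root, d, sender)."""
    best_on_port = {}
    view = (own_id, 0)                 # (root, distance) this switch believes
    announcements = []                 # messages this switch would send on
    for port, message in arrivals:
        if port not in best_on_port or message < best_on_port[port]:
            best_on_port[port] = message
        root, distance, _sender = min(best_on_port.values())
        if (root, distance + 1) < view:
            view = (root, distance + 1)
            announcements.append((view[0], view[1], own_id))
    # The root port is the one holding the best recorded message.
    root_port = min(best_on_port, key=best_on_port.get)
    return view, root_port, announcements

arrivals = [
    ("to S2", (2, 0, 2)),   # step 1: S2 still claims to be root
    ("to S2", (1, 1, 2)),   # step 4: S2 has accepted S1 as root
    ("to S5", (1, 1, 5)),   # step 5: S5 has accepted S1 as root
]
view, root_port, announcements = replay_at_switch(3, arrivals)
print(view, root_port, announcements)
```

The first announcement, (2, 1, 3), matches step 3 of the walkthrough, and the final root port points toward S2 because (1, 1, 2) beats (1, 1, 5) on the sender-ID tiebreak.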
Even after the system has stabilized, the root switch continues to send configuration messages periodically, and the other switches continue to forward these messages as just described. Should a particular switch fail, the downstream switches will not receive these configuration messages, and after waiting a specified period of time they will once again claim to be the root, and the algorithm will kick in again to elect a new root and new designated switches.
One important thing to notice is that although the algorithm is able to reconfigure the spanning tree whenever a switch fails, it is not able to forward frames over alternative paths for the sake of routing around a congested switch.