that a packet belonging to one of the old flows succeeds in incrementing it; that is, an atomic write is not required. This is because the switch CP that polls this value only needs it to increase by at least 1 to conclude that old flows are still alive.
Figure 6.3: Throughput during an RU. (a) Throughput of the new flow; (b) throughput of the old flows; (c) sf_r receives Commit OK late. (Each panel plots throughput in Mbit/s against time in s.)
Figure 6.4: A PFC RU on a FatTree network
from h00 to h20 and the reverse flows from h20 to h00 are required to change from the old path to the new path (Figure 6.4). Next, a new flow, NewFlow, is started between the same host pair, for a duration of 120 s. From Figure 6.3a, it can be observed that there is an instantaneous drop in throughput at the beginning ("StartRU") and at the end ("EndRU") of the RU, because rules are inserted and deleted, while during the RU itself there is no change in the throughput of the new flow in spite of the additional processing. This is clear when the throughput of the new flow after the old flow ceases at 60 s is compared with its throughput after the RU ends. During the course of the RU, the switch CP polls the switch DP every 2*Tp = 30 s to decide whether the old flow is complete; at the fourth poll, it finds that live_fl has not incremented, concludes that the old flow no longer exists, and completes the RU. This also illustrates the immediate effectiveness of the RU (section 6.2): OldFlow and NewFlow coexist, using different versions of the same rule.
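The CP-side termination check described above can be sketched as follows. This is a minimal illustration, not the actual controller code; the names read_live_fl and wait_for_old_flows are hypothetical, and the register read is replaced by canned counter values:

```python
def read_live_fl():
    """Stand-in for a control-plane read of the live_fl DP register.

    Returns canned counter values: live_fl increments while old-flow
    packets are still arriving, then stalls once they stop."""
    return next(read_live_fl.samples)

read_live_fl.samples = iter([3, 7, 7])

def wait_for_old_flows():
    """Poll live_fl (every 2*Tp in the real system) until it stops
    incrementing. Any increase by at least 1 between two polls means an
    old flow is still alive; atomic counting is not required."""
    prev = read_live_fl()
    while True:
        cur = read_live_fl()   # next poll, 2*Tp later in the real system
        if cur == prev:        # no old-flow packet since the last poll
            return "RU complete"
        prev = cur
```

With the canned samples above, the second poll sees no increment (7 == 7), so the CP declares the old flows finished, mirroring the fourth-poll behaviour observed in the experiment.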
6.5.2 Experiment 2
This experiment checks the impact of ProFlow on the throughput of flows where every packet needs to be resubmitted at sf. Here, two old flows, OldFlow1 of duration 60 s and OldFlow2 of duration 120 s, are started one after the other, using iperf, from h00 to h20,
and subsequently an RU is started to change the path from old to new (Figure 6.4). The throughput of each flow is observed every 1 s. As shown in Figure 6.3b, there is a drop in the throughput of OldFlow2 during the RU, compared to that of OldFlow1 before the RU started, because sf_f and sf_r (and no other affected switch) need to resubmit each packet, as each packet first matches a new rule.
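The extra cost borne by old flows at sf_f and sf_r can be pictured as one additional pipeline pass per packet. The toy model below is an assumption made for illustration (pipeline_passes is a hypothetical name), not ProFlow's actual data-plane logic:

```python
def pipeline_passes(is_old_flow: bool, ru_in_progress: bool) -> int:
    """Toy cost model: during an RU, a packet of an old flow at sf_f or
    sf_r first matches the new-version rule and is resubmitted once to be
    handled under the old version, so it traverses the pipeline twice.
    All other packets need a single pass."""
    if ru_in_progress and is_old_flow:
        return 2  # match on the new rule, plus one resubmission
    return 1
```

Doubling the pipeline passes for every packet of an old flow is consistent with the throughput drop of OldFlow2 during the RU relative to OldFlow1 before it.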
6.5.3 Experiment 3
Here, sending Commit OK to s2 (the sf_r of NewFlow from h00 to h20) in Figure 6.4 is delayed by 20 s, to verify that in spite of this, if the forward flow follows the new path, the reverse flow from h20 to h00 also follows the new path, and to measure the impact on throughput. At the end of the RU there is an instantaneous drop in throughput;
otherwise, the throughput during and after the RU appears similar, as seen in Figure 6.3c.
In summary, if a flow does not undergo resubmission, there is no observable drop in throughput during the RU in the simulated setup, in spite of the additional processing.
Otherwise, there is a drop in throughput during the RU, compared to the throughput when no update affects the flow.
6.5.4 Experiment 4
The goal of this experiment is to check whether updating live_fl for every packet is feasible, as discussed in section 6.4. 25 flows of duration 60 s are started simultaneously from h00 to h20, using iperf. (If the number of flows is increased further, the high load on the switches slows the flows down and the inter-packet delay increases beyond Tp, causing some flows to start using new rules.) An RU then begins, to change the path of the flows.
The throughput and the difference between the arrival time of the last packet and the completion of the RU are plotted in Figure 6.5b. The experiment is repeated after live_fl is set to be updated for every packet, with the DP polled every Tp s. The throughput when live_fl is updated for every packet (denoted "Throughput:every") is comparable with that when it is updated with a minimum delay of Tp (denoted "Throughput:Tp"), as shown in Figure 6.5b. Also, the difference in time from when the RU ends (denoted
"RU time:every") to when the last packet of any flow arrives (denoted "Difference from last packet:every") lies between Tp and 2*Tp. (It may also be observed that the
Figure 6.5: Frequency of updating live_fl. (a) Time to wait after the last packet; (b) updating live_fl for every packet, 25 flows. (Panel (b) plots, per trial, the sum of throughputs in Mbit/s, the RU time in s, and the difference from the last packet, for both the "every" and "Tp" update policies.)
same quantity varies between Tp and 4*Tp when live_fl is updated with a minimum delay
of Tp s.) Thus updating live_fl for every packet is a practical solution, as explained in section 6.4. While the version of P4 used in the implementation automatically assumes that a register operation is atomic, P4_16 [31] supports specifying this explicitly. When P4_16 is used, atomicity may be left unspecified for live_fl, as explained in section 6.4, to further improve throughput.
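The observed bound, between Tp and 2*Tp, follows from the polling arithmetic and can be checked with a small calculation. The helper below is hypothetical (ru_completion_delay is not part of the implementation) and assumes polls at t = 0, Tp, 2*Tp, ...:

```python
import math

def ru_completion_delay(last_packet_t: float, tp: float) -> float:
    """Delay from the last old-flow packet to RU completion, assuming
    live_fl is updated on every packet and the DP is polled every tp s.

    The first poll strictly after the last packet still sees that packet's
    increment relative to the previous poll, so the flow is declared dead
    only at the poll after that."""
    first_poll_after = (math.floor(last_packet_t / tp) + 1) * tp
    return first_poll_after + tp - last_packet_t

# For any last-packet time, the delay lies in the interval (tp, 2*tp]:
# ru_completion_delay(37.0, 15.0) -> 23.0
```

The minimum delay (just above Tp) occurs when the last packet arrives just before a poll, and the maximum (2*Tp) when it arrives just after one, matching the measured range in Figure 6.5b.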