5.6 Experimental Set Up and Results
5.6.2 Experimental Results
5.6.2.1 Benchmark Program Results
To compare our results against those of MaxMin-M and HEART, we have measured ARat and NPow for different values of the utilization factor (UF). A detailed analysis of our experimental results is presented below.
Experiment 1 - Effect on Acceptance Ratio (ARat): In this experiment, we have varied UF between 0.5 and 1.0. We may observe from Figure 5.2a that at lower UF values, up to about 0.6, MaxMin-M, HEART and HEALERS exhibit similar performance. However, HEALERS and HEART progressively outperform MaxMin-M as UF increases beyond 0.6. This may be attributed to the non-migrative execution policy of MaxMin-M within time-slices. At lower UF values (below 0.6), the probability that MaxMin-M can successfully allocate the entire execution share of every task onto a single processing core is very high. At higher UF values, however, the possibility of allocating every task onto a core without causing any migration becomes lower, and MaxMin-M therefore performs worse than HEALERS and HEART. On the other hand, HEART and HEALERS use similar task allocation strategies and hence achieve similar results in this experiment. In particular, ARat reduces from 100% to 21%, 100% to 26% and 100% to 26% for MaxMin-M, HEART and HEALERS, respectively.
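For clarity, the ARat metric used throughout these experiments is simply the percentage of generated task-sets that an algorithm schedules without any deadline miss. A minimal Python sketch of this bookkeeping is shown below; is_schedulable is a hypothetical stand-in for running MaxMin-M, HEART or HEALERS on one task-set.

```python
def acceptance_ratio(task_sets, is_schedulable):
    """Percentage of task-sets admitted without a deadline miss (ARat).

    is_schedulable is a hypothetical predicate that simulates the
    scheduling algorithm under test on one task-set and reports whether
    all deadlines were met.
    """
    accepted = sum(1 for ts in task_sets if is_schedulable(ts))
    return 100.0 * accepted / len(task_sets)
```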
Figure 5.2: Result Comparison for Benchmark programs with varying UF (m = 4 and n = 30). (a) Effect on Acceptance Ratio ARat; (b) Effect on Normalized Power Consumption NPow.
Experiment 2 - Effect on Normalized Power Consumption (NPow): As the value of ARat for MaxMin-M is significantly lower than that of HEART and HEALERS,
only those task-sets which have been successfully completed by all the algorithms have been considered in this experiment. We may observe from Figure 5.2b that NPow is directly proportional to the UF of the system. This is because the residual capacity in the system decreases as the UF of the task-set increases, thereby reducing the scope for lowering core operating frequencies and leading to higher power consumption. As discussed earlier, MaxMin-M sequentially allocates tasks to their most favoured processing cores in the order of their EDdiff values. In comparison, HEART always runs the processing cores at the next available frequency in the frequency set. Finally, HEALERS tries to directly allocate tasks to their most preferred processing cores in the order of increasing execution requirements and runs individual cores at a combination of two frequencies (⌊fopt⌋ and ⌈fopt⌉) within a time-slice. This allows HEALERS to perform slightly better than MaxMin-M and HEART in most scenarios where the system has some spare capacity, as most tasks get assigned to cores where their energy efficiencies are comparatively high. In particular, the improvement in energy savings for HEALERS over MaxMin-M and HEART varies from 5.81% to 16.92% and from 3.77% to 6.89%, respectively, as UF varies from 1.0 to 0.5.
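The two-frequency operation of HEALERS mentioned above follows the usual trick of emulating an optimal frequency that lies between two discrete levels: a core spends part of the time-slice at ⌊fopt⌋ and the remainder at ⌈fopt⌉ so that the delivered cycles match fopt on average. The sketch below illustrates only this idea with normalized frequencies; it is not the exact allocation procedure of HEALERS.

```python
def split_time_slice(f_opt, freq_levels, slice_len):
    """Divide a time-slice between the two discrete frequency levels that
    bracket f_opt so that the delivered cycles equal f_opt * slice_len.
    freq_levels is assumed to contain the available (normalized) levels."""
    f_lo = max(f for f in freq_levels if f <= f_opt)
    f_hi = min(f for f in freq_levels if f >= f_opt)
    if f_lo == f_hi:                       # f_opt is itself a discrete level
        return (f_lo, slice_len), (f_hi, 0.0)
    # Solve t_lo*f_lo + (slice_len - t_lo)*f_hi = f_opt*slice_len for t_lo
    t_lo = slice_len * (f_hi - f_opt) / (f_hi - f_lo)
    return (f_lo, t_lo), (f_hi, slice_len - t_lo)
```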
5.6.2.2 Synthetic Task Set Results
We have measured and compared the ARat, NPow and CSC values of HEALERS, HEART and MaxMin-M for different numbers of tasks (n) and Utilization Factors (UF), with the number of processing cores (m) kept constant at 4. A detailed analysis of our experimental results is discussed below.
Experiment 3 - Effect on Acceptance Ratio (ARat): In this experiment, we have varied the number of tasks (n) between 10 and 50, while UF is kept constant at 0.9. We may observe from Figure 5.3a that there is a significant difference in the ARat values of HEALERS, HEART and MaxMin-M at UF = 0.9. We can also observe that an increase in the number of tasks results in an increased acceptance ratio for all the algorithms. As the utilization factor and the number of processing cores remain constant, an increase in the number of tasks leads to reduced individual task weights, which in turn leads to fewer migrating tasks in the system. Hence, all the algorithms perform better as the number of tasks increases. As stated earlier, HEALERS and HEART use a similar task-allocation strategy and hence have similar ARat values in this experiment. In particular, the acceptance ratios ARat were found to be 85%, 87%, 90%, 94% and 96% for HEALERS (and HEART) and 25%, 32%, 37%, 46% and 57% for MaxMin-M.
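The argument about task weights can be checked with a small back-of-the-envelope computation: with m and UF fixed, the average per-task utilization is UF·m/n, so it shrinks as n grows. The values below are illustrative only.

```python
m, UF = 4, 0.9
for n in (10, 30, 50):
    avg_weight = (UF * m) / n            # average per-task utilization
    print(f"n={n}: average task weight ~ {avg_weight:.3f}")
# n=10 -> 0.360, n=30 -> 0.120, n=50 -> 0.072: lighter tasks pack onto
# cores more easily, so fewer tasks need to migrate within a time-slice.
```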
Figure 5.3: Result Comparison for Synthetic tasks. (a) Effect on Acceptance Ratio ARat (m = 4 and UF = 0.9); (b) Effect on Normalized Power Consumption NPow (n = 40 and m = 4); (c) Effect on Cost of Context Switch CSC (m = 4 and UF = 0.3).
Experiment 4 - Effect of the critical frequency: In this experiment, we have varied the UF values between 0.4 and 0.1, while the number of tasks n is kept constant at 30. We may observe from Figure 5.3b that the NPow values decrease for all the algorithms with a reduction in UF. As stated earlier (in Experiment 2), the algorithms are able to achieve higher savings at lower UF values because of the availability of more residual capacity. Further, we may observe that the improvement in energy savings for HEALERS
over HEART increases as UF is reduced from 0.4 to 0.1. This is due to the fact that HEART employs only DVFS to reduce energy consumption and is oblivious of the static power dissipated in the system. In the absence of DPM, as the required operating frequency falls further below the critical frequency (which is 0.41 in our case), the relative static power wastage in the system becomes higher. Hence, HEALERS is able to perform better when the optimal operating frequency is lower than the critical frequency. On the other hand, MaxMin-M also deploys a DVFS-cum-DPM based heuristic strategy and hence is able to perform better than HEART for small values of UF. However, HEALERS outperforms even MaxMin-M with its finely tuned heuristic approach. In particular, the improvement in energy savings for HEALERS over MaxMin-M and HEART varies from 22.22% to 42.85% and from 12.5% to 55.55%, respectively, as UF varies from 0.4 to 0.1.
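The role of the critical frequency can be illustrated with a simple power model of the form P(f) = Ps + a·f³, where Ps is the static power. The constants below are illustrative assumptions, not the actual model parameters of this thesis (which give a critical frequency of 0.41).

```python
def energy_per_cycle(f, p_static=0.1, a=1.0):
    """Energy per unit of work at normalized frequency f under the
    illustrative power model P(f) = p_static + a * f**3."""
    return (p_static + a * f ** 3) / f

# Energy per cycle is minimized at f_crit = (p_static / (2*a)) ** (1/3).
# Below f_crit, running slower only prolongs static power dissipation, so
# a pure-DVFS scheme (HEART) keeps paying that cost, whereas a
# DVFS-cum-DPM scheme can run at or above f_crit and power the core down
# during the reclaimed idle time.
f_crit = (0.1 / (2 * 1.0)) ** (1 / 3)    # ~0.37 with these sample constants
```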
Experiment 5 - Cost of Context Switch (CSC): In our experiments, we have assumed the delay corresponding to a single context switch to be 5.24 µs [29], which is the actual average context switch overhead on a 24-core Intel Xeon L7455 system under typical workloads [29]. We computed the total number of context switches in the system over the entire simulation duration. Then, we obtained the total delay due to context switches by multiplying the delay caused by a single context switch (5.24 µs [29]) by the total number of context switches. Next, we calculated the average context switch overhead (in µs) per core per time slot for the HEALERS, HEART and MaxMin-M algorithms. As observed from Figure 5.3c, HEALERS and HEART incur similar numbers of context switches, but MaxMin-M incurs slightly fewer context switches than the other algorithms. This is because MaxMin-M uses a non-migrative scheduling strategy within time-slices, whereas HEALERS and HEART are fully-migrative strategies. The higher ARat values achieved by HEALERS and HEART are essentially effected by appropriately switching task executions on cores (using preemptions and migrations) so that tasks can be scheduled efficiently in the system. To include the cost of context switches in the overall power consumption, the operating frequencies of the individual cores were adjusted accordingly. For example, from Figure 5.3c we observe that for 30 tasks, HEALERS suffers a context switch overhead of ∼0.869 µs while MaxMin-M suffers a context switch overhead of only ∼0.752 µs. So, HEALERS incurs an extra overhead of 0.117 µs per core per task per time slot with respect to MaxMin-M. Given a time slot size of 1 ms, HEALERS will therefore be able to complete only as much work in 8548 ms as MaxMin-M completes in (1 ms / 0.117 µs ≈) 8547 ms. To overcome this additional context switching overhead, HEALERS must execute at 8548/8547 times the operating frequency of MaxMin-M. As all our experiments have been conducted assuming only 11 available discrete frequency levels (refer Table 5.2), this additional overhead in terms of calculated frequency very rarely translates into an increase in the discrete frequency level of a core. Hence, the cost of context switching did not have any significant effect on power consumption in our experiments.
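The frequency adjustment described above amounts to the following back-of-the-envelope arithmetic, reproduced here with the per-slot overheads read from Figure 5.3c for the 30-task case.

```python
SLOT_LEN = 1e-3                          # time slot size: 1 ms
overhead_healers = 0.869e-6              # per core per slot (Figure 5.3c)
overhead_maxmin_m = 0.752e-6

extra = overhead_healers - overhead_maxmin_m        # ~0.117 us per slot
slots_to_lose_one_slot = SLOT_LEN / extra            # ~8547 slots
freq_scale = (slots_to_lose_one_slot + 1) / slots_to_lose_one_slot
# freq_scale ~ 8548/8547 ~ 1.000117, which is far smaller than the step
# between the 11 discrete frequency levels of Table 5.2, so the adjustment
# almost never raises a core's selected frequency level.
```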