
Let us now solve the problem presented in Section 5.2. According to HEALERS (Algorithm 15), we first use deadline partitioning to compute the current time-slice. Since all tasks are ready for execution, the first time-slice is TS1 = [0,10).

5.5 An Illustrative Example

Figure 5.1: An example to illustrate our proposed algorithm HEALERS. (a) Schedule for non-migrating tasks; (b) schedule for all tasks; (c) final energy-aware schedule for all tasks.

In order to compute the schedule matrix SM1 at TS1, HEALERS calls Algorithm 16. First, it creates the lists L1 and L2 and initializes them to ∅. Then, it computes the required shares for each task Ti ∈ T at TS1. The computed shares are as follows:

sh[7×4]   sh_{i,1,1}   sh_{i,2,1}   sh_{i,3,1}   sh_{i,4,1}
T1        5            13           8            12
T2        8            7            10           11
T3        11           8            4            9
T4        9            9            8            1
T5        9            11           11           14
T6        9            6            1            8
T7        12           4            7            2

Now, Algorithm 16 sorts L1 in non-decreasing order of the computed shares. The sorted list L1 at time-slice TS1, with elements of the form ⟨i, j, sh_{i,j,1}⟩, is: {⟨4,4,1⟩, ⟨6,3,1⟩, ⟨7,4,2⟩, ⟨3,3,4⟩, ⟨7,2,4⟩, ⟨1,1,5⟩, ⟨6,2,6⟩, ⟨2,2,7⟩, ⟨7,3,7⟩, ⟨1,3,8⟩, ⟨2,1,8⟩, ⟨3,2,8⟩, ⟨4,3,8⟩, ⟨6,4,8⟩, ⟨3,4,9⟩, ⟨4,1,9⟩, ⟨4,2,9⟩, ⟨5,1,9⟩, ⟨6,1,9⟩, ⟨2,3,10⟩, ⟨2,4,11⟩, ⟨3,1,11⟩, ⟨5,2,11⟩, ⟨5,3,11⟩, ⟨1,4,12⟩, ⟨7,1,12⟩, ⟨1,2,13⟩, ⟨5,4,14⟩}. Then, Algorithm 16 computes the schedule for non-migrating tasks.
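The construction and sorting of L1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's Algorithm 16: the names `SHARES` and `build_sorted_list` are hypothetical, and the tie-breaking rule (smaller task index first, then smaller core index) is an assumption inferred from the order of the list in this example.

```python
# Required shares at time-slice TS1, copied from the share table above
# (rows: tasks T1..T7, columns: cores V1..V4).
SHARES = [
    [5, 13, 8, 12],
    [8, 7, 10, 11],
    [11, 8, 4, 9],
    [9, 9, 8, 1],
    [9, 11, 11, 14],
    [9, 6, 1, 8],
    [12, 4, 7, 2],
]


def build_sorted_list(shares):
    """Return 1-indexed triples (task, core, share) in non-decreasing share order."""
    triples = [
        (i + 1, j + 1, share)
        for i, row in enumerate(shares)
        for j, share in enumerate(row)
    ]
    # Assumed tie-break: by task index, then by core index.
    return sorted(triples, key=lambda t: (t[2], t[0], t[1]))


L1 = build_sorted_list(SHARES)
print(L1[:6])
# [(4, 4, 1), (6, 3, 1), (7, 4, 2), (3, 3, 4), (7, 2, 4), (1, 1, 5)]
```

With 7 tasks and 4 cores the list holds 28 triples, one per task-core pair.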

Scheduling Non-migrating Tasks: Algorithm 17 extracts the first element from the list L1, i.e., ⟨4,4,1⟩ (task T4 requires a share of 1 unit on V4). T4 can be fully allocated on V4 and hence the schedule matrix entry SM1[4][4] is updated to ⟨0,1⟩. Since T4 has been completely scheduled on V4, all remaining entries of T4, i.e., ⟨4,3,8⟩, ⟨4,1,9⟩ and ⟨4,2,9⟩, are removed from L1. The process is then repeated by extracting the new first element of L1, i.e., ⟨6,3,1⟩. Since V3 can fully accommodate T6, it is scheduled on V3 with SM1[6][3] = ⟨0,1⟩. Similarly, T7 is scheduled on V4 (according to ⟨7,4,2⟩), T3 on V3 (according to ⟨3,3,4⟩), T1 on V1 (according to ⟨1,1,5⟩) and T2 on V2 (according to ⟨2,2,7⟩). Now, ⟨5,1,9⟩ becomes the first element of L1: if T5 is allocated on V1, it requires a share of 9 units. However, V1 has only 5 units of unused capacity left after T1 and cannot accommodate T5, so the element ⟨5,1,9⟩ is inserted into L2. Similarly, the other three elements corresponding to T5, i.e., ⟨5,2,11⟩, ⟨5,3,11⟩ and ⟨5,4,14⟩, are moved to L2. Since L1 is now empty, execution moves from Algorithm 17 to Algorithm 18. The schedule matrix SM1[7×4] for non-migrating tasks at TS1 is as follows:

SM1[7×4]   V1       V2       V3       V4
T1         ⟨0,5⟩    0        0        0
T2         0        ⟨0,7⟩    0        0
T3         0        0        ⟨0,4⟩    0
T4         0        0        0        ⟨0,1⟩
T5         0        0        0        0
T6         0        0        ⟨0,1⟩    0
T7         0        0        0        ⟨1,3⟩
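The greedy pass described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a reproduction of Algorithm 17: it tracks only which core each task lands on and how much of each core is used, not the start and end times stored in SM1. The names `schedule_non_migrating`, `placed`, `used` and `deferred` are hypothetical, and ties between equal shares are assumed to be broken by task index and then core index.

```python
SLICE_LEN = 10  # length of time-slice TS1 = [0, 10)

# Required shares at TS1 (rows: T1..T7, columns: V1..V4).
SHARES = [
    [5, 13, 8, 12],
    [8, 7, 10, 11],
    [11, 8, 4, 9],
    [9, 9, 8, 1],
    [9, 11, 11, 14],
    [9, 6, 1, 8],
    [12, 4, 7, 2],
]


def schedule_non_migrating(shares, slice_len):
    """Greedily place each task on the first core (in share order) that fits it."""
    entries = sorted(
        (share, i + 1, j + 1)
        for i, row in enumerate(shares)
        for j, share in enumerate(row)
    )
    used = {core: 0 for core in range(1, len(shares[0]) + 1)}
    placed = {}    # task -> core it is fully scheduled on
    deferred = []  # entries that did not fit (the list L2 in the text)
    for share, task, core in entries:
        if task in placed:
            continue  # remaining entries of a scheduled task are dropped
        if used[core] + share <= slice_len:
            placed[task] = core
            used[core] += share
        else:
            deferred.append((task, core, share))
    return placed, used, deferred


placed, used, deferred = schedule_non_migrating(SHARES, SLICE_LEN)
print(placed)    # {4: 4, 6: 3, 7: 4, 3: 3, 1: 1, 2: 2}
print(used)      # {1: 5, 2: 7, 3: 5, 4: 3}
print(deferred)  # [(5, 1, 9), (5, 2, 11), (5, 3, 11), (5, 4, 14)]
```

Only T5 ends up in the deferred list, which matches the empty T5 row of the schedule matrix.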

Scheduling of Migrating Tasks: It may be observed that T5 could not be allocated in the earlier phase. Therefore, it must be partitioned into multiple chunks and scheduled on more than one core using the SCHEDULE-MIGRATE algorithm (Algorithm 18). The algorithm first moves all entries corresponding to T5 from list L2 to L3. Then, it extracts the first element from L3, i.e., ⟨5,1,9⟩, and computes the unused capacity of V1. Since there is a residual spare capacity uc1 of 5 on V1, T5 is partially allocated on V1, with SM1[5][1] = ⟨5,10⟩. The unallocated share of T5 with respect to V1, us_{5,1}, becomes 9 − 5 = 4.

Now, Algorithm 18 extracts the next element ⟨5,2,11⟩ from L3 and computes the unallocated share of T5 with respect to V2, i.e., us_{5,2} = 4 × 1.1/0.9 ≈ 4.8. Then, it checks the spare capacity of V2. Since the spare capacity of V2 (uc2 = 3) is not sufficient to accommodate the unallocated share of T5 (4.8), T5 is only partially allocated on V2, i.e., SM1[5][2] = ⟨0,3⟩. The unallocated share of T5 becomes us_{5,2} = 4.8 − 3 = 1.8. It may be noted that T2 has already been scheduled on V2 from 0 to 7. To avoid an overlap with T5, the schedule of T2 is updated to SM1[2][2] = ⟨3,10⟩.

Next, Algorithm 18 extracts the element ⟨5,3,11⟩ from L3 and checks the spare capacity of V3. Since the spare capacity of V3 (uc3 = 5) is sufficient to accommodate the unallocated share of T5 (us_{5,3} = 1.8 × 1.1/1.1 = 1.8), the remainder of T5 is allocated on V3, i.e., SM1[5][3] = ⟨3,5⟩. The tasks that are already scheduled on V3 are then adjusted to avoid overlaps. The Gantt chart representation of the schedule matrix SM1[7×4] (including migrating tasks) for time-slice TS1 is depicted in Figure 5.1b.
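The splitting of T5 across cores can be sketched as below. This is a hedged illustration rather than Algorithm 18 itself: it computes only how much of each core's spare capacity T5 consumes, and leaves out the placement of the chunks in time and the shifting of already scheduled tasks. The function name `split_migrating_task` is hypothetical, and the conversion of the unallocated share between cores is assumed to follow the ratio of the required shares (the 1.1/0.9 factor in the text). The sketch keeps unrounded values, so it yields about 4.89 and 1.89 where the text quotes the approximations 4.8 and 1.8.

```python
def split_migrating_task(entries, spare):
    """Split one task over several cores.

    entries: (core, required_share) pairs for the task, in L3 order.
    spare:   dict core -> unused capacity left by the non-migrating tasks.
    Returns (allocation per core, fraction of the task left unallocated).
    """
    allocation = {}
    remaining = 1.0  # fraction of the task's work still to be placed
    for core, share in entries:
        if remaining <= 1e-9:
            break
        need = remaining * share        # unallocated share, expressed on this core
        give = min(need, spare[core])   # cannot exceed the core's spare capacity
        if give > 0:
            allocation[core] = give
            remaining -= give / share
    return allocation, remaining


# T5's entries and the spare capacities left after the non-migrating phase.
t5_entries = [(1, 9), (2, 11), (3, 11), (4, 14)]
spare_capacity = {1: 5, 2: 3, 3: 5, 4: 7}

allocation, remaining = split_migrating_task(t5_entries, spare_capacity)
for core, amount in allocation.items():
    print(f"V{core}: {amount:.2f}")
```

T5 receives 5 units on V1, 3 units on V2 and the rest on V3, so V4 is never consulted.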

Energy-aware scheduling: As we can observe from the schedule matrix SM1 (in Figure 5.1a), V1 and V2 do not have any spare capacity in the current time-slice. Hence, COMPUTE-EA-SCHEDULE lets V1 and V2 run at their highest available frequency, and no power can be saved on these cores. The total normalized power consumption on V1 and V2 (refer Table 5.2) is therefore (Pd + Ps + Pon) × 20 = (1327 + 714 + 276) × 20 = 46.34 W.

On the other hand, V3 has a spare capacity of 3 time slots, which can be used to lower the operating frequency of V3. However, if the operating frequency is decreased while T5 executes, the execution window of T5 will overlap with its own execution on other cores. Hence, the spare capacity of 3 units is shared only among the non-migrating tasks T6 and T3, which together require 5 time units at frequency f_max. The required frequency (refer Table 5.2) to execute T6 and T3 is f_opt = ⌈(1 + 4)/8⌉ = 0.68, i.e., 0.625 rounded up to the next available frequency. For simplicity of explanation, we use only ⌈f_opt⌉ in this example, but in practice the algorithm uses a combination of the two frequencies ⌈f_opt⌉ and ⌊f_opt⌋. The modified execution shares of T6 and T3 become 2 (= ⌈1/0.68⌉) and 6 (= ⌈4/0.68⌉), respectively. Therefore, the normalized power consumption on V3 is (655 + 461 + 276) × 8 + (1327 + 714 + 276) × 2 = 15.77 W.

On core V4, the required frequency to execute the tasks T4 and T7 is f_opt = (1 + 2)/10 = 0.3. Since f_opt < f_cr (0.41 in our system), we set f_opt = 0.41. The modified execution shares of T4 and T7 become 3 (= ⌈1/0.41⌉) and 5 (= ⌈2/0.41⌉), respectively. V4 is kept in suspension mode for the remaining 2 slots of the time-slice. Hence, the normalized power consumption on V4 is (267 + 289 + 276) × 8 + (80 µW) × 2 = 6.65 W.

The power consumption of the system under an energy-oblivious strategy would have been (1327 + 714 + 276) × 40 = 92.68 W. Therefore, the overall percentage of power saved in the system is P = (92.68 − (46.34 + 15.77 + 6.65))/92.68 × 100 = 25.80%. The final schedule for the given task set for TS1 is shown in Figure 5.1c. The schedules for subsequent time-slices can be computed similarly.
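The frequency selection and the power arithmetic above can be checked with a short script. It is a sketch under explicit assumptions, not COMPUTE-EA-SCHEDULE: only the three frequency levels quoted in this example are modelled (the full Table 5.2 is not reproduced here), f_max is assumed to be normalized to 1.0, power values are taken in mW so that the totals divided by 1000 match the figures in W, and all names (`POWER_MW`, `pick_frequency`, `stretched`) are hypothetical.

```python
import math

# Pd + Ps + Pon per frequency level, in mW (levels quoted in the example only).
POWER_MW = {
    1.0: 1327 + 714 + 276,   # assumed normalized f_max
    0.68: 655 + 461 + 276,
    0.41: 267 + 289 + 276,
}
SLEEP_MW = 0.08              # 80 microwatts in suspension mode
F_CRITICAL = 0.41
LEVELS = sorted(POWER_MW)


def pick_frequency(work, window):
    """Smallest available level that finishes `work` within `window` slots."""
    target = max(work / window, F_CRITICAL)  # never go below the critical frequency
    return min(f for f in LEVELS if f >= target - 1e-12)


def stretched(share, freq):
    """Execution share after slowing down to `freq`, rounded up to whole slots."""
    return math.ceil(share / freq)


# V1 and V2: no spare capacity, 10 slots each at f_max.
e_v12 = POWER_MW[1.0] * 20

# V3: T6 (1) and T3 (4) share 8 slots; T5's 2 slots stay at f_max.
f3 = pick_frequency(1 + 4, 8)
busy3 = stretched(1, f3) + stretched(4, f3)
e_v3 = POWER_MW[f3] * busy3 + POWER_MW[1.0] * 2

# V4: T4 (1) and T7 (2) over the whole slice, then suspended.
f4 = pick_frequency(1 + 2, 10)
busy4 = stretched(1, f4) + stretched(2, f4)
e_v4 = POWER_MW[f4] * busy4 + SLEEP_MW * (10 - busy4)

total = e_v12 + e_v3 + e_v4
baseline = POWER_MW[1.0] * 40
saving = (baseline - total) / baseline * 100
print(f3, f4, f"{saving:.2f}%")
```

The script selects 0.68 for V3 and 0.41 for V4 and reports a saving of 25.80%.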