An Illustrative Example - New Approaches to Energy and Temperature Aware Scheduling Techniques

Let us solve the problem presented in Section 5.2. According to HEALERS (Algorithm 15), we first use deadline partitioning to compute the current time-slice. Since all tasks are ready for execution, the first time-slice is TS₁ = [0, 10).

5.5 An Illustrative Example

Figure 5.1: An example to illustrate our proposed algorithm HEALERS. (a) Schedule for non-migrating tasks; (b) Schedule for all tasks; (c) Final energy-aware schedule for all tasks.

In order to compute the schedule matrix SM₁ at TS₁, HEALERS calls Algorithm 16.

First, it creates the lists L₁ and L₂ and initializes them to ∅. Then, it computes the required shares for each task T_i ∈ T at TS₁. The computed shares are as follows:

sh[7×4]   sh_i,1,1   sh_i,2,1   sh_i,3,1   sh_i,4,1
T₁        5          13         8          12
T₂        8          7          10         11
T₃        11         8          4          9
T₄        9          9          8          1
T₅        9          11         11         14
T₆        9          6          1          8
T₇        12         4          7          2

Now, Algorithm 16 sorts L₁ by the computed shares in non-decreasing order. Each element has the form ⟨i, j, sh_i,j,1⟩, and the sorted list L₁ at time-slice TS₁ is: {⟨4,4,1⟩, ⟨6,3,1⟩, ⟨7,4,2⟩, ⟨3,3,4⟩, ⟨7,2,4⟩, ⟨1,1,5⟩, ⟨6,2,6⟩, ⟨2,2,7⟩, ⟨7,3,7⟩, ⟨1,3,8⟩, ⟨2,1,8⟩, ⟨3,2,8⟩, ⟨4,3,8⟩, ⟨6,4,8⟩, ⟨3,4,9⟩, ⟨4,1,9⟩, ⟨4,2,9⟩, ⟨5,1,9⟩, ⟨6,1,9⟩, ⟨2,3,10⟩, ⟨2,4,11⟩, ⟨3,1,11⟩, ⟨5,2,11⟩, ⟨5,3,11⟩, ⟨1,4,12⟩, ⟨7,1,12⟩, ⟨1,2,13⟩, ⟨5,4,14⟩}. Then, Algorithm 16 computes the schedule for non-migrating tasks.
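The construction and sorting of L₁ can be reproduced with a few lines of Python. This is only a sketch: the names `SHARES` and `build_sorted_list` are hypothetical, and the tie-breaking rule (task index first, then core index) is an assumption inferred from the order of the list above, not something stated by Algorithm 16.

```python
# Required shares for tasks T1..T7 on cores V1..V4 in time-slice TS1,
# copied from the share table of this example.
SHARES = {
    1: [5, 13, 8, 12],
    2: [8, 7, 10, 11],
    3: [11, 8, 4, 9],
    4: [9, 9, 8, 1],
    5: [9, 11, 11, 14],
    6: [9, 6, 1, 8],
    7: [12, 4, 7, 2],
}


def build_sorted_list(shares):
    """Return every (task, core, share) triple in non-decreasing share order."""
    triples = [
        (task, core, share)
        for task, row in shares.items()
        for core, share in enumerate(row, start=1)
    ]
    # Assumption: equal shares are ordered by task index, then core index.
    return sorted(triples, key=lambda t: (t[2], t[0], t[1]))


L1 = build_sorted_list(SHARES)
print(L1[:4])  # [(4, 4, 1), (6, 3, 1), (7, 4, 2), (3, 3, 4)]
```

With 7 tasks and 4 cores the list holds 28 triples, one per task-core pair.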

Scheduling Non-migrating Tasks: Algorithm 17 extracts the first element from the list L₁, i.e., ⟨4,4,1⟩ (task T₄ requires a share of 1 unit on V₄). The task T₄ can be fully allocated on V₄ and hence the schedule matrix entry SM₁[4][4] is updated to ⟨0,1⟩. Since T₄ has been completely scheduled on V₄, all remaining entries of T₄, i.e., ⟨4,3,8⟩, ⟨4,1,9⟩ and ⟨4,2,9⟩, are removed from L₁. The above process is then repeated by extracting the next first element from L₁, i.e., ⟨6,3,1⟩. Since V₃ can fully accommodate T₆, it is scheduled on V₃ with SM₁[6][3] = ⟨0,1⟩. Similarly, T₇ is scheduled (according to ⟨7,4,2⟩) on V₄, T₃ is scheduled (according to ⟨3,3,4⟩) on V₃, T₁ is scheduled (according to ⟨1,1,5⟩) on V₁ and T₂ is scheduled (according to ⟨2,2,7⟩) on V₂. Now, ⟨5,1,9⟩ becomes the first element of the list L₁. According to this element, T₅ requires a share of 9 units if it is allocated on V₁.

However, V₁ does not have the capacity to accommodate T₅ and hence the element ⟨5,1,9⟩ is inserted into L₂. Similarly, the other three elements corresponding to the task T₅, i.e., ⟨5,2,11⟩, ⟨5,3,11⟩ and ⟨5,4,14⟩, are moved to L₂. Since the list L₁ is now empty, execution moves from Algorithm 17 to Algorithm 18. The schedule matrix SM₁[7×4] for non-migrating tasks at TS₁ is as follows:

SM₁[7×4]   V₁       V₂       V₃       V₄
T₁         ⟨0,5⟩    0        0        0
T₂         0        ⟨0,7⟩    0        0
T₃         0        0        ⟨0,4⟩    0
T₄         0        0        0        ⟨0,1⟩
T₅         0        0        0        0
T₆         0        0        ⟨0,1⟩    0
T₇         0        0        0        ⟨1,3⟩
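The non-migrating phase walked through above can be sketched as a greedy pass over the sorted list. The function name `schedule_non_migrating` and the per-core `load` bookkeeping are hypothetical; the sketch assumes every core offers 10 units in TS₁ = [0, 10) and tracks only how much share lands on each core, not the exact start and end times stored in SM₁.

```python
SLICE_LENGTH = 10  # TS1 = [0, 10)

SHARES = {
    1: [5, 13, 8, 12],
    2: [8, 7, 10, 11],
    3: [11, 8, 4, 9],
    4: [9, 9, 8, 1],
    5: [9, 11, 11, 14],
    6: [9, 6, 1, 8],
    7: [12, 4, 7, 2],
}


def schedule_non_migrating(shares, n_cores=4, capacity=SLICE_LENGTH):
    """Greedy placement of whole tasks; returns (placed, load, L2)."""
    triples = sorted(
        ((t, c, s) for t, row in shares.items() for c, s in enumerate(row, start=1)),
        key=lambda x: (x[2], x[0], x[1]),  # assumed tie-break: task, then core
    )
    load = {core: 0 for core in range(1, n_cores + 1)}
    placed = {}  # task -> (core, share)
    L2 = []      # entries that did not fit on their core
    for task, core, share in triples:
        if task in placed:
            continue  # remaining entries of a scheduled task are dropped
        if load[core] + share <= capacity:
            placed[task] = (core, share)
            load[core] += share
        else:
            L2.append((task, core, share))
    # Keep only entries of tasks that were never placed.
    L2 = [entry for entry in L2 if entry[0] not in placed]
    return placed, load, L2


placed, load, L2 = schedule_non_migrating(SHARES)
print(placed)  # which core each non-migrating task ends up on
print(load)    # occupied units per core after this phase
print(L2)      # entries left over for the migrating phase
```

For the shares of this example, the leftover list contains exactly the four entries of T₅, and the unused capacities 10 − load come out as 5, 3, 5 and 7 units on V₁ to V₄.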

Scheduling of Migrating Tasks: It may be observed that T₅ could not be allocated in the earlier phase. It must therefore be partitioned into multiple chunks and scheduled on more than one core using the SCHEDULE-MIGRATE algorithm (Algorithm 18). The algorithm first moves all entries corresponding to T₅ from list L₂ to L₃. Then, it extracts the first element from L₃, i.e., ⟨5,1,9⟩, and computes the unused capacity of V₁. Since there is a residual spare capacity uc₁ of 5 on V₁, T₅ is partially allocated on V₁ with SM₁[5][1] = ⟨5,10⟩. The unallocated share of T₅ with respect to V₁ becomes us_5,1 = 9 − 5 = 4.

Now, Algorithm 18 extracts the next element ⟨5,2,11⟩ from L₃ and computes the unallocated share of T₅ with respect to V₂, i.e., us_5,2 = 4 × 1.1/0.9 = 4.8 (approx). Then, it checks the spare capacity of V₂. Since the spare capacity of V₂ (i.e., uc₂ = 3) is not sufficient to accommodate the unallocated share of T₅ (i.e., 4.8), T₅ is only partially allocated on V₂. That is, SM₁[5][2] = ⟨0,3⟩. The unallocated share of T₅ becomes us_5,2 = 4.8 − 3 = 1.8. It may be noted that T₂ has already been scheduled on V₂ from 0 to 7. To avoid an overlap with T₅, the schedule of T₂ is updated to SM₁[2][2] = ⟨3,10⟩.

Next, Algorithm 18 extracts the element ⟨5,3,11⟩ from L₃ and checks the spare capacity of V₃. Since the spare capacity of V₃ (i.e., uc₃ = 5) is sufficient to accommodate the unallocated share of T₅ (i.e., us_5,3 = 1.8 × 1.1/1.1 = 1.8), the remainder of T₅ is allocated on V₃. That is, SM₁[5][3] = ⟨3,5⟩. The tasks that are already scheduled on V₃ are then adjusted to avoid overlaps. The Gantt chart representation of the schedule matrix SM₁[7×4] (including migrating tasks) for time-slice TS₁ is depicted in Figure 5.1b.
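The splitting of T₅ over V₁, V₂ and V₃ can be traced with the following sketch. The function `split_migrating_task` is hypothetical. It assumes that a leftover share is re-expressed on the next core by multiplying it with the ratio of the task's required shares on the two cores; for T₅ these ratios (11/9 and 11/11) coincide numerically with the factors 1.1/0.9 and 1.1/1.1 used in the text, but the sketch does not claim that this is how Algorithm 18 defines the conversion. It also keeps unrounded values, so the chunk on V₃ comes out near 1.89 where the text, working from the approximation 4.8, reports 1.8. Where each chunk sits inside the time-slice is not modelled.

```python
def split_migrating_task(entries, spare):
    """Spread one task over several cores.

    entries: (core, share) pairs of the task, in the order they are tried.
    spare:   dict core -> unused capacity of that core in the time-slice.
    Returns (allocation, remaining): share placed per core, and what is left.
    """
    allocation = {}
    remaining = None
    prev_share = None
    for core, share in entries:
        if remaining is None:
            remaining = float(share)         # whole task, in units of the first core
        else:
            remaining *= share / prev_share  # assumed conversion to this core's units
        prev_share = share
        chunk = min(remaining, spare[core])
        if chunk > 0:
            allocation[core] = chunk
            remaining -= chunk
        if remaining <= 1e-9:
            break
    return allocation, remaining


# Entries of T5 taken from L3 and the spare capacities left by the first phase.
T5_ENTRIES = [(1, 9), (2, 11), (3, 11), (4, 14)]
SPARE = {1: 5, 2: 3, 3: 5, 4: 7}

allocation, remaining = split_migrating_task(T5_ENTRIES, SPARE)
print(allocation)  # chunks of T5 on V1, V2 and V3
print(remaining)   # 0.0 once the task is fully placed
```

V₄ is never reached because the spare capacity of V₃ already absorbs the last chunk.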

Energy-aware scheduling: As we can observe from the schedule matrix SM₁ (in Figure 5.1a), V₁ and V₂ do not have any spare capacity in the current time-slice. Hence, COMPUTE-EA-SCHEDULE lets V₁ and V₂ run at their highest available frequency, and no power can be saved on these cores. The total normalized power consumption on V₁ and V₂ (refer Table 5.2) is therefore (P_d + P_s + P_on) × 20 = (1327 + 714 + 276) × 20 = 46.34 W.

On the other hand, V₃ has a spare capacity of 3 time slots, which can be used to lower the operating frequency of V₃. However, if the operating frequency is decreased while T₅ executes, the execution window of T₅ will overlap with its own execution on other cores. Hence, the spare capacity of 3 units is shared only among the non-migrating tasks T₆ and T₃, which together require 5 time units at frequency f_max. The required frequency (refer Table 5.2) to execute T₆ and T₃ is f_opt = ⌈(1 + 4)/8⌉ = 0.68, i.e., 0.625 rounded up to the next available frequency level. For simplicity of explanation, we use only ⌈f_opt⌉ in this example; in practice, the algorithm uses a combination of the two frequencies ⌈f_opt⌉ and ⌊f_opt⌋. The modified execution shares of T₆ and T₃ become 2 (= ⌈1/0.68⌉) and 6 (= ⌈4/0.68⌉), respectively. Therefore, the normalized power consumption on V₃ is (655 + 461 + 276) × 8 + (1327 + 714 + 276) × 2 = 15.77 W.

In core V₄, the required frequency to execute the tasks T₄ and T₇ is f_opt = (1 + 2)/10 = 0.3. Since f_opt < f_cr (0.41 in our system), we set f_opt = 0.41. The modified execution shares of T₄ and T₇ become 3 (= ⌈1/0.41⌉) and 5 (= ⌈2/0.41⌉), respectively. V₄ is kept in suspension mode for the remaining 2 slots of the time-slice. Hence, the normalized power consumption on V₄ is (267 + 289 + 276) × 8 + (80 µW) × 2 = 6.65 W.

The power consumption of the system under an energy-oblivious strategy would have been (1327 + 714 + 276) × 40 = 92.68 W. Therefore, the overall percentage of power saved in the system is:

P = (92.68 − (46.34 + 15.77 + 6.65))/92.68 × 100 = 25.80%.

The final schedule of the given task set for TS₁ is shown in Figure 5.1c. The schedules for subsequent time-slices can be computed similarly.
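The arithmetic of this energy-aware step can be checked with the short sketch below. Several things in it are assumptions rather than facts from the thesis: the helper names `pick_frequency` and `core_cost_mw` are hypothetical, the power figures are taken to be in mW (which is what makes the totals come out in W as quoted), only the three frequency levels named in the text are modelled (Table 5.2 may list more), and each core runs its non-migrating tasks at the single frequency ⌈f_opt⌉, as in the simplified explanation above.

```python
import math

SLICE = 10                  # length of TS1 in time slots
F_CR = 0.41                 # critical frequency quoted in the text
LEVELS = (0.41, 0.68, 1.0)  # assumption: only the levels named in the text
POWER_MW = {                # P_d + P_s + P_on per level, assumed to be in mW
    1.0: 1327 + 714 + 276,
    0.68: 655 + 461 + 276,
    0.41: 267 + 289 + 276,
}
SLEEP_MW = 0.08             # 80 microwatts while the core is suspended


def pick_frequency(work, window):
    """Smallest modelled level that fits `work` into `window`, never below F_CR."""
    needed = max(work / window, F_CR)
    return min(f for f in LEVELS if f >= needed)


def core_cost_mw(fixed_slots, shares):
    """Power summed over the slots of one core in the time-slice.

    fixed_slots: slots that stay at f_max (chunks of the migrating task).
    shares:      f_max shares of the non-migrating tasks on this core.
    """
    f = pick_frequency(sum(shares), SLICE - fixed_slots)
    stretched = sum(math.ceil(s / f) for s in shares)
    idle = SLICE - fixed_slots - stretched
    return POWER_MW[1.0] * fixed_slots + POWER_MW[f] * stretched + SLEEP_MW * idle


# (slots occupied by T5, shares of the non-migrating tasks) for each core
CORES = {"V1": (5, [5]), "V2": (3, [7]), "V3": (2, [1, 4]), "V4": (0, [1, 2])}

per_core = {name: core_cost_mw(*args) for name, args in CORES.items()}
total_mw = sum(per_core.values())
baseline_mw = POWER_MW[1.0] * SLICE * len(CORES)  # energy-oblivious: always f_max
saving_pct = (baseline_mw - total_mw) / baseline_mw * 100

print(per_core)
print(round(saving_pct, 2))
```

Under these assumptions the per-core values match the figures derived above (46.34 W for V₁ and V₂ together, 15.77 W for V₃, about 6.66 W for V₄), and the saving evaluates to roughly 25.8%.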

From the document New Approaches to Energy and Temperature Aware Scheduling Techniques for Real-time Multi-core (pages 156-160)