
4.4 HETERO-SCHED: A Low-overhead Heterogeneous Multi-core Scheduler for Real-time Periodic Tasks

4.4.3 An Illustrative Example

Let us consider a real-time system consisting of a set of five tasks (T = {T1, T2, ..., T5}) to be preemptively scheduled on a heterogeneous multi-core platform consisting of four processing cores (V = {V1, V2, V3, V4}). The Utilization Matrix U[5×4] is as follows:

U[5×4]   V1    V2    V3    V4
T1      0.5   1.3   0.8   1.2
T2      0.8   0.7   1.0   1.1
T3      1.1   0.8   0.7   0.9
T4      1.2   0.9   0.8   0.7
T5      0.9   1.1   1.1   1.4

The period (as well as deadline) of each task is as follows: p1 = 10, p2 = 20, p3 = 10, p4 = 20 and p5 = 40. According to HETERO-SCHED (Algorithm 9), we first compute the hyper-period, H = lcm(10, 20, 10, 20, 40) = 40. Using deadline partitioning, let us find the time-slices in the interval [0,40). It may be observed from Figure 4.4a that there are four time-slices, i.e., TS = {TS1, TS2, TS3, TS4}, where TS1 = [0,10), TS2 = [10,20), TS3 = [20,30) and TS4 = [30,40).
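The two steps above can be sketched in a few lines of Python (an illustrative sketch; the variable names are ours, not Algorithm 9's):

```python
from math import lcm

# Task periods (deadlines equal periods), as in the example
periods = {"T1": 10, "T2": 20, "T3": 10, "T4": 20, "T5": 40}

# Hyper-period: LCM of all task periods
H = lcm(*periods.values())  # 40

# Deadline partitioning: slice boundaries are the distinct deadlines
# (period multiples) in (0, H]; consecutive boundaries form time-slices
boundaries = sorted({k * p for p in periods.values()
                     for k in range(1, H // p + 1)})
slices = list(zip([0] + boundaries[:-1], boundaries))
print(slices)  # [(0, 10), (10, 20), (20, 30), (30, 40)]
```

Because every slice boundary is some task's deadline, giving each task its proportional share inside every slice guarantees no deadline is crossed mid-slice.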

Computation of Shares Required: Algorithm 9 invokes Algorithm 10 to compute the allocation matrix AMk for each time-slice TSk. First, Algorithm 10 creates the two lists L1 and L2 and initializes them to ∅. Then, Algorithm 10 invokes Algorithm 11 to compute the shares required by each task Ti ∈ T at the time-slice TS1, whose interval is [0,10). The computed shares are as follows:


Figure 4.4: Example — (a) Deadline partitioning; (b) Schedule for migrating tasks; (c) Final schedule for all tasks.

sh[5×4]   sh_{i,1,1}   sh_{i,2,1}   sh_{i,3,1}   sh_{i,4,1}
T1             5            13            8           12
T2             8             7           10           11
T3            11             8            7            9
T4            12             9            8            7
T5             9            11           11           14

Now, Algorithm 11 sorts the computed shares in non-decreasing order. The sorted list L1 at time-slice TS1 is as follows (⟨i, j, sh_{i,j,1}⟩): {⟨1,1,5⟩, ⟨2,2,7⟩, ⟨3,3,7⟩, ⟨4,4,7⟩, ⟨1,3,8⟩, ⟨2,1,8⟩, ⟨3,2,8⟩, ⟨4,3,8⟩, ⟨3,4,9⟩, ⟨4,2,9⟩, ⟨5,1,9⟩, ⟨2,3,10⟩, ⟨2,4,11⟩, ⟨3,1,11⟩, ⟨5,2,11⟩, ⟨5,3,11⟩, ⟨1,4,12⟩, ⟨4,1,12⟩, ⟨1,2,13⟩, ⟨5,4,14⟩}. This list L1 is returned to Algorithm 10. Then, Algorithm 10 invokes Algorithm 12 to compute the allocation matrix AM1 for non-migrating tasks.
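The share computation and sort can be reproduced directly from the utilization matrix (a sketch, assuming, as in the example, that sh_{i,j,k} = U[i][j] × |TSk| with |TS1| = 10):

```python
# Utilization matrix U (rows: T1..T5, cols: V1..V4), from the example
U = [[0.5, 1.3, 0.8, 1.2],
     [0.8, 0.7, 1.0, 1.1],
     [1.1, 0.8, 0.7, 0.9],
     [1.2, 0.9, 0.8, 0.7],
     [0.9, 1.1, 1.1, 1.4]]

slice_len = 10  # |TS1| = 10

# sh[i][j] = U[i][j] * |TS1|: share Ti requires if it runs on Vj
sh = [[round(u * slice_len) for u in row] for row in U]

# L1: all (task, core, share) triples sorted by share, non-decreasing
L1 = sorted(((i + 1, j + 1, sh[i][j])
             for i in range(5) for j in range(4)),
            key=lambda t: t[2])
print(L1[0])  # (1, 1, 5): T1 needs 5 units on V1
```

Python's sort is stable, so triples with equal shares keep their row-major order, matching the listing above.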

Allocation of Non-migrating Tasks: Algorithm 12 extracts the first element from the list L1, i.e., ⟨1,1,5⟩ (task T1 requires a share of 5 units on processor V1). The task T1 can be fully allocated on processing core V1 and hence, the allocation matrix entry AM1[1][1] is updated to 5 and the migration count for T1 is set to 0, i.e., MC1[1] = 0. Since task T1 has been completely allocated on the processing core V1, Algorithm 12 deletes all remaining entries of T1 from list L1, i.e., ⟨1,3,8⟩, ⟨1,4,12⟩ and ⟨1,2,13⟩ are removed from L1. Now, Algorithm 12 repeats the above process by extracting the first element from L1, i.e., ⟨2,2,7⟩. Since processing core V2 can fully accommodate task T2, it is allocated to V2 with AM1[2][2] updated to 7 and the migration count for T2 set to 0, i.e., MC1[2] = 0. Similarly, the task T3 is allocated (according to ⟨3,3,7⟩) on V3 and T4 is allocated (according to ⟨4,4,7⟩) on V4.

Now, ⟨5,1,9⟩ becomes the first element of list L1. According to this, if task T5 is allocated on processing core V1, it requires a share of 9 units. However, processing core V1 does not have the capacity to accommodate T5 and hence, the element ⟨5,1,9⟩ is inserted into L2. Similarly, the other three elements corresponding to task T5 are moved to L2, i.e., ⟨5,2,11⟩, ⟨5,3,11⟩ and ⟨5,4,14⟩. Since the list L1 becomes empty, the execution moves from Algorithm 12 back to Algorithm 10. The allocation matrix AM1[5×4] for non-migrating tasks at time-slice TS1 is as follows:

AM1[5×4]  V1  V2  V3  V4
T1         5   0   0   0
T2         0   7   0   0
T3         0   0   7   0
T4         0   0   0   7
T5         0   0   0   0

Allocation of Migrating Tasks: It may be observed that task T5 is unallocated and Algorithm 10 invokes Algorithm 13 to schedule T5. It first moves all entries corresponding to T5 from list L2 to L3. Then, Algorithm 13 extracts the first element from L3, i.e., ⟨5,1,9⟩, and computes the spare capacity of V1. Since there is a residual spare capacity of 5 units on processing core V1, the task T5 is partially allocated on V1, i.e., AM1[5][1] = 5. The unallocated share of T5 is 9 − 5 = 4. This is normalized as follows: us5 = 4/0.9 ≈ 4.4. The normalized value of the unallocated share for task T5 (us5) and the migration count (MC1[5]) are updated to 4.4 and 1, respectively.


Now, Algorithm 13 extracts the next element ⟨5,2,11⟩ from L3 and checks the spare capacity of processing core V2. Since the spare capacity of V2 (i.e., 3) is not sufficient to accommodate the unallocated share of T5 (i.e., 4.4 × 1.1 ≈ 4.8), it is partially allocated on V2. That is, AM1[5][2] = 3. The unallocated share of T5 is 4.8 − 3 = 1.8. This is normalized as follows: us5 = 1.8/1.1 ≈ 1.6. The normalized value of the unallocated share for task T5 (us5) and the migration count (MC1[5]) are updated to 1.6 and 2, respectively.

Next, Algorithm 13 extracts the element ⟨5,3,11⟩ from L3 and checks the spare capacity of processing core V3. Since the spare capacity of V3 (i.e., 3) is sufficient to accommodate the unallocated share of T5 (i.e., 1.6 × 1.1 = 1.76), it is allocated on V3. That is, AM1[5][3] = 2. The final allocation matrix AM1[5×4] (including migrating tasks) for time-slice TS1 is as follows:

AM1[5×4]  V1  V2  V3  V4
T1         5   0   0   0
T2         0   7   0   0
T3         0   0   7   0
T4         0   0   0   7
T5         5   3   2   0
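The normalize-and-retry loop for T5 can be traced numerically (a sketch of the arithmetic above, not Algorithm 13 itself; we carry full precision instead of the intermediate rounding in the text, so us5 differs slightly, but the final allocation is the same):

```python
# State after the non-migrating pass, from the example:
spare = [5, 3, 3, 3]            # residual capacity of V1..V4
u5 = [0.9, 1.1, 1.1, 1.4]       # T5's utilization on each core
L3 = [(1, 9), (2, 11), (3, 11), (4, 14)]   # T5's (core, share) entries

alloc = [0, 0, 0, 0]
migrations = 0
us = None   # normalized unallocated share; None before any partial allocation
for core, share in L3:
    j = core - 1
    # Re-scale the remainder to this core's speed (first core uses raw share)
    need = share if us is None else us * u5[j]
    if spare[j] >= need:        # this core can finish T5
        alloc[j] = round(need)
        break
    alloc[j] = spare[j]         # partial allocation: use all spare capacity
    us = (need - spare[j]) / u5[j]   # normalize the unallocated remainder
    migrations += 1

print(alloc)       # [5, 3, 2, 0]
print(migrations)  # 2
```

Dividing by the local utilization expresses the leftover work in a core-independent unit, so it can be re-scaled correctly on the next candidate core.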

Scheduling of Migrating Tasks: Since the task allocation is feasible, Algorithm 9 invokes Algorithm 14 to construct a schedule. It may be noted that the migration count of T5 is the highest among all tasks and hence, T5 is scheduled first according to the HETERO-SCHED guidelines. The resulting schedule for T5 is depicted in Figure 4.4b.

Scheduling of Non-migrating Tasks: It may be observed that all the remaining unscheduled tasks have the same migration count. Since ties are broken arbitrarily, let us consider the task T1. Following the allocation matrix and the HETERO-SCHED guidelines, T1 is scheduled on V1 from time slot 5 to 10. Next, T2 is scheduled on V2. It may be observed that the execution of T2 is broken into two pieces since T5 is already scheduled on V2. Similarly, the remaining tasks are scheduled, and the final schedule for time-slice TS1 is depicted in Figure 4.4c.
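The resulting timeline for TS1 can be reconstructed on a simple per-core slot grid (an illustrative sketch, not Algorithm 14; the grid representation and placement order are our assumptions, following the rule that the migrating task is placed first and never runs on two cores at once):

```python
# One slot per time unit in TS1 = [0, 10), per core
slots = {v: [None] * 10 for v in ("V1", "V2", "V3", "V4")}

# Place T5 first, sequentially across V1, V2, V3 so its pieces never overlap
t = 0
for core, share in (("V1", 5), ("V2", 3), ("V3", 2)):
    for k in range(t, t + share):
        slots[core][k] = "T5"
    t += share

# Each non-migrating task then fills the free slots on its assigned core
for core, task, share in (("V1", "T1", 5), ("V2", "T2", 7),
                          ("V3", "T3", 7), ("V4", "T4", 7)):
    free = [k for k in range(10) if slots[core][k] is None]
    for k in free[:share]:
        slots[core][k] = task

print(slots["V2"])  # T2 runs in [0,5) and [8,10); T5 occupies [5,8)
```

On V2 this yields exactly the split execution of T2 described above: its 7 units wrap around T5's reserved interval.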

It may be noted that the size of all time-slices in TS is the same and hence, the schedule constructed for the first time-slice TS1 can be utilized for all the remaining time-slices by only updating the time-slice boundaries.