I have followed the norms and guidelines given in the institute's ethical code of conduct. As far as I know, this work has not been submitted elsewhere for the award of any degree.
Challenges
Real-time systems are implemented on platforms with a limited number of processing elements, limited memory, network bandwidth, etc. Modern real-time systems are increasingly based on heterogeneous multi-core platforms, which help them efficiently meet the diverse and high computing demands of applications.
Motivation for this dissertation
Proposed Framework
Contributions
- EAFBFS: An Energy Aware Frame Based Fair Scheduler
- DPFair Scheduling with Slowdown and Suspension
- A Cluster-Oriented Scheduling Technique for Heterogeneous Multi-cores
- A Low Overhead Scheduler for Real-Time Periodic Tasks on Heterogeneous Multi-cores
- An Energy-Aware Scheduler for Heterogeneous Multi-core Real-time Systems
- A Temperature-Aware Real-Time Semi-partitioned Scheduler
Experimental results show that our proposed DPFair-SS scheduling technique exhibits significant energy savings compared to the state-of-the-art [57] in situations where the system has low workloads over a fairly long period. Experimental studies show that our proposed scheduling mechanism is able to schedule a significantly larger number of task sets compared to the state-of-the-art [108].
Organization of the Thesis
The Application Layer
- A Real-time Task Model
Execution time (ei) is the time the processor takes to complete the computation of a task without interruption. Slack time or laxity is the maximum time a task can be delayed after it has been triggered and still complete within its deadline: di − ei.
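As a concrete illustration, the temporal parameters above can be collected into a small data structure. This is our own sketch (field and method names are not from the thesis):

```python
from dataclasses import dataclass

@dataclass
class Task:
    """A periodic real-time task and its temporal parameters (illustrative)."""
    e: float  # execution requirement e_i
    d: float  # relative deadline d_i
    p: float  # period p_i

    def laxity(self) -> float:
        # Slack time (laxity): maximum delay after release that still
        # allows the task to finish within its deadline: d_i - e_i.
        return self.d - self.e

    def weight(self) -> float:
        # Task weight (utilization): wt_i = e_i / p_i.
        return self.e / self.p

t = Task(e=2.0, d=8.0, p=10.0)
print(t.laxity(), t.weight())  # 6.0 0.2
```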
A Real-time Scheduler
Static and dynamic task system: In a static task system, the set of tasks to be executed on the platform is fully defined before execution begins. A task set is said to be schedulable if there exists at least one algorithm that can produce a feasible schedule for it.
Processing Platform
In a work-conserving scheduling algorithm, a processor is never kept idle while a task is waiting to execute. On a uniform multiprocessor platform, all processors can execute all tasks, but the speed at which tasks execute, and hence their worst-case execution times, varies with the processor on which they run.
A Classification of Real-time Scheduling Approaches
Thus, some tasks cannot be executed on some processors in the platform, while their execution speeds (and their worst-case execution times) may differ on the other processors. Dynamic priority scheduling: The distinction between static priority and dynamic priority scheduling is based on the priority management policy adopted by a priority-driven scheduler.
A brief survey of scheduling algorithms
- Partitioning Strategies
- Traditional Real-time Scheduling Strategies
- Rate-based Resource Allocation Strategies
- Server-based Allocation
- Liu and Layland Style Allocation
- Fluid-flow Allocation (Proportional Share Scheduling)
- Temperature-Aware Scheduling strategies
In [17], the authors addressed the problem of task-to-core allocation on heterogeneous multi-core platforms so that the overall energy consumption of the system is minimized. However, with more and more energy-constrained embedded devices running interactive and QoS-sensitive applications, such as streaming media, games, etc., the need for energy-aware proportional fair scheduling algorithms is rapidly becoming more important.
Summary
The scheduling strategy should not only allocate and schedule tasks on the available cores, but do so in a way that minimizes energy consumption, which makes the problem challenging. In the following sections, we present the working principles of the EAFBFS and DPFair-SS algorithms, which offer solutions to such problems.
Energy Aware Frame Based Fair Scheduling (EAFBFS)
Specifications
- System Model
- Power Model
Symbol — Description
- n: Number of tasks.
- si: Start time of task Ti.
- ei: Execution requirement of Ti.
- pi: Period (duration) of Ti.
- rei: Remaining execution requirement of Ti.
- rpi: Remaining period of Ti.
- wti: Weight of Ti.
- shri: Share of Ti.
- U: Utilization factor of the system.
- frcritical: Normalized critical frequency.
- fmax: Maximum normalized operating frequency.
- frTi: Minimum normalized frequency sufficient to complete Ti on a single core within a time slice.
- TD: Set of tasks for which frTi > fr1.
- m: Number of cores in the system.
- V: Set of cores.
- fr1: Optimal operating frequency selected in a time slice.
- frg: Global operating frequency selected in a time slice.
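Using this notation, the per-slice share of a task under a DPFair-style scheduler is its weight multiplied by the slice length. A minimal sketch (our own illustration, not the thesis pseudocode):

```python
def slice_shares(weights, slice_len):
    """Per-slice execution shares under a DPFair-style scheduler:
    shr_i = wt_i * |TS_k|, i.e. each task receives processor time in
    proportion to its weight within every time slice."""
    return [wt * slice_len for wt in weights]

# Deadline partitioning: slice boundaries are placed at the distinct
# period/deadline boundaries of all tasks, so no deadline falls inside
# a slice and each task's weight is constant across the slice.
wts = [0.5, 0.3, 0.2]            # wt_i = e_i / p_i
print(slice_shares(wts, 4.0))    # [2.0, 1.2, 0.8]
```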
EAFBFS Scheduling Strategy
- Algorithm EAFBFS
- Frequency Allocation and Mapping (FAM)
- Scheduling within an Individual Core
Example 3: Continuing with the previous example, let us discuss the handling of the migrating and fixed tasks in the system. The total over-allocation of the fixed tasks on Vk during TEk − NTSk is then given.
Analysis of the Algorithm
Calculating the system-level minimum operating frequency fr1 (using Equation 3.8) takes a constant amount of time. Therefore, the time complexity of the EAFBFS scheduler is the same as that of the original ERFair scheduler.
Experiment and Results
- Experimental Set Up
- Performance Evaluation of EAFBFS Algorithm
- Performance Comparison with EA-DPFair Algorithm
In Figure 3.4a, we observe that as the utilization factor of the system increases from 50% to 70%, the difference in energy consumption between type 1 and type 2 systems increases. We further observe that the energy consumption in Type 1 systems is higher when the number of tasks is low (n = 32) than when it is high (n = 96).
DPFair Scheduling with Slowdown and Suspension (DPFair-SS)
- Power Specifications
- DPFair-SS Scheduling Strategy
- The Slowdown-Suspend-Schedule Function (SSS)
- Analysis of the algorithm
- An Illustrative Example
- Experimental Set Up and Results
The pseudocode of the Slowdown-Suspend-Schedule (SSS) function, shown in Algorithm 6, describes the overall energy-aware scheduling strategy within a time slice of DPFair-SS. DPFair-SS: Similar to DVFS-based DP-Wrap, T1 is allocated to V0, and the global frequency frg for the remaining cores becomes 0.26 (using Equation 3.27).
Summary
As the number of processors and tasks in the system grows, scheduling sets of tasks using such models becomes very expensive. Some important terminologies used in later sections of the chapter are listed in Table 4.1.
Motivational Example
Cluster-Oriented Scheduling Technique (COST)
- COST: A Cluster-Oriented Scheduling Technique
- An Illustrative Example
- Analysis of the Algorithm
- Experimental Set Up and Results
- Experimental Set Up
- Experimental Results
We can observe from Figure 4.3c that an increase in the number of tasks also leads to an increased acceptance ratio for both algorithms. Since the utilization factor and the number of cores remain constant, an increase in the number of tasks leads to reduced individual task weights.
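This effect can be reproduced with a toy first-fit packing experiment: with total utilization and core count fixed, more tasks means smaller individual weights, which pack onto the cores more easily. This is illustrative only; the algorithms compared in the chapter are more involved:

```python
import random

def first_fit(utils, m):
    """Try to place task utilizations onto m unit-capacity cores (first-fit)."""
    cores = [0.0] * m
    for u in utils:
        for j in range(m):
            if cores[j] + u <= 1.0 + 1e-9:
                cores[j] += u
                break
        else:
            return False  # task fits on no core: the set is rejected
    return True

def random_utils(n, total):
    """n random utilizations scaled so that they sum to `total`."""
    xs = [random.random() for _ in range(n)]
    s = sum(xs)
    return [x * total / s for x in xs]

# Fixed system utilization (3.6 on 4 cores); only the task count varies.
random.seed(1)
for n in (8, 64):
    accepted = sum(first_fit(random_utils(n, 3.6), 4) for _ in range(200))
    print(n, accepted)
```

With n = 64 nearly every trial is accepted, while with n = 8 some sets contain a task too heavy for any single core.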
HETERO-SCHED: A Low-overhead Heterogeneous Multi-core Scheduler
- HETERO-SCHED Algorithm
- COMPUTE-ALLOCATION
- ASSIGN-NON-MIGRATE
- ASSIGN-MIGRATE
- COMPUTE-SCHEDULE
- Analysis of the Algorithm
- An Illustrative Example
- Experimental Set Up and Results
- Experimental Set Up
- Experimental Results
The time complexity of Algorithm 14 (COMPUTE-SCHEDULE) is O(mn lg n), because the construction of the schedule matrix requires O(n lg n) iterations on each of the m processor cores. Therefore, the acceptance ratio decreases with an increasing number of processor cores for SA-M, while it increases for the HETERO-SCHED algorithm.
Summary
System Model
The considered system consists of a set of n periodic tasks T = {T1, T2, ..., Tn} to be scheduled on a set of m heterogeneous cores V = {V1, V2, ..., Vm} that can run at a discrete set of normalized frequencies F = {f1, f2, ..., fmax}, where fmax represents the normalized frequency 1 and all other frequencies lie between f1/fmax and 1.
Power Model
Critical frequency: It can be observed from Equation (5.2) that although the dynamic power consumption of a core (Pd) has a cubic relationship with the operating frequency, lowering f increases task execution times; when execution times become sufficiently long at small values of f, this can lead to higher overall energy consumption. Thus there exists a critical frequency (fcr) below which further frequency reduction actually increases the net energy consumption (the sum of dynamic and static energy).
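Under a cubic dynamic-power model of this kind, the critical frequency can be derived by minimizing energy per cycle. The constants below are assumptions for illustration, not the thesis parameters:

```python
# Assumed power model: P(f) = P_s + c * f^3 (static + cubic dynamic term).
# Energy per cycle E(f) = P(f) / f = P_s / f + c * f^2 is minimized at
# f_cr = (P_s / (2 * c)) ** (1 / 3); below f_cr, the static term P_s / f
# dominates and slowing down further increases net energy.
P_s, c = 0.1, 1.0  # assumed static power and dynamic-power constant

f_cr = (P_s / (2 * c)) ** (1 / 3)

def energy_per_cycle(f):
    return P_s / f + c * f * f

# Numerical sanity check: a frequency sweep should find its minimum at f_cr.
fs = [0.05 + 0.001 * i for i in range(950)]
f_best = min(fs, key=energy_per_cycle)
print(round(f_cr, 3), round(f_best, 3))  # 0.368 0.368
```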
Motivational Example
As a result of this effect, beyond a certain amount of frequency reduction, the waste of static power exceeds the gain obtained by dynamic voltage/frequency scaling.
HEALERS Algorithm
COMPUTE-SCHEDULE
- SCHEDULE-NON-MIGRATE
- SCHEDULE-MIGRATE
First, it creates a list L3 from L2 to keep track of the unallocated portion of Ti (lines 2 to 4). If ucj is non-zero, SCHEDULE-MIGRATE calculates the unallocated share of Ti with respect to Vj, i.e. usi,j (lines 9 to 13).
COMPUTE-EA-SCHEDULE
Let Tj denote the set of jobs scheduled on Vj in TSk. Compute the unused capacity of Vj: ucj. Scale the share of each non-migrating task Ti on Vj: shi,j,k = shi,j,k / fcr. Set fopt ← fcr.
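The share-inflation step above can be sketched as follows. This is a simplified illustration under our own assumptions (a single core, shares expressed as fractions of the slice), not the thesis code:

```python
def scale_shares(shares, capacity, f_cr):
    """Inflate per-slice shares for running at the critical frequency:
    sh_i <- sh_i / f_cr.  Accept the slowdown only if the inflated
    shares still fit within the core's capacity for the slice."""
    scaled = [sh / f_cr for sh in shares]
    if sum(scaled) <= capacity + 1e-9:
        return scaled, f_cr   # enough slack: run the slice at f_cr
    return shares, 1.0        # no slack: stay at the maximum frequency

print(scale_shares([0.2, 0.3], capacity=1.0, f_cr=0.5))  # ([0.4, 0.6], 0.5)
print(scale_shares([0.6, 0.5], capacity=1.0, f_cr=0.5))  # ([0.6, 0.5], 1.0)
```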
Analysis of the Algorithm
If there is enough time to complete the execution of the new task between the end of the current time slice and its deadline, the scheduling of the new task is postponed until the start of the next time slice. Otherwise, the system suspends execution at the current time, recomputes the time slices taking the period of Ti into account, and then prepares a new schedule for the tasks.
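The admission decision described here can be condensed into a small predicate. This is our own simplification; the actual test in the thesis involves the slice recomputation details:

```python
def admit_new_task(slice_end, new_exec, new_deadline):
    """Admission of a task arriving mid-slice: postpone it to the next
    slice boundary if it can still meet its deadline from there;
    otherwise suspend and recompute the time slices immediately."""
    if slice_end + new_exec <= new_deadline:
        return "postpone"      # start it from the next time slice
    return "repartition"       # recompute slices including the new period

print(admit_new_task(slice_end=5.0, new_exec=2.0, new_deadline=8.0))  # postpone
print(admit_new_task(slice_end=5.0, new_exec=4.0, new_deadline=8.0))  # repartition
```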
An Illustrative Example
The Gantt chart representation of the schedule matrix SM1[7×4] (including migrating tasks) for the time slice TS1 is shown in Figure 5.1b. Energy-Aware Scheduling: As we can observe from the schedule matrix SM1 (in Figure 5.1a), V1 and V2 do not have any spare capacity in the current time slice.
Experimental Set Up and Results
Experimental Set Up
To create task sets with a specific UF, the randomly generated utilization values are scaled accordingly. The following three metrics were used to compare the performance of our proposed algorithm with MaxMin-M.
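The UF-scaling step can be sketched as a simple normalization (a generic illustration; the thesis may use a different generator):

```python
import random

def scaled_utilizations(n, target_uf, m, seed=42):
    """Generate n random task utilizations and scale them so that the
    system utilization factor equals target_uf on m cores."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    s = sum(xs)
    return [x * target_uf * m / s for x in xs]

us = scaled_utilizations(n=8, target_uf=0.7, m=4)
print(round(sum(us), 6))  # 2.8  (= 0.7 * 4)
```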
Experimental Results
- Benchmark Program Results
- Synthetic Task Set Results
We can see from Figure 5.3a that there is a significant difference in the ARat values for HEALERS, HEART and MaxMin-M at U F = 0.9. We calculated the total number of context switches in the system over the entire duration of the simulation.
Summary
System Model
Each instance of Ti has an execution requirement ei that must be completed within its period pi. At any given instant, rei and rpi indicate the remaining execution requirement and the remaining period of the current instance of Ti.
Thermal Model
Γl denotes the system-level temperature limit, and shri_rem the remaining share of Ti in TSk (Eq. 6.11). The steady-state temperature of a task represents the core temperature reached when the task runs continuously on a core.
Motivational Example
The TARTS Algorithm
Function TARTS()
At the beginning of the kth time slice TSk, TARTS determines the length |TSk| of the upcoming time slice. Then the task list TL is sorted in non-increasing order of task share values.
Function Task Schedule()
Function Find Mapping()
However, M[j] can be delayed only if the remaining share shrrem of M[j] is less than the remaining time in the current time slice. Reserve slot allocation: If no task in the ready queue meets the above condition, core Vj is left idle (thus allowing it to cool down) for the current time slot.
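The reserve-slot decision can be sketched as follows. The field names and the exact condition are our own illustration of the idea, not the TARTS pseudocode:

```python
def pick_task_or_cool(ready, remaining_time, temp_limit):
    """Pick a ready task whose steady-state temperature respects the
    system-level limit and whose remaining share fits the remaining
    time; otherwise reserve the slot so the core idles and cools."""
    for task in ready:
        if task["steady_temp"] <= temp_limit and task["rem_share"] <= remaining_time:
            return task["id"]
    return None  # reserve slot: the core stays idle this slot

ready = [{"id": "T1", "steady_temp": 82.0, "rem_share": 1.0},
         {"id": "T2", "steady_temp": 68.0, "rem_share": 0.5}]
print(pick_task_or_cool(ready, remaining_time=0.8, temp_limit=75.0))  # T2
```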
An Illustrative Example
With dynamic selection of frame sizes, TARTS is able to achieve performance almost equivalent to TARTS with a fixed frame size (g = 1), as shown in the experimental results.
Analysis of the Algorithm
In the TARTS function, the initial calculation of the time slice size (using Equation 6.6) and of the shares of all tasks (using Equation 6.7) for the time slice takes O(n) time. Therefore, the time complexity of the TARTS() function can be considered as O(m) per time slice.
Experimental Set Up and Results
Experimental Set Up
For a given utilization factor (U), task weights wti are generated from a normal distribution with mean µwt = 0.3 and standard deviation σwt = 0.2. To create task sets with a specific ATemp, the randomly generated steady-state temperature values are appropriately scaled.
Experimental Results
We can observe from Figure 6.3 that the ARat values decrease for both algorithms as the system utilization factor increases. We can observe from Figure 6.7 that increasing the ATemp values of the task sets results in decreasing the ARat values for both algorithms.
Summary
Over the years, the industry has seen a significant shift in the nature of processing platforms used in real-time embedded systems. This thesis therefore delved into the design of various energy- and temperature-aware scheduling strategies for such real-time multicore platforms.
Future Works
In addition, the inter-processor message transfer time can vary significantly depending on the type and structure of the interconnection network between processing elements. To meet the requirements of the aforementioned distributed systems, the research presented here needs to be adapted accordingly.
Pictorial representation of the scheduling framework
Temporal Characteristics of real-time task Ti
Motivational Example
For the first subset, the ratio of the summation of the task weights to the system utilization is considered. We can observe from Figure 4.5a that the acceptance ratio for both algorithms decreases with an increase in the utilization factor of the system.
Effect of varying utilization factor on power consumption (n =
Effect of varying number of cores on power consumption (n =
Effect of Skewness on Normalized Power Consumption
Result Comparison: EAFBFS vs EA-DPFair
Task Allocation for Example
Deadline Partitioning & Cluster Formation for Example
Task Schedule for Example
COST: Experimental Results
Example
Experimental Results
An example to illustrate our proposed algorithm HEALERS