Chapter 5
Fault and Energy Aware Scheduling on Real-time Heterogeneous Dual-cores
In the earlier chapters, we considered energy-aware and fault-tolerant design strategies for real-time safety-critical systems separately, and assumed the underlying hardware computing platform to be homogeneous. However, the nature of processing platforms used in embedded systems has been changing over the years. To satisfy the computational demands of various applications, there is now an increased emphasis on integrating unrelated processing cores (i.e., heterogeneity) onto a single hardware platform [15, 35]. For example, ARM has developed a heterogeneous processing architecture called ARM big.LITTLE, which has been deployed in mobile devices such as the Samsung Galaxy Note 4 and S10. The big.LITTLE platform contains two types of cores: high-performance cores, called big cores, and lower-performance, power-efficient cores, referred to as LITTLE cores. Due to the differences in the internal microarchitectures of big and LITTLE cores, the same application/task may exhibit different timing as well as power characteristics on the different cores [100, 101].
Therefore, devising combined energy-aware and fault-tolerant design strategies for such heterogeneous platforms is a challenging and computationally demanding problem.
In this chapter, we present a standby-sparing based energy-aware fault-tolerant design strategy for heterogeneous systems. The chapter first describes the system model under consideration. Then, we present our proposed energy-aware fault-tolerant scheduling strategy to effectively handle transient processor faults. Important experimental results which highlight the performance of the proposed methodology under various scenarios are discussed next. Finally, the chapter concludes by presenting a case study using MiBench benchmarks to illustrate the applicability of the proposed strategy in real-world scenarios.
5.1 System Model and Problem Formulation
In this section, we describe the platform and application model, the power model, and the fault model under consideration, and then state the scheduling problem.
5.1.1 Platform and Application Model
We consider a real-time system consisting of a set of $n$ independent periodic tasks, $T = \{T_1, T_2, \ldots, T_n\}$, to be executed on a heterogeneous dual-core processing platform. The platform consists of a power-hungry, high-performance (big) core and a power-efficient, relatively slow (little) core. In this work, we assume that tasks are executed in a frame-based manner [100, 117]. That is, all tasks in the system share the same period, which is equal to the common deadline $D$. The worst-case number of cycles required by a task $T_i$ on a given core is denoted by $C_i$. Accordingly, $T_i$ may take up to $e_i = C_i/f$ units of execution time to complete on that core when executed at frequency level $f$. Due to the asymmetric nature of the cores, the same task may require a different number of cycles and a different execution time on each core. Therefore, each task $T_i \in T$ is characterized by a two-tuple $(e_i^{HP}, e_i^{LP})$, where $e_i^{HP}$ and $e_i^{LP}$ represent the worst-case execution times of $T_i$ on the high-performance (denoted by HP) and low-power (denoted by LP) cores, respectively. It is assumed that $e_i^{LP}$ and $e_i^{HP}$ correspond to the execution times at the maximum processing frequencies of the LP and HP cores (denoted by $f_{max}^{LP}$ and $f_{max}^{HP}$), respectively.
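As a minimal sketch of the timing model above, the following Python snippet computes $e_i = C_i/f$ and the two-tuple $(e_i^{HP}, e_i^{LP})$ for a single task. The class and function names, the cycle counts, and the core frequencies are hypothetical values chosen for illustration; they are not taken from this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A frame-based task; all numbers used below are hypothetical."""
    name: str
    cycles_hp: float  # worst-case cycles C_i on the HP core
    cycles_lp: float  # worst-case cycles C_i on the LP core


def wcet(cycles: float, freq_hz: float) -> float:
    """Worst-case execution time e_i = C_i / f, in seconds."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return cycles / freq_hz


def wcet_pair(task: Task, f_max_hp: float, f_max_lp: float) -> tuple[float, float]:
    """Return (e_i^HP, e_i^LP) at the maximum frequency of each core."""
    return wcet(task.cycles_hp, f_max_hp), wcet(task.cycles_lp, f_max_lp)


# Hypothetical example: a 2 GHz HP core and a 1 GHz LP core.
t1 = Task("T1", cycles_hp=4e8, cycles_lp=6e8)
e_hp, e_lp = wcet_pair(t1, f_max_hp=2e9, f_max_lp=1e9)
```

Keeping separate cycle counts per core reflects the assumption stated above that the same task may need a different number of cycles on each core.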
5.1.2 Power Model
Due to the asymmetric nature of the cores, the HP and LP cores have different power consumption characteristics. The dynamic power consumption of a task $T_i$ on any processing core is modeled as $P_i(f) = a_i f^3 + \alpha_i$, where $a_i$ indicates the switching capacitance, $f$ denotes the processing frequency of the task, and $\alpha_i$ is the frequency-independent power consumption [100]. Therefore, the same task may exhibit different power characteristics on different cores of a heterogeneous system.
Figure 5.1: A standby-sparing system (primary copies pr1, pr2 and backup copies bk1, bk2 placed on Core1 and Core2 along a time axis ending at D)
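The power model $P_i(f) = a_i f^3 + \alpha_i$ can be illustrated with a short Python sketch. It also derives a worst-case energy figure as power multiplied by the execution time $e_i = C_i/f$, which is an assumption made here for illustration rather than a formula stated in this section. The function names and all coefficient values are hypothetical.

```python
def task_power(a: float, alpha: float, freq: float) -> float:
    """Power model P_i(f) = a_i * f**3 + alpha_i."""
    return a * freq ** 3 + alpha


def task_energy(a: float, alpha: float, cycles: float, freq: float) -> float:
    """Assumed worst-case energy: P_i(f) * e_i, with e_i = C_i / f."""
    if freq <= 0:
        raise ValueError("frequency must be positive")
    return task_power(a, alpha, freq) * (cycles / freq)


# Hypothetical per-core parameters for the same task
# (frequency in GHz, cycles in giga-cycles, power in watts).
energy_hp = task_energy(a=1.0, alpha=0.5, cycles=0.4, freq=2.0)
energy_lp = task_energy(a=0.25, alpha=0.125, cycles=0.6, freq=1.0)
```

With these made-up numbers the task takes longer on the LP core yet consumes less energy there, which is the kind of trade-off a heterogeneous platform exposes.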
Each processing core executes tasks in its high-power state and dissipates power as determined by the processing frequency and the characteristics of the executing task. In this work, we employ the Dynamic Power Management (DPM) technique on both cores to minimize energy consumption. When a core becomes idle (that is, it is not executing any task), DPM switches the core to its low-power state. A transition between the high-power and low-power states consumes a specific amount of energy and time. The minimum idle time required to compensate for the cost of entering the low-power state is defined as the break-even time $TI_{be}$ [66] of a processing core. In this work, we assume that the cost of entering a low-power state is negligible, and so $TI_{be}$ is assumed to be zero. Let $P_{idle}^{LP}$ and $P_{idle}^{HP}$ denote the power consumption of the LP and HP cores in their low-power states, respectively.
The overall energy consumption within a frame is determined by aggregating the energy consumption of all cores in that frame.
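The frame-level energy accounting described above can be sketched as follows. The sketch assumes that, because the break-even time is taken to be zero, a core spends all of its idle time in the low-power state and pays no transition cost. The function names, the representation of executed work as (power, duration) pairs, and the numeric values are hypothetical.

```python
def core_frame_energy(active, p_idle, deadline):
    """Energy of one core in one frame of length `deadline`.

    active -- list of (power, duration) pairs for the work actually executed
    p_idle -- power drawn in the low-power state
    """
    busy = sum(duration for _, duration in active)
    if busy > deadline:
        raise ValueError("executed work does not fit in the frame")
    active_energy = sum(power * duration for power, duration in active)
    # Zero break-even time: every idle instant is spent in the low-power state.
    idle_energy = p_idle * (deadline - busy)
    return active_energy + idle_energy


def frame_energy(lp_active, hp_active, p_idle_lp, p_idle_hp, deadline):
    """Overall frame energy: the sum over both cores."""
    return (core_frame_energy(lp_active, p_idle_lp, deadline)
            + core_frame_energy(hp_active, p_idle_hp, deadline))


# Hypothetical frame of length 10 time units.
total = frame_energy(
    lp_active=[(1.0, 2.0), (1.5, 2.0)],
    hp_active=[(4.0, 1.0)],
    p_idle_lp=0.25,
    p_idle_hp=0.5,
    deadline=10.0,
)
```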
5.1.3 Fault Model
In this work, we employ a standby-sparing technique in which one processing core is designated as the primary and the other as the spare. Figure 5.1 depicts a standby-sparing system. Each task $T_i$ has two versions, namely, a primary copy (denoted by $pr_i$) and a backup copy (denoted by $bk_i$). $bk_i$ has exactly the same timing parameters as $pr_i$. Following the energy-awareness motivation in the literature [100], we assign the primary copies of tasks to the low-power LP core (the primary core) and their backup copies to the high-performance HP core (the spare core). Whenever a primary copy completes, fault detection mechanisms such as acceptance or sanity tests [62] are conducted to detect a transient fault. If no fault is detected (that is, the primary copy completes successfully), the corresponding backup copy on the spare core is dynamically deallocated from the schedule. We assume that each task (primary copy) encounters at most one transient fault and that the system is able to handle at most $k$ transient faults per frame.
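The fault model above can be sketched in a few lines of Python. The sketch only decides, for one frame, which backup copies must run on the spare core and which are deallocated; it does not model when copies are scheduled, which is determined by the scheduling strategy itself. The function name and the task labels are hypothetical.

```python
def resolve_backups(tasks, faulty, k):
    """Decide which backup copies run on the spare core in one frame.

    tasks  -- task names in schedule order
    faulty -- names of tasks whose primary copy failed its acceptance test
    k      -- maximum number of transient faults tolerated per frame
    """
    # A set counts each task once, matching "at most one fault per task".
    faulty = set(faulty)
    unknown = faulty - set(tasks)
    if unknown:
        raise ValueError(f"unknown tasks: {sorted(unknown)}")
    if len(faulty) > k:
        raise ValueError("more faults than the assumed bound k")
    run = [t for t in tasks if t in faulty]            # backups that execute
    cancelled = [t for t in tasks if t not in faulty]  # backups deallocated
    return run, cancelled


# Hypothetical frame: three tasks, the primary copy of T2 fails, k = 1.
run, cancelled = resolve_backups(["T1", "T2", "T3"], ["T2"], k=1)
```

In this example only the backup of T2 executes on the HP spare core, while the backups of T1 and T3 are removed from the schedule.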
Problem Formulation: Given a set of real-time tasks to be executed on a heterogeneous dual-core system and a number of transient faults to be tolerated, develop an efficient scheduling strategy which
• satisfies execution and deadline constraints of all tasks,
• tolerates a specified number of faults, and
• minimizes overall energy consumption of the system.