Summary - Enhancement of SBST Techniques for Detection of Processor Faults

Chapter 6 Application of Fragments of SBST Programs for Online Testing

In extreme online operating conditions of the processor, intermittent faults, which are temporary in nature, may momentarily appear and disappear. These intermittent faults could damage the gate-level logic of the processor and may eventually turn into permanent faults. To achieve high self-test quality, i.e., high fault coverage, the self-test codes must detect these intermittent faults too. If the fault detection latency of the self-test code, i.e., the time gap between fault occurrence and its detection, is large, intermittent faults are not detected efficiently and, as a result, the fault coverage is lower.

Along with the real-time, periodic mission tasks, a self-test task is also executed at regular intervals for the online testing of the processor. If the execution interval between these self-test tasks is large, the fault detection latency is also longer.
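As a rough illustration of why the test period bounds the detection latency, the sketch below computes the latency under a simplified model that is an assumption of this example, not the thesis's definition: self-test runs start at integer multiples of a fixed period, each run takes a fixed execution time, and a fault is detected at the end of the first run that starts at or after the fault appears. The function name, the periods, and the fault time are hypothetical; the execution times are picked from the ranges reported later in this chapter.

```python
import math

def detection_latency(fault_time, test_period, test_exec_time):
    """Latency between a fault appearing and the end of the first
    self-test run that starts at or after the fault (assumed model)."""
    # Index of the first test run that starts at or after the fault.
    k = math.ceil(fault_time / test_period)
    detection_time = k * test_period + test_exec_time
    return detection_time - fault_time

# A fault that appears shortly after a run has started waits almost a
# full period, so a shorter period directly shortens the latency.
long_period = detection_latency(fault_time=0.001, test_period=10.0, test_exec_time=0.285)
short_period = detection_latency(fault_time=0.001, test_period=1.0, test_exec_time=0.014)
print(long_period, short_period)
```

All times are in milliseconds. In this model the worst-case latency is close to one test period plus the execution time of the test.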

Generally, large self-test codes are executed on the processor to test its complicated functionalities. But the execution intervals between such large test codes are large, resulting in undesirably high fault detection latency. In other words, even if the fault coverage of large self-test codes is high, the large intervals between their executions mean that many intermittent faults incur a large detection latency (e.g., faults that occur just after the test code completes execution).

If smaller test codes are applied, the individual fault coverage of these fragments is lower. But as these code fragments are executed with smaller test periods, they effectively validate the processor for intermittent faults with lower fault detection latency.

Also, multiple efficient small test code fragments, whose overall coverage is nearly the same as that of the larger test code, can replace the larger test code for the online testing of the processor. In the previous chapter, we reduced the test program with the help of compaction techniques. However, that amount of compaction is not enough to generate test programs that can effectively test the processor for intermittent faults.

Figure 6.1: Avg. Execution Time for Different Groups of Fragments on 100 MHz MIPS Processor Model (axes: Fault Coverage (%), 40 to 90; Avg. Test Execution Time (ms), 0 to 0.3)

The average execution times of synthesized test codes with equal fault coverage during evolutionary test synthesis are shown in Fig. 6.1. This behavior can be observed for every set of test codes developed during the evolutionary test synthesis.

These test codes, which are developed during the evolutionary test synthesis, are stored exclusively for our online testing technique. The execution times of test codes with coverage between 35% and 80% are in the range (0.01, 0.014) ms, whereas the execution times of test codes with coverage between 81% and 95% are in the range (0.016, 0.285) ms. The execution time increases gradually from the test codes with 35% coverage to the test codes with 80% coverage, but it increases sharply from the test codes with 80% coverage to the test codes with 95% coverage.

The test codes with coverage between 75% and 85% have considerably shorter execution times, and thus lower fault detection latency, than the test codes with coverage above 85%. These test codes also maintain adequate test quality (75% to 85%). So, a reliability analysis of the test codes in this range must be conducted to identify the optimal test codes that provide the best tradeoff between fault detection latency and coverage.
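The first step of such an analysis, narrowing the stored test codes down to the 75% to 85% coverage band, can be sketched as follows. This is a minimal illustration: the function name and the sample list are hypothetical, and the (coverage, execution time) pairs are made-up values chosen only to be consistent with the ranges reported above, not measurements from the thesis.

```python
def candidates_in_band(test_codes, min_cov=75.0, max_cov=85.0):
    """Return the (coverage %, exec time ms) pairs whose coverage lies
    inside [min_cov, max_cov], ordered from fastest to slowest."""
    in_band = [tc for tc in test_codes if min_cov <= tc[0] <= max_cov]
    return sorted(in_band, key=lambda tc: tc[1])

# Hypothetical sample data: (fault coverage in %, execution time in ms).
stored_test_codes = [
    (35.0, 0.010),
    (60.0, 0.012),
    (75.0, 0.0135),
    (80.0, 0.014),
    (85.0, 0.050),
    (90.0, 0.150),
    (95.0, 0.285),
]

for coverage, exec_time in candidates_in_band(stored_test_codes):
    print(f"coverage {coverage:.0f}%  exec time {exec_time} ms")
```

The reliability analysis itself would then rank these candidates; the filter only restricts the search to the band identified from Fig. 6.1.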

In our approach, efficient, reliable fragments of a self-test code are discovered by selecting test code fragments that maintain the online test quality (fault coverage) and minimize the fault detection latency. To identify the optimal fragments, we evaluate the reliability of self-test fragments of different sizes. However, these fragments can be executed as self-test sub-tasks only if they satisfy certain scheduling criteria: the modified response time of each mission task and the modified overall utilization following the inclusion of these self-test sub-tasks must not exceed their corresponding limits. If these conditions are satisfied, the self-test fragments are executed as sub-tasks, scheduled in appropriate execution windows between the executions of the mission tasks.
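The two scheduling criteria can be illustrated with a short sketch. It uses the standard response-time analysis for fixed-priority preemptive scheduling with rate-monotonic priorities and deadlines equal to periods; this is a textbook technique swapped in as an assumption, since the exact schedulability model is defined only in the next section. The function names, the task set, and the utilization limit of 1.0 are hypothetical. Times are integers in microseconds to keep the fixed-point iteration exact.

```python
import math

def response_time(index, tasks):
    """Worst-case response time of tasks[index]; tasks are
    (exec_time, period) pairs ordered from highest to lowest priority.
    Returns None if the response time exceeds the period, which is
    assumed to be the deadline."""
    exec_time, period = tasks[index]
    higher = tasks[:index]
    r = exec_time
    while True:
        # Preemption by every higher-priority task released within r.
        interference = sum(math.ceil(r / t) * c for c, t in higher)
        r_next = exec_time + interference
        if r_next > period:
            return None
        if r_next == r:
            return r
        r = r_next

def fragment_is_schedulable(mission_tasks, fragment, util_limit=1.0):
    """Check both criteria after adding the self-test fragment."""
    # Rate-monotonic order: a shorter period means a higher priority.
    tasks = sorted(mission_tasks + [fragment], key=lambda ct: ct[1])
    utilization = sum(c / t for c, t in tasks)
    if utilization > util_limit:
        return False
    return all(response_time(i, tasks) is not None for i in range(len(tasks)))

# Hypothetical mission tasks: (execution time, period) in microseconds.
mission = [(1000, 5000), (2000, 10000)]
print(fragment_is_schedulable(mission, (14, 1000)))   # small fragment, 1 ms period
print(fragment_is_schedulable(mission, (900, 1000)))  # fragment that overloads the CPU
```

In this example the 14 µs fragment leaves every mission task within its deadline, while the 900 µs fragment pushes the total utilization above the limit and is rejected.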

Also, the overall coverage of these fragments must be nearly equivalent to the coverage of the unfragmented full test program.

Since intermittent faults occur irregularly at the same location, the self-test codes must be executed regularly with a short test period to trace them efficiently. D. Gizopoulos [98] suggested that the self-test quality could be improved when the self-test tasks are executed with larger execution time and enhanced self-test utilization. But the techniques proposed in [98] increase the self-test period to achieve maximum utilization. If the self-test period is increased, the fault detection latency also increases, and subsequently, some of the instantaneous intermittent faults may be left undetected.

If an intermittent fault occurs just after a large test period, the fault detection latency will be higher, which may cause system errors. The tradeoff between test utilization and fault detection latency in [98] can be dealt with only if efficient, small chunks of SBST codes are executed frequently between the mission tasks, i.e., smaller, coverage-efficient test programs are intermittently executed during a self-test period to reduce the fault detection latency.

In our approach, shorter, reliable SBST test code fragments are discovered and executed intermittently in a self-test period to detect the intermittent faults immediately, with minimal fault detection latency. But smaller test codes might have lower fault coverage, which could leave some of the intermittent faults undetected. So, the test fragment synthesis must consider both fault detection latency and test quality (fault coverage) in developing reliable fragments. These minimal code fragments are applied in appropriate execution windows between the executions of the mission tasks.

We summarize the problem as follows: smaller SBST self-test codes with smaller fault detection latency realize rapid detection of and recovery from intermittent faults. But these minimal test programs could have lower reliability due to low fault coverage. Larger test programs would detect most of the faults, but their reliability is lower due to high fault detection latency. To deal with this trade-off, an optimal and reliable set of fragments must be discovered with significant self-test quality (coverage) and minimal fault detection latency.
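The trade-off can be made concrete with a toy model, which is an assumption of this sketch and not the reliability model used in the thesis. An intermittent fault is taken to be active for a fixed window; the test runs periodically with a uniformly random phase relative to the fault; and the fault is caught only if one complete test run falls inside the window, with probability equal to the fault coverage. The function name, the periods, and the fault window are hypothetical; the coverage and execution-time values come from the ranges reported earlier in this chapter.

```python
def detection_probability(coverage, exec_time, period, fault_window):
    """Toy estimate: chance that one complete test run falls inside the
    window in which an intermittent fault is active, times coverage.
    Coverage is applied once, even if several runs fit in the window."""
    if fault_window <= exec_time:
        return 0.0  # no run can complete while the fault is active
    # With a uniformly random phase, a run starts early enough in the
    # window with probability (fault_window - exec_time) / period.
    p_run_in_window = min(1.0, (fault_window - exec_time) / period)
    return coverage * p_run_in_window

# All times in milliseconds; the fault is active for 0.5 ms.
large = detection_probability(coverage=0.95, exec_time=0.285, period=10.0, fault_window=0.5)
small = detection_probability(coverage=0.80, exec_time=0.014, period=1.0, fault_window=0.5)
print(f"large test: {large:.3f}, small fragment: {small:.3f}")
```

Under these assumed numbers the small fragment detects the short-lived fault far more often than the large test, despite its lower coverage, because it runs ten times as frequently.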

So, in this chapter, a highly reliable online fault detection model that tests a low-cost, real-time embedded processor for intermittent faults is demonstrated. The contributions of this fragmented testing approach are:

• The instruction sequences of a larger SBST code are prudently replaced with smaller, coverage-efficient, online Fragments of Test Programs (FTPs), which are executed intermittently during a test period to detect the intermittent faults with minimal fault detection latency and good test quality (fault coverage). To this end, we evaluate the reliability of the system with respect to different fragment sizes. From the maximum permissible fragment size, the minimum permissible test execution window size is also assessed.

• We demonstrate the reliability-based FTP selection using a set of 12 mission task workloads on a 100 MHz MIPS processor model.

• A fault-tolerant self-test schedule is proposed to deal with the challenges in the detection of and recovery from intermittent faults.

In the next section, the basic definitions of schedulability, system reliability, and recovery models are discussed for the demonstration of the proposed synthesis of online SBST program fragments.
