Finally, the effectiveness of the proposed approach in real-life applications is investigated via an image processing application. Note that the numbers indicated on the logic gates and on the input and output nodes of the logic gates have already been discussed in Figure 2.
Approximate Computing
Fault resilience refers to an application's ability to accept output despite some of its underlying computations being performed using approximate software and/or hardware. 1.3, the fault tolerance of such applications is attributed to several factors, including noisy and redundant input data, absence of a unique golden output, limited perception of human senses, algorithmic functions, etc.
Approximate Adders
1.6, an N-bit ESA has two primary design parameters: (i) Segment size (k), which represents the maximum length of carrier spread; and (ii) Overlapping bits (l), which represent the minimum number of bits used in carry prediction, where 1 ≤ k < N and 0 ≤ l < k. In addition to the primary design parameters andl, the anN-bit ESA also has three secondary design parameters: (i) Output bits per sub-adder (except the least significant sub-adder) contributing to the final sum(s); (ii) The size of the least significant subadder (h); and (iii) Total number of subadders (n), where r = k−l, h = N− (k −l) ⌈Nk−−lk⌉ and n = ⌈Nk−−kl⌉+ 1.
Quality Metrics
For an input vector (or set of inputs), ER is the probability that the output will be incorrect as:. iii) MED is a composite quality metric that captures ED and ER information. MSE is one such quality measure where ED is more important compared to ER.
Thesis Motivations
Therefore, there is a need for analytical modeling of approximate adders. ii) Among all existing conventional adders, RCA is the simplest. In most of the work, researchers/designers study the feasibility of approximate adders in multimedia.
Thesis Contributions
Contribution in AFAs
Because different applications have different levels of error resilience, the effectiveness of approximate adders varies from application to application. In addition to the delay and power benefits, approximate adders can also be used to address other technology/design related issues such as yield loss [1].
Contribution in ESAs
- Analytical Modeling
- Optimization
- Accuracy Enhancement
In addition to designing ApproxADDs, we also provide detailed analysis and analytical modeling to evaluate the accuracy, latency, power and area of ApproxADDs. Therefore, for a given accuracy, the a priori selection of optimal configurations that provide minimum delay, power, and/or area is a challenging decision for designers.
Application of Approximate Adders
- Cryptography Applications
- Yield Enhancement
- MAC Unit
We propose analytical models to estimate the functional yield and parametric yield of approximate arithmetic circuits. To examine the effectiveness of approximate adders for designing other approximate arithmetic operations, we design an approximate multiplier using approximate adders.
Thesis Organization
We evaluate the performance of the proposed ApproxADDs by comparing them with existing AFA-based approximate adders. In this chapter, we first discuss the design philosophy and design approach of approximate adders.
Design Approach
State-of-the-Art AFAs
Gate Level AFAs
- Simplified Full Adder [1]
- Lower-part-OR Adder [2]
Assuming that the inputs are independent and uniformly distributed, Table 2.2 also summarizes the ER of the simplified functions. Assuming that inputs are independent and uniformly distributed, Table 2.3 also summarizes the ER of SFA and LOA.
Transistor Level AFAs
- Approximate Mirror Adders [3]
- Approximate XOR/XNOR-based Adders [4]
As shown in Table 2.4, AMA#2 requires 16 transistors and has 2 errors in S, while Couti is correct for all input combinations. As shown in Table 2.5, AXA#2 has 4 errors in S while Cout is correct for all input combinations.
Error-Tolerant Adder 1 [5]
On the other hand, for an approximate sub-adder, a special mechanism is used where no carry is generated and propagated at any bit position. This special mechanism is described as follows. i) Check each input bit (Ai and Bi) from left to right. ii).
State-of-the-Art ESAs
Accuracy configurable adders: Depending on the requirements of the applications, ESAs are designed for a targeted accuracy. 2.8, Error Detection Logic (EDL) detects errors and Error Correction Logic (ECL) corrects them based on the requirements of the applications.
Research Directions
Therefore, the next research direction is to investigate the applicability/feasibility of approximate adders for state-of-the-art applications to maximize their benefits. f). Consequently, the other research direction is to use approximate adders to efficiently design other approximate arithmetic operations.
Summary
Delay Estimation
The effort delay further consists of two components: (i) Logical effort (g), which captures the effect of logical complexity; and (ii) Electrical input (h), which captures the effect of external load driven by the logic gate. 3.3), and the electrical effort is given by the ratio of the load capacitance (CL) driven by the logic gate and the input capacitance of the logic gate (Cingate) as:.
Power Estimation
Since dynamic power consumption is a strong function of the logic gate area (a), Eq. 3.9) is only valid for template ports. When determining the normalized dynamic energy consumption using Eqn. 3.11), it is necessary to calculate the switching activity factor at each node of the circuit.
Area Estimation
Assuming that the inputs are uniformly distributed (ie, the P1 of the input nodes is 0.5), we first calculate the P1 of the nodes using Table 3.1.
Proposed AFAs
Delay, Power and Area Estimation
We first calculate the design parameters such as α,g, h, pd, pp, a, dandpnm inv for all the logic gates used in conventional FA and the proposed AFAs. Using Table 3.4, we then calculate the delay (D), normalized dynamic power consumption (Pnm inv) and area (A) of conventional FA and the proposed AFAs by adding the delay (d), normalized dynamic power consumption (pnm inv) and area (a) of individual logic gates in that circuit.
ApproxADD
- Delay, Power and Area Estimation
- ED and ER Estimation
- Simulation Setup
- Results and Discussion
Hence, ED of an ApproxADD can be given by the weighted sum of input combinations Ai.Bi.Ai−1 and Ai.Bi.Ai−1as:. The analytical results and simulation results of ApproxADD as a function of N for normalized delay, power, area and PDAP are shown in Figs.
Improving ED and ER
ApproxADDv 1
- Delay, Power and Area Estimation
- ED and ER Estimation
- Results and Discussion
Here, it should be noted that the MRED of ApproxADDv1 decreases withk, while its normalized delay, power, area, and PDAP increase with mek. It means that ApproxADDv1 offers improvement in MRED at the cost of delay, power, area and PDAP.
ApproxADDv 2
- Delay, Power and Area Estimation
- ED and ER Estimation
- Results and Discussion
The proposed EDC logic can be adapted to handle such cases, but it imposes hardware complexity. However, the proposed EDC logic complements output at (i+ 2)thbit position, regardless of whether the output at (i+2)thbit position is correct.
Performance Analysis
Effect of Adder Architectures
Therefore, the adder architecture used for higher order k-bits shows a significant difference in delay, power and area. Furthermore, the rate at which the normalized delay, power, and area change also depends on the higher-order bit adder architecture.
Comparison
Table 3.7 shows that the proposed approach yields the greatest improvement in delay for RCA architecture and in power, area and PDAP for KS architecture.
Multimedia Applications
It can be seen from Table 3.9 that for k ≤ 8, values of IQA metrics are not acceptable (especially the value of SSIM). Our simulation results of IQA metrics after performing DCT and IDCT using 24-bit ApproxADDv2 for the above images are tabulated in Table 3.10.
Summary
Accuracy Estimation
- ER Estimation
- ED Estimation
- MED and MSE Estimation
Furthermore, accuracies of the proposed and existing analytical models of ER for different values of N are tabulated in table 4.3. Furthermore, the analytical models of ER discussed in are for specific configurations, while the proposed analytical model is generalized.
Delay, Power and Area Estimation
- Estimation Method
- Delay Estimation
- Area Estimation
- Power Estimation
- Model Validation
Similarly, we derive analytical models to estimate the delay of an ESA implemented using CIA and KSA architectures. Similarly, we derive analytical models to estimate the surface area of an ESA implemented using the CIA and KSA architectures.
Important Observations
We then calculate the accuracy of the proposed analytical models of deceleration, force and area according to Eqn. Table 4.8 shows that the analytical results of the delay, power and area agree well with the simulation results.
Optimization
Objective Functions and Constraints
Based on the combinations of objective functions (delay, force and/or area) and constraints (ED, ER, MED and/or MSE), various cases of optimization are now possible. However, the proposed approach can also be adapted/extended for the other optimization cases.
Optimization Framework
Results and Discussion
Here, the left y-axis shows the minimum delay, power, and area, and the right y-axis shows the corresponding optimal configurations. Therefore, the minimum delay search results tabulated in Table 4.11 are moderately valid for ESA implemented using SA and KA architectures, and the minimum area search results shown in Table 4.10 are moderately valid for ESA implemented using BKA architecture. a) ESA designed using ASK architecture provides minimum delay; (b) ESA designed using CIA architecture provides minimum power; and (c) ESA designed using the RCA architecture provides minimal footprint.
Accuracy Enhancement
- Proposed Modifications
- Results and Discussion
- Delay and Power Improvements
- Multimedia Applications
Accordingly, compared to the original ESAs: (i) modified ESAs provide a higher PR for a given delay and power; or (ii) For a given PR, custom ESAs provide lower latency and power. 4.16(h) shows the absolute difference images generated using original 8-bit ESAs and modified ESAs as a function of kforl =k/2.
Summary
- Conventional SHA-1
- Approximate SHA-1
- Maximum Approximation
- Results and Discussion
Therefore, compared to the original ESAs: (i) For a given accuracy, the modified ESAs offer lower latency, power, and area; or (ii) For a given delay, power, and area, modified ESAs provide higher accuracy. To evaluate the effectiveness of the proposed approach in real-life applications, we processed the Lena image using the original ESA and the modified ESA.
Yield Enhancement
Yield Models
Considering f(D) as a normalized distribution function of chips in defect densities, Murphy [120] predicts the functional yield as:. Further, since approximate adders are designed for a targeted application-specified threshold, γ = 1. Consequently, the functional yield in the approximate computation paradigm can be given by: . e−nAD+γnADe−nAD)f(D)dD (5.5).
Results and Discussion
Our results of the effective yield improvement for the proposed 16-bit ApproxADDv1 and ApproxADDv2 (discussed in Chapter 3) for different values of ork are shown in Fig. 5.9 shown, the functional and parametric yield improvements due to decrease in silicon area and critical path delay are extracted from Fig.
MAC Unit
- Hybrid Redundant Adder
- Proposed Approach
- ApproxMAC Unit
- Results and Discussion
5.11, the ApproxMAC unit utilizes a dedicated approximate hybrid redundant matrix multiplier to perform the multiplication operation. In addition, the total capacity, i.e. PDAP of ApproxMAC units, much better than binary and hybrid redundant MAC units.
Summary
We evaluated the effectiveness of the proposed approach by comparing ApproxADDv1 and ApproxADDv2 with existing AFA-based approximate adders. We evaluated the effectiveness of the proposed approach in real applications by processing Lena image using original ESAs and modified ESAs.
Future Aspects
Lombardi, “Design of Majority Logic (ML) Based Approximate Full Adders,” in IEEE International Symposium on Circuits and Systems (ISCAS), May 2018, p. ” in IEEE International Symposium on Circuits and Systems (ISCAS), May 2018, p.
Image processing results: (a) Original Lena image; (b) Noisy Lena image; and abso-
Block diagram of processing block of SHA-1
Block diagram of elementary block of processing block
Testing approximate SHA-1 and finding N app max
Threshold testing that identifies acceptable (but faulty) chips [16]
Application-specified threshold: (a) Fail small or fail rare; and (b) Fail small and fail
Defect size distribution [17]
Distribution of CPDs under process parameter variations
Yield improvements for a 16-bit RCA: (a) Functional yield improvement due to de-
Yield improvements for 16-bit ApproxADDv1 and ApproxADDv2
Block diagram of the proposed hybrid redundant ApproxMAC unit
Primary and secondary design parameters of an N -bit ESA
ED max and ER of a 32-bit ESA as a function of k and l
Probability (in %) of the length of carry propagation to be less than (or at least equal
Examples of gate level design simplifications
Truth table and design metrics of gate level AFAs
Truth table and design metrics of transistor level AFAs proposed in [3]
Truth table and design metrics of transistor level AFAs proposed in [4]
Estimated D, P nm inv , A and P DAP of conventional FA (Fig. 3.2) and the proposed
Design parameters of the logic gates used in FA (Fig. 3.2) and the proposed AFAs
Probability of carry-lifetimes in N -bit conventional adder for different values of N
Design parameters of the logic gates used in EDC logic (Fig. 3.8)