Delay Redundant Multi Operand Adder Using Cyclic Combinational Method


International Journal of Recent Advances in Engineering & Technology (IJRAET)


ISSN (Online): 2347 - 2812, Volume-2, Issue -11,12 2014 70


¹Balusupati Durga Prasad, ²Katuru Anjaneyulu

¹PG Student (M.Tech), Dept. of ECE, KKR & KSR Institute of Technology & Sciences, Guntur

²Assistant Professor, Dept. of ECE, KKR & KSR Institute of Technology & Sciences, Guntur

Abstract—In this paper, we propose the design of a delay redundant multioperand adder using compressor trees built on fast carry resources. Our approaches strongly reduce delay and generally present no area overhead compared to a CPA tree. Moreover, they can be defined at a high level based on an array of standard CPAs. The adder is further extended by replacing the adder with a cyclic compressor. Furthermore, due to its simple structure, it is easy to design a schematic code, which allows synthesizing a compressor tree for any number of operands of any bit width. Compared to previous approaches, our design presents better performance, is easier to implement, and offers direct portability.

Keywords—Adder, Carry Propagation, Carry Propagate Adders (CPA), Multioperand Addition, Carry Save Adders.

I. INTRODUCTION

Adders are used in a wide range of applications. It is generally recognized that most of the time required by an adder is due to carry propagation, so reducing the propagation time is the focus of current techniques.

Different binary adder schemes have their own characteristics, such as area and energy dissipation. No single adder scheme is best under every condition, so it is important to choose a scheme that fits the requirements and constraints of the specific context.

Computation operations such as fast parallel multiplication using adder trees are present in many parts of a digital system or digital computer, especially in signal processing, high-speed circuits, graphics, and scientific computation. Examples are graphics processors, digital signal processors, and communication or code-compression circuits. Speeding up addition is therefore very important for computation.

There are many tree structures, such as the Wallace adder tree [1], the CSA tree, and the overturned-stairs tree [2]; other kinds of adder trees are mentioned in [3]-[7]. Here the Wallace tree is used as the tree structure because it is suitable for implementation.

One of these resources is the carry-chain system, which is used to improve the implementation of carry propagate adders (CPAs). It mainly consists of additional specialized logic to deal with the carry signals, and specific fast routing lines between consecutive logic elements (LEs). This resource is present in most current FPGA devices, from low-cost ones to high-end families, and it accelerates carry propagation by more than one order of magnitude compared to an implementation using general resources. Apart from the CPA implementation, many studies have demonstrated the importance of using this resource to achieve designs with better performance and/or lower area requirements, and even for implementing nonarithmetic circuits.

Multioperand addition appears in many algorithms, such as multiplication, filters, sum of absolute differences (SAD), and others. To achieve efficient implementations of this operation, redundant adders are extensively used. Redundant representation reduces the addition time by limiting the length of the carry-propagation chains. The most usual representations are carry-save (CS) and signed-digit (SD). A CS adder (CSA) adds three numbers using an array of full adders (FAs), but without propagating the carries. In this case, the FA is usually known as a 3:2 counter. The result is a CS number, which is composed of a sum-word and a carry-word. Therefore, the CS result is obtained without any carry propagation in the time taken by only one FA. The addition of two CS numbers requires an array of 4:2 compressors, each of which can be implemented by two 3:2 counters.

The conversion to nonredundant representation is achieved by adding the sum-word and the carry-word in a conventional CPA.
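The carry-save scheme described above can be illustrated with a minimal behavioral sketch. It is written in Python and is not the authors' implementation; the function names `carry_save_add` and `compress_4_to_2` are hypothetical, and unbounded non-negative integers are assumed so that no guard bits are needed.

```python
from typing import Tuple


def carry_save_add(a: int, b: int, c: int) -> Tuple[int, int]:
    """One row of 3:2 counters (full adders) working on whole words.

    Returns (sum_word, carry_word) with sum_word + carry_word == a + b + c.
    No carry travels between bit positions inside this function.
    """
    sum_word = a ^ b ^ c                              # FA sum output, bit by bit
    carry_word = ((a & b) | (a & c) | (b & c)) << 1   # FA carry output, weight 2
    return sum_word, carry_word


def compress_4_to_2(a: int, b: int, c: int, d: int) -> Tuple[int, int]:
    """4:2 compression modeled as two cascaded 3:2 counter rows."""
    s1, c1 = carry_save_add(a, b, c)
    return carry_save_add(s1, c1, d)


# Three operands reduced to a CS number, then converted by an ordinary
# addition, which plays the role of the final CPA.
s, c = carry_save_add(13, 7, 9)
total = s + c
```

Only the last line involves carry propagation; everything before it is computed bit position by bit position.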

II. CARRY SAVE COMPRESSION TREES

Let us consider a generic compressor tree of N_op input operands, each N bits wide. We also assume the same bit width for input and output operands. Thus, input operands should have previously been zero or sign extended to guarantee that no overflow occurs. A detailed analysis of the number of leading guard bits required for multioperand CS addition is provided.

Regular CS Compressor Tree Design

The classic design of a multioperand CS compressor tree attempts to reduce the number of levels in its structure.

The 3:2 counter and the 4:2 compressor are the most widely known building blocks to implement it. We select a 4:2 compressor as the basic building block. The implementation of a generic CS compressor tree requires ⌈N_op/2⌉ − 1 4:2 compressors (because each one eliminates two signals), whereas a carry-propagate tree uses N_op − 1 CPAs (since each one eliminates one signal). If we bear in mind that a 4:2 compressor uses practically double the amount of resources of a CPA, both trees basically require the same area.
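As a rough check of this area argument, the two element counts can be evaluated for a given number of operands. The sketch below is illustrative only: the function names are hypothetical, the bracket in the count of 4:2 compressors is read as a ceiling, and the area unit is assumed to be one CPA, with a 4:2 compressor costing two units as stated above.

```python
def compressors_4_2_needed(n_op: int) -> int:
    """4:2 compressors in a generic CS tree: ceil(n_op / 2) - 1."""
    return (n_op + 1) // 2 - 1


def cpas_needed(n_op: int) -> int:
    """CPAs in a carry-propagate tree: n_op - 1."""
    return n_op - 1


def relative_area(n_op: int) -> tuple:
    """(CS tree area, CPA tree area) in units of one CPA.

    Assumes one 4:2 compressor costs about two CPAs.
    """
    return 2 * compressors_4_2_needed(n_op), cpas_needed(n_op)


area_cs, area_cpa = relative_area(9)
```

For nine operands, both trees come out at eight CPA-equivalents under these assumptions.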

Figure 1 N-bit width CS 9:2 compressor tree based on a linear array

III. MULTI OPERAND ADDER

In the previous approach, specialized carry resources are only used in the design of a single 4:2 compressor; these resources have not been considered in the design of the whole compressor tree structure. To optimize the use of the carry resources, we propose a compressor tree structure similar to the classic linear array of CSAs [24].

However, in our case, given the two output words of each adder (sum-word and carry-word), only the carry-word is connected from each CSA to the next, whereas the sum-words are connected to lower levels of the array.

Fig. 1 shows an example for a 9:2 compressor tree designed using the proposed linear structure, where all lines are N-bit-wide buses, and carry signals are correctly shifted. For the CSA, we have to distinguish between the regular inputs (A and B) and the carry input (Ci in the figure), whereas the dashed line between the carry input and output represents the fast carry resources. With the exception of the first CSA, where Ci is used to introduce an input operand, on each CSA Ci is connected to the carry output (Co) of the previous CSA, as shown in Fig. 1. Thus, the whole carry-chain is preserved from the input to the output of the compressor tree (from I0 to Cf).

First, the two regular inputs on each CSA are used to add all the input operands (Ii). When all the input operands have been introduced in the array, the partial sum-words (Si) previously generated are then added in order (i.e., the first generated partial sums are added first), as shown in Fig. 1. In this way, we maximize the overlap between propagation through regular signals and carry-chains.
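The wiring order just described can be modeled behaviorally as follows. This is a sketch under stated assumptions rather than the authors' design: the names `csa` and `linear_compressor_tree` are hypothetical, operands are non-negative Python integers of unbounded width (so overflow and guard bits are ignored), and the fast carry-chain is represented simply by passing the carry-word from one CSA to the next.

```python
from collections import deque


def csa(a, b, ci):
    """Carry-save adder: regular inputs a and b, carry input ci."""
    s = a ^ b ^ ci
    co = ((a & b) | (a & ci) | (b & ci)) << 1   # carry-word, shifted one position
    return s, co


def linear_compressor_tree(operands):
    """Reduce the operands to (sum_word, carry_word) with a linear CSA array.

    The first operand enters through Ci of the first CSA. Each carry-word
    feeds Ci of the next CSA. Sum-words are queued behind the remaining
    operands, so the earliest partial sums are added first.
    """
    if len(operands) < 3:
        raise ValueError("at least three operands are needed")
    carry = operands[0]
    pending = deque(operands[1:])
    csa_count = 0
    while len(pending) >= 2:
        a = pending.popleft()
        b = pending.popleft()
        s, carry = csa(a, b, carry)
        pending.append(s)
        csa_count += 1
    return pending[0], carry, csa_count


sf, cf, used = linear_compressor_tree([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

For the nine operands of the 9:2 example, `sf + cf` equals the plain sum of the operands and `used` is the number of CSAs instantiated by the model.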

Regarding the area, the implementation of a generic compressor tree based on N-bit-wide CSAs requires N_op − 2 of these elements (because each CSA eliminates one input signal) [24]. Therefore, considering that a CSA could be implemented using the same number of resources as a binary CPA (as shown below), the proposed linear array, the 4:2 compressor tree, and the binary CPA tree have approximately the same hardware cost.

In relation to the delay analysis, from a classic point of view our compressor tree has N_op − 2 levels. This is many more levels than a classic Wallace tree structure has and, thus, implies a longer critical path. We temporarily assume that there is no delay for the carry-chain path. Under this assumption, the carry signal connections could be eliminated from the critical path analysis and our linear array could be represented as a hypothetical tree, as shown in Fig. 2, where the carry-chain is represented in gray.
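The effect of this assumption can be explored with a small timing model. The sketch below is an illustration, not a result taken from the paper: the function name `regular_path_levels` is hypothetical, every CSA is assumed to cost one level on its regular inputs, the carry-chain is assumed to cost nothing, and operands and partial sums are assumed to be consumed in first-in, first-out order as in the linear array.

```python
from collections import deque


def regular_path_levels(n_op):
    """Depth of the hypothetical tree seen by the regular (non-carry) signals.

    Each entry of the queue is the level at which a word becomes available.
    Input operands are available at level 0. The first operand is excluded
    because it enters through the carry input, which is assumed delay-free.
    """
    if n_op < 3:
        raise ValueError("at least three operands are needed")
    pending = deque([0] * (n_op - 1))
    while len(pending) >= 2:
        a = pending.popleft()
        b = pending.popleft()
        pending.append(max(a, b) + 1)   # sum-word ready one level later
    return pending[0]


classic_levels = 9 - 2                  # N_op - 2 levels from the classic view
model_levels = regular_path_levels(9)
```

Under these assumptions, the 9:2 array shows three levels on the regular path, against seven levels when every CSA is counted in series.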

Fig.2. Time model of the proposed CS 9:2 compressor tree

IV. DELAY REDUNDANT CYCLIC COMPRESSOR ADDER

A full adder cannot be implemented as a cyclic combinational circuit, so there is an effort to implement the same function in a special type of adder, generally known as a compressor, that has three outputs, as shown by the block diagram of Fig. 3 for a 4:3 compressor.

Fig 3: block diagram of 4:3 compressor

Compressor Architecture

A single-bit full adder can be considered as a compressor/counter. From the truth table of Fig-4 we can see that, for any input combination, the number of logic 1s present in the input sequence equals the decimal equivalent of the output sequence, where Cout is the MSB and S is the LSB.
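The counting property of the full adder can be checked exhaustively. The following sketch is a plain behavioral model with hypothetical names (`full_adder`, `truth_table`); it enumerates all eight input combinations and records the output pair with Cout as the MSB and S as the LSB.

```python
from itertools import product


def full_adder(a, b, cin):
    """Single-bit full adder. Returns (cout, s)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return cout, s


# Each row: (a, b, cin, cout, s)
truth_table = [
    (a, b, cin) + full_adder(a, b, cin)
    for a, b, cin in product((0, 1), repeat=3)
]

# The two-bit output, read as a binary number, counts the ones at the input.
counts_match = all(
    2 * cout + s == a + b + cin
    for a, b, cin, cout, s in truth_table
)
```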


Figure 1 Reduction by rows

The basic concept is to reduce the number of bits in each column at each level. To do so, the full adder and the half adder are used as a (3:2) counter and a (2:2) counter, respectively.
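A Wallace-style column reduction built from these two counters can be sketched as follows. It is an illustrative model, not the circuit used in this work: all function names are hypothetical, operands are assumed to be non-negative, and every column is processed greedily in groups of three bits, with a half adder applied to a leftover pair.

```python
def to_columns(operands, width):
    """columns[i] holds the bits of weight 2**i, one per operand."""
    return [[(x >> i) & 1 for x in operands] for i in range(width)]


def reduce_level(columns):
    """Apply one level of (3:2) and (2:2) counters to every column."""
    out = [[] for _ in range(len(columns) + 1)]
    for i, bits in enumerate(columns):
        k = 0
        while len(bits) - k >= 3:            # full adder as (3:2) counter
            a, b, c = bits[k:k + 3]
            out[i].append(a ^ b ^ c)
            out[i + 1].append((a & b) | (a & c) | (b & c))
            k += 3
        if len(bits) - k == 2:               # half adder as (2:2) counter
            a, b = bits[k:k + 2]
            out[i].append(a ^ b)
            out[i + 1].append(a & b)
        elif len(bits) - k == 1:             # single bit passes through
            out[i].append(bits[k])
    while out and not out[-1]:               # drop an unused top column
        out.pop()
    return out


def reduce_to_two_rows(columns):
    """Repeat the reduction until no column holds more than two bits."""
    levels = 0
    while any(len(col) > 2 for col in columns):
        columns = reduce_level(columns)
        levels += 1
    return columns, levels


def value(columns):
    """Numeric value represented by the bit matrix."""
    return sum(bit << i for i, col in enumerate(columns) for bit in col)


operands = [5, 3, 7, 1, 6, 2, 4, 7, 0]
final_columns, levels = reduce_to_two_rows(to_columns(operands, 3))
```

The value of the bit matrix is preserved at every level, and the two remaining rows would then be added by a conventional CPA.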

Fig. 2 FA and HA as (3:2) counter and (2:2) counter

In Fig. 5, the three nodes inside the pane represent the FA’s three inputs, and the two nodes outside represent the FA’s carry out and sum. The half adder has two inputs, one sum output, and one carry out. An example used in this work is presented here.

Fig. 3 cyclic compressor counter

We can use a cyclic compressor, as shown in Fig. 6, to achieve fast addition. The basic module in the compressor is the counting architecture: the number of ones among the selected inputs is counted.
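The counting behavior can be stated compactly for the 4:3 case. The sketch below models only the input/output function of such a counter and makes no attempt to reproduce the cyclic combinational structure itself; the name `compressor_4_3` and the output ordering (MSB first) are assumptions made for illustration.

```python
from itertools import product


def compressor_4_3(x0, x1, x2, x3):
    """Behavioral 4:3 counter: three output bits encode the number of ones."""
    ones = x0 + x1 + x2 + x3                  # between 0 and 4, fits in 3 bits
    return (ones >> 2) & 1, (ones >> 1) & 1, ones & 1


# Check every one of the 16 input combinations.
all_counts_correct = all(
    4 * o2 + 2 * o1 + o0 == sum(bits)
    for bits in product((0, 1), repeat=4)
    for o2, o1, o0 in [compressor_4_3(*bits)]
)
```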

V. RESULTS & CONCLUSIONS

The following figures show the simulation results for the designed multioperand adders. Figures 8 and 9 give the simulation results for the given circuit.

Figure 4 Top Module Implemented in DSCH

Figure 5 Simulation result for multi operand adder

Figure 6 Simulation result for active state

The following figures show the simulation results for the cyclic compressor design for the multioperand adder. Figures 11 and 12 give the simulation results for the given circuit.


Figure 10 cyclic compressor design for multioperand

Figure 11 simulation of cyclic compressor

Figure 12 simulation result for cyclic compressor

Time (ps)    Multi Operand    Cyclic Compressor
0.3          0.017            0.089
0.6          0.034            0.209
0.9          0.051            0.379

Figure 13 timing analysis of the adders

Parameter    Multi Operand    Cyclic Compressor
Power        67.186 µW        54.696 µW
Delay        299 ps           183 ps

Figure 14 power and delay results
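The relative improvement implied by the power and delay figures can be worked out directly. The sketch below only restates the reported values (with 0.299 ns written as 299 ps); the helper name `percent_reduction` and the dictionary keys are hypothetical.

```python
def percent_reduction(baseline, improved):
    """Reduction of `improved` relative to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Reported values: power in microwatts, delay in picoseconds.
power_uw = {"multi_operand": 67.186, "cyclic_compressor": 54.696}
delay_ps = {"multi_operand": 299.0, "cyclic_compressor": 183.0}  # 0.299 ns = 299 ps

power_saving = percent_reduction(power_uw["multi_operand"],
                                 power_uw["cyclic_compressor"])
delay_saving = percent_reduction(delay_ps["multi_operand"],
                                 delay_ps["cyclic_compressor"])
```

With these numbers, the cyclic compressor design shows roughly 19% lower power and roughly 39% lower delay than the multioperand adder.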

CONCLUSION

The multioperand adder is compared with the proposed cyclic compressor adder; the synthesis results show that both delay and power are reduced. An 8-bit input along with a carry input is implemented in the DSCH2 software and synthesized in the Microwind software.

REFERENCES

[1] L. Dadda, "Some Schemes for Parallel Multipliers," Alta Frequenza, vol. 34, no. 5, pp. 349-356, 1965.

[2] S. Gao, D. Al-Khalili, and N. Chabini, "FPGA Realization of High Performance Large Size Computational Functions: Multipliers and Applications," Analog Integrated Circuits and Signal Processing, vol. 70, no. 2, pp. 165-179, Feb. 2011.

[3] K. Macpherson and R. Stewart, "Rapid Prototyping – Area Efficient FIR Filters for High Speed FPGA Implementation," IEEE Proc. Vision, Image and Signal Processing, vol. 153, no. 6, pp. 711-720, Dec. 2006.

[4] P. Meher, S. Chandrasekaran, and A. Amira, "FPGA Realization of FIR Filters by Efficient and Flexible Systolization using Distributed Arithmetic," IEEE Trans. Signal Processing, vol. 56, no. 7, pp. 3009-3017, July 2008.

[5] Z. Kincses, Z. Nagy, L. Orzo, P. Szolgay, and G. Mezo, "Implementation of a Parallel SAD Based Wavefront Sensor Architecture on FPGA," Proc. European Conf. Circuit Theory and Design (ECCTD ’09), pp. 823-826, Aug. 2009.

[6] F. Bensaali, A. Amira, and A. Bouridane, "Accelerating Matrix Product on Reconfigurable Hardware for Image Processing Applications," IEE Proc. Circuits, Devices and Systems, vol. 152, no. 3, pp. 236-246, June 2005.

[7] Y.F. Chan, M. Moallem, and W. Wang, "Design and Implementation of Modular FPGA-Based PID Controllers," IEEE Trans. Industrial Electronics, vol. 54, no. 4, pp. 1898-1906, Aug. 2007.

Authors Profile:

BALUSUPATI DURGA PRASAD is pursuing his Master's degree (M.Tech) in Very Large Scale Integration (VLSI) Systems at KKR & KSR Institute of Technology & Science. He completed his B.Tech in ECE at Chalapathi Institute of Engineering and Technology.

K. ANJANEYULU is working as Assistant Professor at KKR & KSR Institute of Technology & Science. He has completed his Master's degree and is pursuing a Ph.D. His research work is aimed at antenna theory.

