## TABLE OF CONTENTS

1. Acknowledgement
2. Objective
3. Introduction
4. Tool Used
5. Adder structure
6. Project Description
7. Simulation results
8. Conclusion
9. Reference
10. Appendix

## ACKNOWLEDGEMENT

I wish to place on record my deep sense of gratitude to my instructor, Dr. James Leffew, for his constant guidance throughout the course.

## OBJECTIVE

The purpose of this project is the design and simulation of a sixteen-bit conditional-sum adder built from four-bit slice components with look-ahead carry. The look-ahead carry is incorporated in the 4-bit slices.

## INTRODUCTION

High-performance microprocessors demand fast arithmetic operations. The addition of two operands is the most frequent operation in almost any microprocessor arithmetic unit. A two-operand adder is used not only for additions and subtractions but also within more complex operations such as multiplication, division, and other functions. Propagation delay has been one of the major problems facing engineers working to implement high-speed circuits: because so many operations depend on addition, a high propagation delay in binary addition is amplified at the output of the circuit.

There are many ways of formulating the process of binary addition. Each formulation provides different insight and thus suggests a different implementation. Examples are Weinberger and Smith's carry-look-ahead (CLA) adder, Nadler's pyramid adder, Sklansky's conditional-sum adder, Bedrij's carry-select adder, and Ladner and Fischer's prefix adder.

The following table compares the various binary addition techniques with the conventional ripple adder ${ }^{(1)}$.

Table 1
Comparisons of Different Adders

| Type of adder | MOS transistor count | Delay ($\Delta$ = delay through one logic level) | Transistor increase compared with ripple adder (%) | Speed increase compared with ripple adder (%) |
| :---: | :---: | :---: | :---: | :---: |
| Ripple-Adders | 1792 | 128 | ---- | ---- |
| Look-Ahead (CLA) | 2816 | 38 (54) | 57 | 237 |
| Look-Ahead (BCLA) | 3230 | 16 (21) | 80 | 700 |
| Conditional Sum | 4938 | 17 (24) | 176 | 653 |
| Carry-Select | 3856 | 24 | 115 | 433 |
| Carry-Skip | 2407 | 28 | 34 | 357 |

Note: a number in parentheses is the more realistic estimate, while the number outside the parentheses is the optimistic (lower-bound) estimate.
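The two percentage columns of Table 1 appear to follow directly from the transistor counts and the optimistic delay figures. The short Python sketch below recomputes them under that assumption; the function names are illustrative and not part of the project.

```python
# Values copied from Table 1: (transistor count, optimistic delay in units of delta).
ADDERS = {
    "Ripple": (1792, 128),
    "Look-Ahead (CLA)": (2816, 38),
    "Look-Ahead (BCLA)": (3230, 16),
    "Conditional Sum": (4938, 17),
    "Carry-Select": (3856, 24),
    "Carry-Skip": (2407, 28),
}


def percent_increase(larger, smaller):
    """Percentage by which `larger` exceeds `smaller`, rounded to an integer."""
    return round(100 * (larger - smaller) / smaller)


def compare_with_ripple(name):
    """Return (transistor increase %, speed increase %) relative to the ripple adder."""
    base_transistors, base_delay = ADDERS["Ripple"]
    transistors, delay = ADDERS[name]
    transistor_pct = percent_increase(transistors, base_transistors)
    # Speed is the reciprocal of delay, so the speed-up ratio is base_delay / delay.
    speed_pct = percent_increase(base_delay, delay)
    return transistor_pct, speed_pct
```

For example, `compare_with_ripple("Conditional Sum")` reproduces the 176 % transistor increase and 653 % speed increase listed in the table.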

## TOOL USED

MAX+PLUS II (version 10.1) from Altera was used in this project. MAX+PLUS II software is a fully integrated, architecture-independent package for designing logic with Altera programmable logic devices. MAX+PLUS II offers a full spectrum of logic design capabilities: three design entry methods for hierarchical designs; floorplan editing; powerful logic synthesis; design partitioning; functional, timing, and board-level-type linked simulation; detailed timing analysis; automatic error location; and device programming and verification.

## ADDER STRUCTURE

The carry-look-ahead (CLA) technique is used to speed up carry propagation in an adder. The carries entering all the bit positions of a "parallel" adder are generated simultaneously by additional logic circuitry. Ideally, this results in a constant addition delay independent of the length of the adder.


Figure: Conventional ripple-carry adder

For the conventional ripple-carry adder shown above, the sum of the most significant stage will be valid after $2(N-1)+1$ gate delays, where $N$ is the number of bits. The carry-out bit will be valid after $2N$ gate delays. This delay may be in addition to any delays associated with interconnections. Note that if the circuit is implemented in an FPGA, the delays may differ from the above expressions, depending on how the logic is placed in the look-up tables and how it is divided among different CLBs (Configurable Logic Blocks). For instance, for a 32-bit adder the delay would be about 63 ns, assuming a gate delay of 1 ns. That would imply that the maximum frequency at which this adder can operate is only about 16 MHz. For fast applications, a better design is required.
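The delay figures quoted above can be checked with a few lines of Python. This is only a sketch of the two delay expressions from the text; the 1 ns gate delay is the assumption stated there, and the function names are illustrative.

```python
def ripple_sum_delay(n_bits):
    """Gate delays until the most significant sum bit is valid: 2(N-1)+1."""
    return 2 * (n_bits - 1) + 1


def ripple_carry_out_delay(n_bits):
    """Gate delays until the carry-out bit is valid: 2N."""
    return 2 * n_bits


def max_frequency_mhz(n_bits, gate_delay_ns=1.0):
    """Upper bound on the operating frequency, set by the sum delay."""
    delay_ns = ripple_sum_delay(n_bits) * gate_delay_ns
    return 1000.0 / delay_ns  # a period of 1 ns corresponds to 1000 MHz
```

For a 32-bit adder, `ripple_sum_delay(32)` gives 63 gate delays and `max_frequency_mhz(32)` gives roughly 15.9 MHz, which the text rounds to 16 MHz.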

The carry-look-ahead adder solves this problem by calculating the carry signals in advance, based on the input signals. It is based on the fact that a carry signal will be generated in two cases: (1) when both bits $A_{i}$ and $B_{i}$ are 1, or (2) when one of the two bits is 1 and the carry-in (carry of the previous stage) is 1. Thus, one can write,

$$
\begin{equation*}
C_{\text{out}} = C_{i+1} = A_{i} \cdot B_{i} + \left(A_{i} \text{ XOR } B_{i}\right) \cdot C_{i} \tag{1}
\end{equation*}
$$

which can also be written as

$$
\begin{equation*}
C_{i+1}=G_{i}+P_{i} \cdot C_{i} \tag{2}
\end{equation*}
$$

in which

$$
\begin{align*}
G_{i} & = A_{i} \cdot B_{i} \tag{3}\\
P_{i} & = A_{i} \text{ XOR } B_{i} \tag{4}
\end{align*}
$$

are called the Generate ($G_{i}$) and Propagate ($P_{i}$) terms.

Notice that both the Propagate and Generate terms depend only on the input bits and thus will be valid after one gate delay. Applying equation 2 to a 4-bit adder and substituting each carry into the next gives

$$
\begin{align*}
C_{1} & = G_{0} + P_{0} \cdot C_{0} \tag{5}\\
C_{2} & = G_{1} + P_{1} \cdot C_{1} = G_{1} + P_{1} \cdot G_{0} + P_{1} \cdot P_{0} \cdot C_{0} \tag{6}\\
C_{3} & = G_{2} + P_{2} \cdot G_{1} + P_{2} \cdot P_{1} \cdot G_{0} + P_{2} \cdot P_{1} \cdot P_{0} \cdot C_{0} \tag{7}\\
C_{4} & = G_{3} + P_{3} \cdot G_{2} + P_{3} \cdot P_{2} \cdot G_{1} + P_{3} \cdot P_{2} \cdot P_{1} \cdot G_{0} + P_{3} \cdot P_{2} \cdot P_{1} \cdot P_{0} \cdot C_{0} \tag{8}
\end{align*}
$$

Notice that the carry-out bit, $\mathrm{C}_{\mathrm{i}+1}$, of the last stage will be available after three delays (one delay to calculate the Propagate signal and two delays as a result of the AND and OR gate). The Sum signal can be calculated as follows,

$$
\begin{equation*}
S_{i} = A_{i} \text{ XOR } B_{i} \text{ XOR } C_{i} = P_{i} \text{ XOR } C_{i} \tag{9}
\end{equation*}
$$

The Sum bit will thus be available after one additional gate delay. The advantage is that these delays are the same regardless of the number of bits to be added, in contrast to the ripple-carry adder.

The carry-look-ahead adder can be broken up into two modules: (1) the Partial Full Adder, PFA, which generates $S_{i}$, $P_{i}$ and $G_{i}$ as defined by equations 3, 4 and 9 above; and (2) the Carry Look-ahead Logic, which generates the carry-out bits according to equations 5 to 8. The 4-bit adder can then be built by using four PFAs and the Carry Look-ahead Logic block, as shown in the figure below.
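The two-module decomposition can be illustrated with a small bit-level model. The Python sketch below is not the project's VHDL; it only mirrors equations 3 to 9, with hypothetical function names, and treats every signal as the integer 0 or 1.

```python
def partial_full_adder(a, b, c):
    """PFA for one bit position: returns (sum, propagate, generate)."""
    p = a ^ b   # equation 4
    g = a & b   # equation 3
    s = p ^ c   # equation 9
    return s, p, g


def cla_logic(p, g, c0):
    """Carry look-ahead logic for 4 bits (equations 5 to 8).

    p and g are lists indexed by bit position, LSB first.
    Returns the carries entering bits 0..3 and the carry-out C4.
    """
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c0, c1, c2, c3], c4


def cla_add4(a, b, c0=0):
    """Add two 4-bit integers with look-ahead carries; returns (sum, carry-out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    p = [x ^ y for x, y in zip(a_bits, b_bits)]
    g = [x & y for x, y in zip(a_bits, b_bits)]
    carries, c4 = cla_logic(p, g, c0)
    total = 0
    for i in range(4):
        s, _, _ = partial_full_adder(a_bits[i], b_bits[i], carries[i])
        total |= s << i
    return total, c4
```

Every carry in `cla_logic` is written directly in terms of the P, G and C0 signals, so no carry waits for the one before it; this is the property that the ripple-carry adder lacks.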


Figure: 4-bit carry-look-ahead adder

A 16-bit CLA adder could be constructed by continuing the same logic pattern, with the MSB carry-out resulting from the OR of 16 AND gates. In terms of logic levels, this would make the 16-bit CLA adder as fast as the 1-bit ripple-carry adder.

However, another plausible method to create a 16-bit CLA adder would be to ripple the carry-out of one 4-bit CLA adder to the carry-in of the next, using four 4-bit modules in total. This would make the 16-bit CLA adder as fast as the 4-bit ripple-carry adder. Each 4-bit slice will have a partial full adder or conditional-sum adder, a 2-to-1 multiplexer and look-ahead carry logic.

An algorithm for fast addition, conditional-sum addition (CSA), was presented by J. Sklansky in 1960. Using the CSA algorithm, it is possible to design an adder with up to five or six times the performance of the ripple adder, at the cost of a larger area. It has been shown that the conditional-sum adder has a better power-delay product than other adders for high-speed applications ${ }^{22}$. The conditional-sum addition rules overcome the carry-propagation problem: the adder generates distant carries and uses these carries to select the true sum outputs from two simultaneously generated provisional sums, one for each carry-input condition. The following table shows an 8-bit addition together with the carries generated between sections. Simultaneous additions are performed on all sections independently. The addition process of an $n$-bit adder is completed in $t$ steps, where

$$
\begin{equation*}
t=\log _{2} n \tag{10}
\end{equation*}
$$

and $n$ is the number of input bits.

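Table: Ripple-carry addition, conditional-sum addition and conditional-carry addition of two 8-bit operands. Rows $s^{0}$ and $c^{0}$ give the provisional sum and carry of each bit for a carry-in of 0; rows $s^{1}$ and $c^{1}$ give them for a carry-in of 1.

| Bit | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| A | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
| B | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 |
| $s^{0}$ | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 |
| $c^{0}$ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| $s^{1}$ | 0 | 1 | 0 | 0 | 1 | 0 | 1 |  |
| $c^{1}$ | 1 | 0 | 1 | 1 | 1 | 1 | 0 |  |
| Sum | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 |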
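The provisional rows of the first step depend only on the two operand bits in each column, so they can be computed for all columns at once. The Python sketch below does this for the operands 10011001 and 00101100, which are consistent with the sum 11000101 shown in the table; the helper names are illustrative, and the sketch computes all eight bit positions.

```python
def to_bits(value, width):
    """Binary string of `value`, most significant bit first."""
    return format(value, "0{}b".format(width))


def provisional_rows(a, b, width=8):
    """First-step provisional sums and carries for every bit position."""
    mask = (1 << width) - 1
    return {
        "s0": to_bits(a ^ b, width),            # sum bit if the carry-in is 0
        "c0": to_bits(a & b, width),            # carry-out if the carry-in is 0
        "s1": to_bits(~(a ^ b) & mask, width),  # sum bit if the carry-in is 1
        "c1": to_bits(a | b, width),            # carry-out if the carry-in is 1
    }


rows = provisional_rows(0b10011001, 0b00101100)
```

The later steps only select among these provisional values; no further arithmetic on the operands is needed.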

An improvement on conditional-sum addition is conditional-carry addition, which is also shown in the table. It likewise has no carry-propagation problem. The generated distant carries are used to select the true carry inputs from two simultaneously generated provisional carries, one for each carry-input condition. The arrows show the actual carries generated between sections. The simultaneous carry generations are performed on all sections independently. The conditional-carry addition of two 8-bit operands is completed in 3 steps. An extra XOR of the carry and $s^{0}$ is required to generate the final sum outputs.

Suppose we have an $n$-bit adder that generates two sums: one sum assumes a carry-in condition of '0', the other sum assumes a carry-in condition of '1'. We can split this $n$-bit adder into an $i$-bit adder for the $i$ LSBs and an ($n-i$)-bit adder for the $n-i$ MSBs. Both of the smaller adders generate two conditional sums as well as true and complement carry signals. The two (true and complement) carry signals from the LSB adder are used to select between the two ($n-i+1$)-bit conditional sums from the MSB adder using $2(n-i+1)$ two-input MUXes. This is a conditional-sum adder (also often abbreviated to CSA) [Sklansky, 1960]. We can apply this technique recursively. For example, we can split a 16-bit adder using $n=16$ and $i=8$; then we can split one or both 8-bit adders again, and so on.
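The recursive splitting described above can be sketched in Python. This is a behavioral illustration only: the split point $i = n/2$ and the function names are choices made for this sketch, and the selection that hardware performs with 2:1 MUXes is modeled by indexing a pair with the low half's carry.

```python
def conditional_sums(a, b, n):
    """Return ((sum0, carry0), (sum1, carry1)) for n-bit operands a and b.

    Index 0 holds the result assuming carry-in 0, index 1 assuming carry-in 1.
    """
    if n == 1:
        # Single-bit conditional cell: both outcomes are computed side by side.
        return ((a ^ b, a & b), ((a ^ b) ^ 1, a | b))
    i = n // 2                      # number of LSBs handled by the low adder
    low_mask = (1 << i) - 1
    lo = conditional_sums(a & low_mask, b & low_mask, i)
    hi = conditional_sums(a >> i, b >> i, n - i)
    results = []
    for cin in (0, 1):
        lo_sum, lo_carry = lo[cin]
        # The carry out of the low half acts as the MUX select for the high half.
        hi_sum, hi_carry = hi[lo_carry]
        results.append(((hi_sum << i) | lo_sum, hi_carry))
    return tuple(results)


def conditional_sum_add(a, b, cin, n):
    """Pick the true (sum, carry-out) once the external carry-in is known."""
    return conditional_sums(a, b, n)[cin]
```

For instance, `conditional_sum_add(0x99, 0x2C, 0, 8)` returns `(0xC5, 0)`, the 8-bit sum 11000101 with no carry-out.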

The figure below shows the simplest form of an $n$-bit conditional-sum adder that uses $n$ single-bit conditional adders, $H$ (each with four outputs: two conditional sums, true carry, and complement carry), together with a tree of 2:1 MUXes (Qi_j). The conditional-sum adder is usually the fastest of all the adders we have discussed (it is the fastest when logic cell delay increases with the number of inputs, which is true for all ASICs except FPGAs).


FIGURE: The conditional-sum adder. (a) A 1-bit conditional adder that calculates the sum and carry out assuming the carry in is either '1' or '0'. (b) The multiplexer that selects between sums and carries. (c) A 4-bit conditional-sum adder with carry input, $\mathrm{C}[0]$.

## PROJECT DESCRIPTION

The project objective is met by dividing the 16-bit addition into 4-bit slices. Each slice contains a conditional sum generator, a look-ahead carry generator and a multiplexer. All the components are stored in a user library. The main program is able to call each component and use its functionality.

The descriptions of each component are given below.

## Components Used

## 4-bit Conditional Sum Generator (CSG):

The 4-bit inputs A and B are used in this component to produce the sum for a carry-in of 0 ($\mathrm{S}^{0}$) and the sum for a carry-in of 1 ($\mathrm{S}^{1}$). It also produces the corresponding carry-outs $\mathrm{C}^{0}$ and $\mathrm{C}^{1}$. This module makes use of the propagate (P), generate (G) and transfer (T) functions, which are given by

$$
\begin{align*}
P_{i} & = A_{i} \text{ XOR } B_{i} \\
G_{i} & = A_{i} \text{ AND } B_{i} \\
T_{i} & = A_{i} \text{ OR } B_{i}
\end{align*}
$$

The VHDL code for this component uses a behavioral model.
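A possible bit-level reading of the CSG is sketched below in Python. The project's VHDL is not reproduced here, so the way P, G and T are combined is an assumption: each sum bit is taken as $P_i$ XOR the incoming carry, and each carry as $G_i$ OR ($T_i$ AND the incoming carry), evaluated once with an initial carry of 0 and once with an initial carry of 1. The name `csg4` is hypothetical.

```python
def csg4(a, b):
    """Conditional sum generator for 4-bit a and b.

    Returns (s0, c0, s1, c1): the sum and carry-out assuming carry-in 0,
    followed by the sum and carry-out assuming carry-in 1.
    """
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]   # propagate
    g = [((a >> i) & (b >> i)) & 1 for i in range(4)]   # generate
    t = [((a >> i) | (b >> i)) & 1 for i in range(4)]   # transfer

    def chain(carry):
        # Written as a loop for clarity; in hardware the carries would be
        # flattened into look-ahead expressions rather than rippled.
        total = 0
        for i in range(4):
            total |= (p[i] ^ carry) << i
            carry = g[i] | (t[i] & carry)
        return total, carry

    s0, c0 = chain(0)
    s1, c1 = chain(1)
    return s0, c0, s1, c1
```

Both results are available before the real carry-in is known, which is what lets the slice defer its choice to the multiplexer.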

## Look-ahead carry (LAC) generator:

It produces the carry-out using $\mathrm{C}^{0}$ and $\mathrm{C}^{1}$ from the conditional sum generator and the carry-in.

The VHDL code for the LAC uses a behavioral model in which the carry-out is defined by $\mathrm{Cout} = \mathrm{C}^{0}$ OR $(\mathrm{C}^{1}$ AND $\mathrm{Cin})$.
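The carry-out expression can be checked against ordinary integer addition for every possible 4-bit slice input. The Python sketch below does so; the function names are illustrative, and $\mathrm{C}^{0}$ and $\mathrm{C}^{1}$ are obtained here by plain arithmetic rather than by the CSG component.

```python
def lac_carry_out(c0, c1, cin):
    """Look-ahead carry: Cout = C0 OR (C1 AND Cin)."""
    return c0 | (c1 & cin)


def check_lac_exhaustively():
    """Compare the formula with integer addition for all 4-bit slice inputs."""
    for a in range(16):
        for b in range(16):
            c0 = (a + b) >> 4        # slice carry-out if the carry-in is 0
            c1 = (a + b + 1) >> 4    # slice carry-out if the carry-in is 1
            for cin in (0, 1):
                if lac_carry_out(c0, c1, cin) != (a + b + cin) >> 4:
                    return False
    return True
```

The formula works because a slice that produces a carry with carry-in 0 also produces one with carry-in 1, so $\mathrm{C}^{0}$ alone settles the case where it is 1 and $\mathrm{C}^{1}$ matters only when the carry-in is 1.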

## 4-bit 2:1 Multiplexer:

It outputs either $\mathrm{S}^{0}$ or $\mathrm{S}^{1}$, depending on the carry signal from the look-ahead carry unit; the carry acts as the select signal for the multiplexer.
The VHDL code for the 4-bit 2:1 multiplexer uses a behavioral model with a wait statement. It also incorporates a timing model.
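The selection itself is simple enough to show at gate level. The Python sketch below models only the logical behavior of a 4-bit 2:1 multiplexer; the wait statement and timing model of the VHDL version are not represented, and the name `mux4_2to1` is hypothetical.

```python
def mux4_2to1(s0, s1, select):
    """Return the 4-bit value s1 when select is 1, otherwise s0."""
    mask = 0xF if select else 0x0          # replicate the select bit over 4 bits
    # AND-OR form: each output bit is (s0 AND NOT sel) OR (s1 AND sel).
    return (s0 & ~mask & 0xF) | (s1 & mask)
```

With `select` driven by the carry entering the slice, `mux4_2to1(s0, s1, select)` yields the true 4-bit sum of that slice.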

## Package

All the components mentioned above are included in a separate package called CSA4, which contains the port definitions of all the components. The VHDL code for all components and the package is stored in a separate folder, mylib, and this folder is defined as a user library when compiling the 16-bit adder with Altera MAX+PLUS II.

## 16 bit Adder using 4 bit CSA slices

The 16-bit adder is developed by slicing the 16 bits into four portions. The addition mechanism for each portion is then built from the library mylib, which contains the components CSG, LAC and 2:1 mux. The inputs are the 16-bit vectors A and B and a Cin bit; the outputs are a 16-bit Sum and a Cout bit. The three components CSG, LAC and 2:1 mux are instantiated in each bit slice, and the carry-out of each bit slice is passed as the carry-in to the next bit slice. The carry-out of the last bit slice is Cout. Thus the complete 16-bit addition is performed using the components in the package stored in the library.

The 16-bit adder using the components in the package CSA4 is shown below.
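The slice-by-slice composition described above can be modeled behaviorally in Python. This is a sketch rather than a translation of the project's VHDL: the function names are hypothetical, the CSG is modeled with plain integer arithmetic, and the carry entering each slice is assumed to be the multiplexer select.

```python
def csg4(a, b):
    """Conditional sum generator (behavioral stand-in): (s0, c0, s1, c1)."""
    r0, r1 = a + b, a + b + 1
    return r0 & 0xF, r0 >> 4, r1 & 0xF, r1 >> 4


def lac(c0, c1, cin):
    """Look-ahead carry: Cout = C0 OR (C1 AND Cin)."""
    return c0 | (c1 & cin)


def mux4(s0, s1, select):
    """4-bit 2:1 multiplexer."""
    return s1 if select else s0


def add16(a, b, cin):
    """16-bit addition from four 4-bit slices; returns (sum, cout)."""
    total, carry = 0, cin
    for k in range(4):                       # slice 0 holds the four LSBs
        a_k = (a >> (4 * k)) & 0xF
        b_k = (b >> (4 * k)) & 0xF
        s0, c0, s1, c1 = csg4(a_k, b_k)
        # The carry entering this slice selects between the provisional sums.
        total |= mux4(s0, s1, carry) << (4 * k)
        # The slice's carry-out becomes the next slice's carry-in.
        carry = lac(c0, c1, carry)
    return total, carry
```

Note that all four `csg4` results depend only on A and B, so in hardware they are produced in parallel; only the short carry chain through the four LAC units is sequential.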


Figure: 16-bit conditional-sum adder

## Test bench

Simulation can also be done by supplying the required data from a test bench. In simulation, the ports of the design under test are mapped to signals in the test bench. Here the test bench provides the inputs A, B and Cin. The outputs Sum and Cout are generated when the main module is instantiated from the test bench. The test bench uses configuration specifications to bind the component instances.

The test bench could be written so that it provides random input values for A and B. It can also be written to detect errors in the circuit and abort the simulation when incorrect results are obtained.
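The idea of a self-checking test bench with random stimulus can be sketched in Python as follows. This is an analogy to the VHDL test bench, not the project's code: `run_test_bench`, `reference_adder` and `faulty_adder` are hypothetical names, and any function taking (A, B, Cin) and returning (Sum, Cout) can stand in for the design under test.

```python
import random


def run_test_bench(adder, trials=1000, seed=0):
    """Drive `adder` with random 16-bit inputs and abort on the first mismatch."""
    rng = random.Random(seed)          # fixed seed keeps the run repeatable
    for _ in range(trials):
        a = rng.randrange(1 << 16)
        b = rng.randrange(1 << 16)
        cin = rng.randrange(2)
        total = a + b + cin
        expected = (total & 0xFFFF, total >> 16)
        actual = adder(a, b, cin)
        if actual != expected:
            raise AssertionError(
                "A=%04X B=%04X Cin=%d: got %r, expected %r"
                % (a, b, cin, actual, expected))
    return trials


def reference_adder(a, b, cin):
    """Known-good model used to exercise the bench."""
    total = a + b + cin
    return total & 0xFFFF, total >> 16


def faulty_adder(a, b, cin):
    """Deliberately broken model that ignores the carry-in."""
    total = a + b
    return total & 0xFFFF, total >> 16
```

Running `run_test_bench(reference_adder)` completes all trials, whereas `run_test_bench(faulty_adder)` raises an `AssertionError` as soon as a case with Cin = 1 is drawn, much as a VHDL assertion with severity failure would stop the simulator.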

## ANALYSIS OF SIMULATION RESULTS:

The following table shows the inputs applied to the 16-bit adder described above and the corresponding results.

|  | A (input) | B (input) | Cin (input) | Sum (output) | Cout (output) |
| :---: | :--- | :--- | :---: | :--- | :---: |
| Hex | FFFF | 0001 | 1 | 0001 | 1 |
| Binary | 1111111111111111 | 0000000000000001 | 1 | 0000000000000001 | 1 |

With Cin = 1, FFFF + 0001 + 1 = 10001 (hex), so the 16-bit sum is 0001 and the carry-out is 1.

The waveforms produced by the simulation are given in the appendix.

## CONCLUSION

The 16-bit addition is performed using four 4-bit slices, each of which uses a conditional-sum adder and look-ahead carry logic.

This yields a faster adder than the conventional ripple-carry adder because the carry-propagation delay is reduced.

## Reference

1. Weihua Chen, "Implementation and Comparison of 16-bit Look Ahead Adders", project report for EE8053 Computer Arithmetic Algorithm.
2. Jungang Han and Glen Stone, "Implementation and Verification of Conditional Sum Adder", Department of Computer Science Reports, 1988-311-23, July 1, 1988.
3. Anantha Chandrakasan and Robert W. Brodersen, "Minimizing Power Consumption in Digital CMOS Circuits", Proceedings of the IEEE, Vol. 83, No. 4, pp. 498-523, April 1995.
4. M. Horauer and D. Loy, "Adder Synthesis", Proceedings of Austrochip '95, Graz, Austria, pp. 81-87, 1995.

