Design and Implementation of High Speed and Low Power Modified Square Root Carry Select Adder (MSQRTCSLA)

The Multiplication and Accumulation (MAC) unit is a key component of every Digital Signal Processor (DSP). A MAC unit performs both multiplication and accumulation, but its performance depends mostly on the dataflow structure of the accumulation unit. In this study, a Modified Square Root Carry Select Adder (MSQRTCSLA) is designed in a Very Large Scale Integration (VLSI) system design environment. In the proposed design, the Half Adder (HA) and Full Adder (FA) circuits are realized and their redundant logic functions are identified. On that basis, a new half adder named “Reduced Half Adder (RHA)” and a new full adder named “Reduced Full Adder (RFA)” are proposed in this study. The RHA and RFA designs are then integrated into the Binary to Excess-1 Converter (BEC) based SQRT CSLA architecture to improve the accumulation function of the MAC unit. The resulting BEC based SQRT CSLA architecture is named “Modified Square Root Carry Select Adder (MSQRTCSLA)”. Low power consumption, high speed and low area utilization are the key factors in VLSI system design. Therefore, minimizing the Area-Delay Product (ADP) of the MSQRTCSLA is the main goal of this study. The MSQRTCSLA based accumulation structure offers a 22.86% reduction in delay and an 8.87% reduction in power consumption compared with the conventional BEC based SQRT CSLA accumulation structure.


INTRODUCTION
In the VLSI system design environment, the main goals are to reduce chip size and power consumption and to increase speed. High speed VLSI systems are increasingly used in multimedia devices, multistandard systems, portable mobile devices, and signal and image processing. The Memory Core and the Processor Core are the key factors that make a VLSI system powerful. In Memory Core VLSI system design, Reduced Instruction Set Computer (RISC) processors based on low utilization and sharing are used to reduce the chip size (in terms of memory and Look-Up Tables (LUTs)) and the power consumption. Like the Memory Core, the Processor Core is also used to reduce the chip size (in terms of slices and registers). Unlike the Memory Core, the Processor Core uses RISC processors based on reduced logic. In every Processor Core, the Arithmetic and Logic Unit (ALU) and the Multiplication and Accumulation (MAC) unit perform most of the logic functions. Hence, the ALU and MAC units are called the heart of every processor. In both the ALU and the MAC, most of the logic operations rely on accumulation structures. An efficient accumulation structure is therefore an essential part of VLSI core design.
One of the basic VLSI based accumulation structures is the Ripple Carry Adder (RCA). It performs the accumulation function correctly, but every stage must wait for the carry generated by the previous stage. Hence, the RCA accumulation structure incurs a large Carry Propagation Delay (CPD) when performing binary addition. To reduce the CPD, a Carry Look-ahead Adder (CLA) is designed in Wang et al. (2002). The CLA effectively reduces the CPD, but it uses more hardware to do so. Reducing both hardware utilization and delay is essential in VLSI system core design. Many research works have suggested the Carry Select Adder (CSLA) to reduce both the hardware utilization and the delay of accumulation structures. For instance, an efficient Square Root Carry Select Adder (SQRT CSLA) is designed in Mohanty and Patel (2014).
In this research work, a Modified Square Root Carry Select Adder (MSQRTCSLA) circuit is designed using the Verilog Hardware Description Language (Verilog HDL). The evaluated synthesis performance is better than that of the conventional Binary to Excess-1 Converter (BEC) based SQRT CSLA designed in Mohanty and Patel (2014). In the proposed design, new half adder and full adder circuits are introduced to reduce the complexity of the data flow structures.

LITERATURE REVIEW
The design of a hybrid Carry Look-ahead Adder (CLA) is presented in Wang et al. (2002). In that work, a 56-bit hybrid CLA is designed using static CMOS. This design reduces the critical path length of the RCA by 2/3. However, it uses more area than the RCA circuit. To reduce this problem, the CSLA has been suggested in many works. In Tyagi (1993), a reduced-area scheme for the CSLA has been proposed. In that work, the delay of adding two 16-bit operands has been reduced to 25 ns. It uses a combined structure of carry skip and parallel prefix adders to perform the addition operation of the CSLA. Power consumption increases because of the carry skip adder. To overcome this problem, power-delay efficient hybrid adder structures are developed in Nève et al. (2004) and He and Chang (2008). In those adder structures, 2's complement functions are used to develop the hybrid CLA and CSLA structures. In addition, a Variable Length (VL) adder design has been proposed in Chen et al. (2010) based on the hybrid structures of Nève et al. (2004) and He and Chang (2008). In Ramkumar and Kittur (2012), a group-structure based SQRT CSLA has been proposed to reduce the gate count of the adder design. The design of Ramkumar and Kittur (2012) uses more than 50% fewer gates than previously proposed adders. Based on these adder structures, effective parallel adder structures have been proposed in Mary and Renji (2014), MoosaIrshad et al. (2014) and Mohanty and Patel (2014).
In Mohanty and Patel (2014) and Avuthu et al. (2015), efficient full adder and half adder based group structures are proposed for the BEC based SQRT CSLA architecture. This was the best work of 2014 for adding two N-bit binary numbers. In this research work, the design of Mohanty and Patel (2014) is taken as the conventional technique. The multiplexer based full adder used in Anna et al. (2015) for a digital FIR filter is also considered. In that modification, the delay of the accumulation function has been reduced to 21.816 ns. Further, an enhanced low power Gate Diffusion Input (GDI) logic based adder has been proposed in Anitha et al. (2015). The GDI based CSLA consumes 455 mW of power, which is lower than that of the RCA.
Ripple carry adder: The Ripple Carry Adder (RCA) is one of the basic VLSI based adders; it adds two N-bit binary numbers using N full adder circuits. The main disadvantage of this accumulation structure is the CPD. This delay occurs at each stage because the stage must wait for the carry bit generated by the previous stage. The architecture of a 4-bit RCA is illustrated in Fig. 1.
In Fig. 1, the second 1-bit full adder cannot produce its carry output until the first 1-bit full adder has generated its carry input (C1). Similarly, the third 1-bit full adder must wait for carry input C2 from the second, and the fourth must wait for the carry from the third. Hence, the RCA incurs a large CPD when performing N-bit addition. To reduce this problem, the Carry Select Adder (CSLA) is preferred in many works.
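The carry dependency described above can be seen in a small behavioural model. The following is a minimal sketch in Python, not the Verilog HDL design of this study; the function names `full_adder`, `ripple_carry_add`, `to_bits` and `from_bits` are hypothetical and chosen only for illustration.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first) stage by stage."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        # Each stage needs the carry of the previous stage before it can
        # finish; this serial chain is the carry propagation delay.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width):
    """Unsigned integer -> list of bits, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (LSB first) -> unsigned integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For a 4-bit example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` returns `([1, 0, 0, 0], 1)`, that is, sum bits equal to 1 with a carry out of 1, which represents 17.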
Carry select adder: The Carry Select Adder is a type of parallel adder in which the N-bit binary data is divided into ˚ groups for the addition process. Each group can execute concurrently on its own inputs. Hence, the CPD can be reduced by a factor of ˚ compared with RCA circuits. It is therefore used to improve architectures with respect to the main VLSI concerns. It has two general architectures, named "Dual RCA Based Carry Select Adder" and "BEC Based Carry Select Adder".
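The grouping principle can be sketched behaviourally: every group prepares one result for a carry-in of 0 and one for a carry-in of 1, and a multiplexer picks one of them when the real carry arrives. The Python model below is only an illustration under stated assumptions. The names `ripple_group` and `carry_select_add` are hypothetical, and the default of four equal 4-bit groups is an assumption, since the group sizes are not given in this section.

```python
def ripple_group(a_bits, b_bits, cin):
    """Ripple-carry addition of one group (LSB first): (sum_bits, carry_out)."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out, carry


def carry_select_add(a, b, width=16, group_sizes=(4, 4, 4, 4), cin=0):
    """Behavioural carry-select addition; returns (sum, carry_out).

    group_sizes is an assumed partition of the operand width.
    """
    if sum(group_sizes) != width:
        raise ValueError("group sizes must add up to the operand width")
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    sum_bits, carry, lo = [], cin, 0
    for size in group_sizes:
        ga, gb = a_bits[lo:lo + size], b_bits[lo:lo + size]
        # Both candidates depend only on the operands, so in hardware
        # every group can compute them at the same time.
        sum0, cout0 = ripple_group(ga, gb, 0)
        sum1, cout1 = ripple_group(ga, gb, 1)
        # 2:1 multiplexer: the incoming carry only selects a result.
        chosen, carry = (sum1, cout1) if carry else (sum0, cout0)
        sum_bits.extend(chosen)
        lo += size
    total = sum(bit << i for i, bit in enumerate(sum_bits))
    return total, carry
```

In this model only the multiplexer selection is serial between groups; the additions inside the groups are independent of one another.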

Dual RCA based carry select adder:
The structure of the dual RCA based CSLA circuit is illustrated in Fig. 2. As the name suggests, it uses dual sets of RCAs to perform the addition operation. For instance, a 16-bit dual RCA based CSLA circuit uses four groups to perform the addition operation. The groups can be executed in parallel. Each group has a pair of RCAs, one for Cin = 0 and one for Cin = 1. Finally,
Theoretical evaluation of gate count for conventional BEC based SQRT CSLA and proposed MSQRTCSLA: In the 16-bit BEC based SQRT CSLA circuit, 4 groups are used to perform the addition operation. Each group has both RCA and BEC circuits. Similarly, the 16-bit MSQRTCSLA circuit also has 4 groups to perform the addition operation.
The MSQRTCSLA circuit uses both RHA and RFA circuits. The gate counts of each group structure of the conventional BEC based SQRT CSLA and the proposed MSQRTCSLA circuits are analyzed theoretically in Table 2. Table 3 gives the percentage reduction in gate count of the proposed MSQRTCSLA circuit.
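To make the "RCA plus BEC" group structure concrete, the sketch below models one group in Python. It assumes the commonly used BEC formulation, in which each output bit is the input bit XORed with the AND of all lower input bits, so that the output equals the input plus one. The names `ripple_group`, `bec` and `bec_group` are hypothetical. The sketch does not model the proposed RHA and RFA circuits, whose gate-level equations are not given in this section.

```python
def ripple_group(a_bits, b_bits, cin):
    """Ripple-carry addition of one group (LSB first): (sum_bits, carry_out)."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out, carry


def bec(bits):
    """Binary to Excess-1 Converter: returns bits + 1 (LSB first, same width)."""
    out, all_lower_ones = [], 1
    for bit in bits:
        # A bit toggles exactly when every lower bit is 1
        # (the lowest bit always toggles).
        out.append(bit ^ all_lower_ones)
        all_lower_ones &= bit
    return out


def bec_group(a_bits, b_bits, cin):
    """One BEC based group: a single RCA for Cin = 0, a BEC for Cin = 1."""
    sum0, cout0 = ripple_group(a_bits, b_bits, 0)
    # The BEC takes the sum bits together with the carry (n + 1 bits)
    # and replaces the second RCA of the dual RCA structure.
    plus_one = bec(sum0 + [cout0])
    sum1, cout1 = plus_one[:-1], plus_one[-1]
    # 2:1 multiplexer controlled by the actual carry-in.
    return (sum1, cout1) if cin else (sum0, cout0)
```

The (n + 1)-bit BEC cannot overflow here, because the sum and carry of an n-bit group with Cin = 0 never exceed 2^(n+1) - 2.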

SYNTHESIS RESULTS AND DISCUSSION
The Reduced Half Adder (RHA) and Reduced Full Adder (RFA) have been designed in Verilog HDL. The proposed RHA and RFA circuits are integrated into the conventional 16-bit BEC based SQRT CSLA circuit to improve its performance. Hence, this circuit is named "Modified SQRT CSLA". The simulation results have been validated using the ModelSim 6.3C tool. The simulation results of the proposed VLSI based 16-bit MSQRTCSLA are illustrated in Fig. 10. The Register Transfer Level (RTL) view of the proposed MSQRTCSLA circuit is illustrated in Fig. 11. The detailed RTL view of each group structure of the proposed MSQRTCSLA circuit is illustrated in Fig. 12. Synthesis results have been evaluated using appropriate tools to measure the hardware utilization, delay and power of the proposed MSQRTCSLA circuit. The synthesis results of the conventional 16-bit BEC based SQRT CSLA and the proposed 16-bit MSQRTCSLA circuit are analyzed and compared in Table 4. The performance evaluations are illustrated graphically in Fig. 13.
Compared with the results of Tyagi (1993) and Anna et al. (2015), the proposed MSQRTCSLA circuit offers 50.72% and 43.53% reductions in delay, respectively. Similarly, compared with the results of Mary and Renji (2014) and Mohanty and Patel (2014), the proposed circuit offers a 34.36% reduction in delay. From this comparison, it is clear that the proposed MSQRTCSLA circuit operates at a higher speed than the existing methods considered here.

CONCLUSION
A Reduced Half Adder (RHA) and a Reduced Full Adder (RFA) are proposed in this study to improve the speed and power consumption of the BEC based SQRT CSLA circuit. The reduced number of gates lowers both the delay and the power consumption. The proposed Modified SQRT CSLA circuit offers a 7.14% reduction in slices, a 2.12% reduction in LUTs, a 22.86% reduction in maximum input arrival time, a 15.63% reduction in maximum combinational path delay and an 8.87% reduction in power consumption compared with the conventional BEC based SQRT CSLA circuit. The Area-Delay Product (ADP) and Power-Delay Product (PDP) of the proposed MSQRTCSLA design also show a clear advantage over the conventional BEC based SQRT CSLA. The proposed MSQRTCSLA architecture is therefore a high speed, low area, low power, simple and efficient choice for VLSI hardware implementation. In future, the proposed MSQRTCSLA circuit will be integrated into different types of MAC units to improve their performance with respect to the main VLSI concerns. The proposed MSQRTCSLA structure is also expected to be well suited to digital addition in specific digital signal processing applications such as filtering, frequency transformation techniques and wireless digital communication.

Table 1: Gate counts for basic blocks of BEC based SQRT CSLA

Table 3: Percentage reduction of gate count values in proposed MSQRTCSLA

Table 4: Comparison of synthesis performances for both conventional 16-bit BEC based SQRT CSLA and proposed 16-bit MSQRTCSLA