Majority-layered T hybridization using quantum-dot cellular automata

The atomistic quantum-dot cellular automata (QCA) based implementations of reversible circuits have received considerable attention in recent years, owing to the room-temperature workability of QCA. Researchers need a methodology that can realize area-efficient QCA counterparts of reversible benchmark circuits. In this work, a novel methodology named majority-layered T hybridization is proposed to synthesize reversible circuits using QCA. First, a reversible library consisting of CNTS Gates is generated to validate the usability of the proposed methodology. Then, an elementary QCA module of the 3×3 Toffoli Gate is proposed and extended to realize 4×4, 5×5 and 6×6 Toffoli Gates (multi-control Toffoli Gates). Mathematical models of several QCA design metrics, namely effective area, delay and O-cost, are established. The QCA counterpart of the 3×3 Toffoli Gate reports 18.61% less effective area and 8.33% less O-cost compared to previous Toffoli Gate designs. Moreover, the QCA layout of the rd-32 reversible benchmark using multi-control Toffoli Gates is employed to verify the scalability and reproducibility of the proposed methodology.


PUBLIC INTEREST STATEMENT
The fundamental concern of this work is the implementation of reversible computation using quantum-dot cellular automata. Among emerging nanotechnologies, reversible computation has attracted the attention of researchers for its extremely low power dissipation. The one-to-one mapping between the input and output vectors of a reversible circuit, which provides excellent controllability, is the prime cause of this ultra-low power dissipation. The basis of reversible computation is the reversible gate. The realization of reversible gates using conventional CMOS technology was not possible due to its power constraints. The significance of choosing a quantum-dot cellular automata based realization of reversible circuits lies in its extremely low power dissipation, high operating frequency and high packing density. This article proposes a novel methodology, named majority-layered T hybridization, to realize multi-control Toffoli reversible circuits using quantum-dot cellular automata. The mathematical formulations of the design metrics of the generic (n×n) Toffoli Gate are demonstrated through the realization of the 3×3 Toffoli unit.

Introduction
Power dissipation is a pivotal circuit design metric, and it compels scientists to consider reversible computation. This computing paradigm, which is based on emerging nanotechnology and maps inputs to outputs one-to-one, exhibits energy dissipation beyond the bound of the SNL (Bennett, 1973;Sen, Dutta, & Some, 2014). Beyond the SNL, an irreversible computation can be converted into a reversible one with identical circuit efficiency and energy dissipation in both cases. Along with reversible computation, quantum-dot cellular automata (QCA) has attracted researchers for its extremely low power dissipation, high operating speed and ultra-high packing density. In QCA, information flows from one level to the next through Coulombic interactions between electrons instead of conventional current flow. As silicon dangling-bond based semiconductor QCA has become practically feasible at room temperature, implementations of reversible circuits using QCA have been reported in recent years (Dilabio, Wolkow, Pitters, & Piva, 2015;Tougaw, Will, Graunke, & Wheeler, 2009).
The processor based architecture of a digital system needs an arithmetic logic unit (ALU) as its core element. Several gates, either basic gates or universal gates, are connected in order to perform the arithmetic and logical operations of the ALU. In classical digital circuits, the AND, OR, NOT, NAND, NOR, Ex-OR and Ex-NOR gates are the elementary gates. Like these elementary gates of classical circuits, reversible circuits also require certain gates, namely the CNOT, NOT, Toffoli and SWAP Gates (CNTS Gates). These gates of reversible computation form the reversible library. The Toffoli Gate is an elementary gate through which the characterization of the entire reversible library becomes viable (Wille, Große, Teuber, Dueck, & Drechsler, 2008). One important way to implement a reversible benchmark is through the realization of multi-control Toffoli Gates. Several Toffoli Gates using QCA have been reported in the literature (Abdullah-Al-Shafi, Shifatul, & Newaz, 2015;Bahar, Habib, & Biswas, 2015;Chandra & Netam, 2012;Chaves, Silva, Camargos, & Vilela Neto, 2015;Cvetkovska, Kostadinovska, & Danek, 2013;Iqbal & Banday, 2015;Kunalan, Cheong, Chau, & Ghazali, 2014;Mahalakshmi, Hajeri, Jayashree, & Agrawal, 2016;Moustafa, Younes, & Hassan, 2015;Rolih, 2013;Shabeena & Pathak, 2015), but they fall short in design area and delay optimization. This work introduces the majority-layered T hybridization, which uses the majority voter and the Layered T Gate (Mukherjee et al., 2015) for effective implementation. The proposed methodology is first applied to the realization of the CNTS Gates of reversible computation. The methodology is then extended to model the n×n Toffoli Gate with respect to circuit design metrics such as the number of elemental blocks, area requirement, O-cost and delay. The n×n Toffoli Gate helps in the multi-control Toffoli (MCT) realization of reversible benchmarks (Wille et al., 2008).
The rd-32 benchmark acts as a one-bit reversible full adder. Designing the rd-32 benchmark circuit confirms the scalability and reproducibility of the proposed methodology for reversible circuits.
The rest of the article is organized as follows: Section 2 discusses the background of QCA, introduces the majority-layered T hybridization and realizes the reversible library using the proposed hybridization methodology. Section 3 realizes and simulates n×n Toffoli Gates through the 3×3, 4×4, 5×5 and 6×6 Toffoli Gates. Moreover, this section formulates the effective area, O-cost, delay and number of majority AND blocks in terms of the number of inputs to the Toffoli Gate. Section 4 introduces the rd-32 benchmark circuit realization using majority-layered T hybridization. The CNTS Gates, Toffoli Gates and the rd-32 benchmark are analyzed in terms of QCA design metrics in Section 5. Lastly, Section 6 concludes the work and discusses the future scope of majority-layered T hybridization.

Background of QCA
The QCA technology employs quantum cells. Each square-shaped quantum cell contains four quantum dots. Two excess electrons arrange themselves to occupy the corner positions within the square. CMOS technology did not employ the Majority Voter due to certain hardware limitations, whereas in QCA technology the Majority Voter has been implemented as the basic primitive (Walus, Schulhof, Jullien, Zhang, & Wang, 2004). The majority voter takes three inputs A, B and C, evaluates the output as AB + BC + CA and forwards the result to the output cell. The inverter and the QCA binary wire are the other elemental blocks in QCA. The inverter, which inverts the input for the next intermediary level, is functionally identical to the CMOS inverter, whereas the QCA binary wire copies the input to the output. The majority voter, inverter and QCA binary wire are shown in Figure 1.
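The majority function above can be checked with a short behavioural sketch. The function names `majority`, `maj_and` and `maj_or` are illustrative and do not come from the original work; the sketch models only the Boolean behaviour, not the cell-level physics. It relies on the standard property that fixing one input of the majority voter at 0 gives AND and fixing it at 1 gives OR.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority voter: AB + BC + CA."""
    return (a & b) | (b & c) | (c & a)


def maj_and(a: int, b: int) -> int:
    """Majority voter with the third input fixed at logic 0 (acts as AND)."""
    return majority(a, b, 0)


def maj_or(a: int, b: int) -> int:
    """Majority voter with the third input fixed at logic 1 (acts as OR)."""
    return majority(a, b, 1)


# Print the two-input truth table of both configurations.
for a, b in product((0, 1), repeat=2):
    print(a, b, "AND:", maj_and(a, b), "OR:", maj_or(a, b))
```

The AND configuration is the one used as "Maj_AND" in the hybridization methodology described later.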

Layered T Gate
Following the majority logic reduction technique, several QCA logic structures have been reported in the literature, among which only the Layered T Gate employs universal NAND/NOR operations in multi-level digital circuit design. The Layered T Gate has two inputs A, B and one output Z, as shown in Figure 2. Depending upon the polarization of the upper layer cell, the Gate produces either NAND or NOR waveforms. The "+1" polarization of the upper layer cell forces the Gate to operate as the NAND function (AB)′, whereas the same block generates the NOR function (A + B)′ if the upper layer cell is polarized at "-1" (Mukherjee et al., 2015).
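The polarization-dependent behaviour of the Layered T Gate can be summarized in a small behavioural model. This is a minimal sketch that assumes the reading given in the text ("+1" selects NAND, "-1" selects NOR); the function name `layered_t` and its `polarization` parameter are hypothetical and model only the logic function, not the multilayer layout.

```python
from itertools import product


def layered_t(a: int, b: int, polarization: int) -> int:
    """Behavioural model of the Layered T Gate (assumed mapping).

    polarization = +1 -> NAND of a and b
    polarization = -1 -> NOR of a and b
    """
    if polarization == +1:
        return 1 - (a & b)  # NAND
    if polarization == -1:
        return 1 - (a | b)  # NOR
    raise ValueError("polarization must be +1 or -1")


# Truth table for both polarizations of the upper layer cell.
for a, b in product((0, 1), repeat=2):
    print(a, b, "NAND:", layered_t(a, b, +1), "NOR:", layered_t(a, b, -1))
```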

Majority-layered T hybridization methodology
The generation of QCA layouts for reversible circuits requires a few points to be kept in mind. These points are given below: (b) Reversible circuits must not contain memory-forming loops. The presence of such loops makes the circuit no longer reversible (Nielson & Chuang, 2011).
The QCA counterparts of the reversible circuits in this work respect the aforementioned points. Figure 3 illustrates the QCA implementation of reversible circuits, where the horizontal lines of the reversible circuits are replaced by binary wires. The target bits and control bits are applied through the horizontal lines. The input "1" at the control character "•" forces the inverter to flip the input. The target character "⊕" is equivalent to an Exclusive-OR (Ex-OR) Gate. The Ex-OR Gate inverts one of its inputs if and only if the other one is set to logic high. In Figure 3(a), bits A and B are the control bits and Z is the target bit. As logic "1" at the control bit makes the target bit operate as Ex-OR, the target bit Z becomes B⊕A. Figure 3(b) shows the 3×3 reversible unit as an example. The logical AND is marked by a red circle on the reversible circuit and on its QCA counterpart in Figure 3(b). A single majority voter acting as an AND Gate and a Layered T Ex-OR module (Mukherjee et al., 2015) are used, since the symbols "•" and "⊕" represent the AND and Ex-OR operations respectively. The use of majority and Layered T units in the elemental reversible circuit gives rise to the name "majority-layered T hybridization".
The majority-layered T hybridization methodology proposes an algorithm, named Reversible_to_QCA, to generate the QCA counterpart of a reversible circuit. This algorithm requires the following sensitivity list: The symbols "•" and "⊕" fetched during the traversal of the reversible circuit are termed control-reversible and target-reversible respectively. The control-reversible on the most significant input of the reversible circuit is replaced by a QCA Cell Array, whereas every other control-reversible is replaced by Maj_AND, i.e. a "QCA Majority Gate operating as AND Gate". During the segmentation of the reversible circuit, each target-reversible is converted to a QCA Layered T Ex-OR module. The algorithm Reversible_to_QCA is given as follows:
Algorithm 1: Reversible_to_QCA (control-reversible, target-reversible, QCA Cell Array, Maj_AND, LT_Ex-OR)
Step 1: Segment the Reversible Circuit into columns consisting of at least one spur
Step 3: Continue Step 2 until the level ends.
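The symbol replacement rules stated above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' algorithm: a circuit is represented as a list of columns, each column lists one symbol per horizontal line with the most significant line first, and an idle line ("-") is mapped to a QCA cell array because horizontal lines are replaced by binary wires. The names `column_to_qca` and `reversible_to_qca` are hypothetical.

```python
CONTROL, TARGET, IDLE = "•", "⊕", "-"


def column_to_qca(column):
    """Map one column of a reversible circuit to QCA building blocks.

    Assumed rules, following the text:
      - the control on the most significant line -> QCA cell array
      - every other control                      -> Maj_AND
      - the target                               -> LT_Ex-OR
      - an idle line                             -> QCA cell array (binary wire)
    """
    blocks = []
    seen_control = False
    for symbol in column:
        if symbol == CONTROL:
            blocks.append("Maj_AND" if seen_control else "QCA_Cell_Array")
            seen_control = True
        elif symbol == TARGET:
            blocks.append("LT_Ex-OR")
        elif symbol == IDLE:
            blocks.append("QCA_Cell_Array")
        else:
            raise ValueError(f"unknown symbol: {symbol!r}")
    return blocks


def reversible_to_qca(columns):
    """Apply the column mapping to every column of the circuit in order."""
    return [column_to_qca(column) for column in columns]


# A CNOT column followed by a 3x3 Toffoli column.
print(reversible_to_qca([[CONTROL, TARGET, IDLE], [CONTROL, CONTROL, TARGET]]))
```

Under these assumptions a CNOT column needs one binary wire and one Layered T Ex-OR, while a 3×3 Toffoli column needs one additional Maj_AND, which agrees with the block counts quoted for Figure 3.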
Figure 3(a) shows the 2×2 Reversible Circuit with its QCA Layouts. The realization of 2×2 reversible circuit by using majority-layered T hybridization needs a Layered T Ex-OR Gate and QCA binary wire while 3×3 Reversible Circuit conversion to QCA requires an extra majority AND Gate along with a Layered T Ex-OR Gate and QCA binary wire. This is indicated by using red circle and red box in Figure 3(b).

Realization of CNTS Gates
Establishing the usability, reproducibility and scalability of the majority-layered T hybridization requires the implementation of the universal gate library (Shende, Prasad, Markov, & Hayes, 2002) consisting of the CNOT, NOT, 3×3 Toffoli and SWAP Gates (CNTS Gates). The NOT Gate inverts its input to produce the output Z. The QCA implementation of the NOT Gate needs a Layered T Ex-OR Gate with one input fixed at logic 1. The reversible NOT and its QCA layout are shown in Figure 4(a) and (e) respectively. The reversible CNOT Gate, which has two inputs A and B, is illustrated in Figure 4(b). Logic 1 at the control bit A forces the other input B to toggle. On the other hand, the control bit A set to logic 0 forces the CNOT output Z to copy the target bit B. The majority-layered T conversion of the CNOT Gate requires one Layered T Ex-OR Gate, as shown in Figure 4(b) and (f). The multi-control Toffoli realization of the reversible library (Wille et al., 2008; http://www.revlib.org) shows the significance of the 3×3 Toffoli Gate as an elementary reversible gate. The outputs Z1 and Z2 follow the inputs A and B. The third output Z3 of the Toffoli Gate inverts the input bit C if and only if both inputs A and B are true; otherwise it copies C. In other words, the Gate follows the equation Z3 = C⊕AB (Zilic, Radecka, & Kazamiphur, 2007). The term AB Ex-ORed with input C makes clear the requirement of a majority AND Gate and a Layered T Ex-OR, as demonstrated in Figure 4(c) and (g). The SWAP Gate copies A, B to Z2, Z1 respectively. The QCA layout of the SWAP Gate generated by the proposed methodology consists of three cascaded Layered T Ex-OR Gates, as shown in Figure 4(h). The summary of the CNTS QCA layouts of Figure 4(e)-(h) is reported in Table 1. The implementations of the Standard Functions (Mukherjee, Roy, Panda, & Maji, 2016) confirm the requirements of low AUF, low O-cost and optimal delay for QCA circuits. The NOT, CNOT and Toffoli Gates need 0.75 clock-delays to produce their outputs, as observed from Table 1.
The cascaded connection of three Layered T Ex-OR Gates results in 2.75 clock-delays for the SWAP Gate. The CNTS Gates implemented using majority-layered T hybridization generate multilayer QCA layouts. Multilayer QCA layouts offer better reliability and stability during fabrication (Kumar, Ghosh, & Gupta, 2015). For the NOT Gate, the input A = "01010101" is inverted to Z = "10101010", as given in Figure 5(a). The CNOT Gate output of Figure 5(b) shows that Z1 carries the A Ex-ORed B waveform, whereas the input is copied to Z2. The Toffoli Gate output is captured in Figure 5(c). The vectors "00001111" and "00110011" at Z1 and Z2 respectively match the input vectors A and B. The remaining output Z3, which equals C Ex-ORed with AB, carries the vector "01010110".
The SWAP Gate of Figure 4(d) exchanges the inputs A, B to produce the vectors "00110011" and "01010101", as observed in Figure 5.
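The logical behaviour of the CNTS Gates described in this section can be reproduced with a bit-vector sketch. The helper names below are illustrative and not part of the original work; the input vector C = "01010101" for the Toffoli Gate is an assumption, chosen because it reproduces the Z3 vector quoted in the text. The SWAP Gate is modelled as three cascaded Ex-OR operations, mirroring the three cascaded Layered T Ex-OR Gates.

```python
def bits(text):
    """Convert a string such as "0101" to a list of integers."""
    return [int(ch) for ch in text]


def text(vector):
    """Convert a list of bits back to a string."""
    return "".join(str(bit) for bit in vector)


def xor(u, v):
    """Element-wise Ex-OR of two equally long bit vectors."""
    return [x ^ y for x, y in zip(u, v)]


def not_gate(a):
    """NOT as an Ex-OR with one input fixed at logic 1."""
    return xor(a, [1] * len(a))


def cnot(a, b):
    """CNOT: control a is copied, target becomes a XOR b."""
    return a, xor(a, b)


def toffoli(a, b, c):
    """3x3 Toffoli: Z1 = A, Z2 = B, Z3 = C XOR AB."""
    ab = [x & y for x, y in zip(a, b)]
    return a, b, xor(c, ab)


def swap(a, b):
    """SWAP built from three cascaded Ex-OR stages."""
    a = xor(a, b)
    b = xor(a, b)
    a = xor(a, b)
    return a, b


A, B, C = bits("00001111"), bits("00110011"), bits("01010101")
print(text(not_gate(C)))          # 10101010
print(text(toffoli(A, B, C)[2]))  # 01010110
print([text(v) for v in swap(B, C)])
```

The two commented outputs match the NOT and Toffoli waveforms quoted for Figure 5(a) and (c).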

Realization of generic Toffoli Gate
Rigorous experimental work (Dilabio et al., 2015;Tougaw et al., 2009) has demonstrated the operability of QCA at room temperature. Reversible gate layouts in QCA have already been proposed. However, an optimal design of the generic Toffoli Gate is still lacking in this field. As the reversible library has been synthesized using MCT Gates (Wille et al., 2008; http://www.revlib.org), the design of the n×n Toffoli Gate has recently been given the utmost importance.

3×3 Toffoli Gate
This section introduces an optimal design of the 3×3 Toffoli Gate, which is further used in the design of higher order Toffoli Gates. The 3×3 Toffoli Gate has three inputs A1, A2, A3 and three outputs Z1, Z2, Z3. The control inputs A1, A2 are duplicated at the outputs Z1, Z2. The remaining output Z3 equals A1A2⊕A3. According to the definition of the nth order Toffoli Gate, the least significant (n−1) bits are copied to the first (n−1) outputs Z1, Z2, ..., Z(n−1). At the same time, the most significant nth output produces A1A2A3...A(n−1)⊕An (Zilic et al., 2007). The proposed design is compared with the previous designs in Table 2. In the proposed hybridization methodology, the intermediary output of the majority AND, A1 AND A2, is forwarded to one of the inputs of the Layered T Ex-OR Gate. Finally, the Layered T Ex-OR takes A3 as its other input to evaluate the final output.

n×n Toffoli Gate
The lower order 3×3 Toffoli of Figure 6 is reused to generate higher order Toffoli Gates, such as the 4×4, 5×5 and 6×6 Toffoli Gates, using the majority-layered T hybridization methodology. The implementation of a higher order Toffoli Gate using the 3×3 Toffoli unit proceeds in two subsequent stages. First, the least significant control bits are ANDed using majority voters to generate the product A1A2A3...A(n−1). Then, in the final stage, the intermediary output is Ex-ORed with An using the Layered T Ex-OR Gate. The mathematical model of the generic Toffoli Gate is as follows:

Lemma 1 The implementation of n×n Toffoli Gate using QCA requires (n−2) number of majority AND Gates.
Proof When the proposed methodology is applied to n×n Toffoli Gate synthesis, (n−2) majority AND Gates are required. From Figure 7(a), it can be observed that the QCA implementation of the 4×4 Toffoli Gate needs (4 − 2) = 2 majority AND units. Here, 4×4 sets the input variable n to 4. These two majority ANDs take the inputs A2, A3, A4 and produce A2A3A4. Like the 4×4 Toffoli Gate, the 5×5 and 6×6 Toffoli Gates require (5 − 2) = 3 and (6 − 2) = 4 majority AND Gates respectively.
Proof For the 5×5 Toffoli Gate, the intermediary output A2A3A4A5 is generated and further processed as one input of the Layered T Ex-OR module. The control bit A1 is given to the other input of the Layered T Ex-OR, as demonstrated in Figure 7(b). The 6×6 unit employs (6 − 2) = 4 majority ANDs to produce A2A3A4A5A6. Hence, for (n−2) majority ANDs, A2A3...An is the intermediary output.
Lemma 3 The QCA implementation of n×n generic Toffoli Gate requires 0.25n clock to evaluate the target bit Zn.
Proof Figure 6 shows the requirement of 31,584 nm² effective area for the 3×3 Toffoli unit. Similarly, the 4×4, 5×5, 6×6 Toffoli Gates need 53,349, 66,507, 79,665 nm² effective areas respectively. These area requirements can be rewritten as 20,808 + 13,008(4 − 3), 20,808 + 13,008(5 − 3), 20,808 + 13,008(6 − 3) nm². Hence for input n > 3, the effective area becomes 20,808 + 13,008(n − 3) nm².
The formulations for the n×n Toffoli Gate have been simulated by varying n from 3 to 10. The QCA circuit design metrics (Mukherjee et al., 2016), which have been validated in Lemmas 1-5, are listed in Table 3. Figure 8 reports the O-cost curve, delay curve, area curve and AUF curve for the n×n Toffoli Gate design using the majority-layered T hybridization methodology.
Figure 9(a)-(d) show the outputs of the 3×3, 4×4, 5×5 and 6×6 Toffoli Gates respectively. From Figure 9(a), the outputs Z1, Z2, Z3 have the vector patterns "0101011001010110", "0011001100110011", "0101010101010101" in accordance with the functionality of the 3×3 Toffoli unit. The red box of Figure 9(a) marks Z1 as A1A2⊕A3. The output of the 3×3 Toffoli unit, shown by the grey box of Figure 9(a), gets the appropriate value at the negative edge of clock 2. Figure 9(b) shows the functionality of the 4×4 Toffoli Gate. The negative edge of clock 3, which produces the outputs of the 4×4 Toffoli unit, is emphasized by the double-sided arrow. The outputs Z2, Z3, Z4 follow the vector patterns "00001111000011110000111100001111", "00110011001100110011001100110011", "01010101010101010101010101010101" of the inputs A2, A3, A4 respectively. The bit Z1 produces the 4×4 Toffoli output "01010101010101010101010101010110", confirming the appropriateness of the Toffoli unit. The operation of bit Z1 is indicated by the red portion of Figure 9(b).
The operability of the 6×6 Toffoli Gate has been validated as shown in the grey and red boxes of Figure 9(d). The negative edge of clock 1 generates the outputs of the 6×6 Toffoli unit, giving a net delay of 1.50.
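The two-stage construction of the n×n Toffoli Gate, together with the block count of Lemma 1 and the delay of Lemma 3, can be illustrated with a behavioural sketch. It is written under assumptions: the inputs follow the convention of the 3×3 definition (the first n−1 bits are controls and the last bit is the target), and the names `toffoli_n` and `delay_clocks` are hypothetical. The sketch checks logic and counts only; it does not model layout area.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority voter: AB + BC + CA."""
    return (a & b) | (b & c) | (c & a)


def toffoli_n(inputs):
    """Evaluate an n x n Toffoli Gate with a chain of majority ANDs.

    Returns the output bits and the number of majority AND gates used.
    Assumed convention: inputs[:-1] are controls, inputs[-1] is the target.
    """
    *controls, target = inputs
    product_term = controls[0]
    maj_and_used = 0
    for control in controls[1:]:
        # Majority voter with one input fixed at 0 acts as a two-input AND.
        product_term = majority(product_term, control, 0)
        maj_and_used += 1
    # Final stage: Ex-OR of the product term with the target bit.
    return controls + [target ^ product_term], maj_and_used


def delay_clocks(n: int) -> float:
    """Delay of the n x n Toffoli Gate according to Lemma 3 (0.25n clock)."""
    return 0.25 * n


for n in range(3, 7):
    _, used = toffoli_n([1] * n)
    print(f"{n}x{n}: majority ANDs = {used}, delay = {delay_clocks(n)}")
```

For n = 3 and n = 6 the sketch gives delays of 0.75 and 1.5, which agree with the values quoted for the 3×3 and 6×6 Toffoli units.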

The rd-32 circuit realization using majority-layered T hybridization
The validity of the majority-layered T hybridization methodology is further supported by verifying it on a benchmark realization. The previous sections discuss the creation of the CNTS Gates and the implementation of the generic Toffoli Gate. The benchmark circuit rd-32 (Feynman, 1986) is tested and verified with a proper layout using the proposed hybridization methodology.

The rd-32 benchmark implementation using majority-layered T hybridization
The benchmark function rd-32 functions as a one-bit Full Adder (Feynman, 1986). The rd-32 takes four inputs A1, A2, A3 and A4 and produces four outputs Z1, Z2, Z3 and Z4. The input bit A4 is fixed at logic "0", as mentioned in Figure 10. The rd-32 computes the one-bit addition of the inputs A1, A2 and A3, giving Z3 = A1⊕A2⊕A3. On the other side, the output Z4 equals the Sum-of-Products A1A2 + A2A3 + A3A1 and is the Carry output of the one-bit full adder circuit. Unlike the classical one-bit Full Adder, the rd-32 produces the garbage outputs Z1, Z2 along with the conventional full-adder outputs Sum and Carry.
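The input-output behaviour of rd-32 can be checked exhaustively with a short sketch. The gate cascade used below (two Toffoli Gates interleaved with two CNOT Gates) is an assumption: it is one common multi-control Toffoli realization of a reversible full adder and is not necessarily the cascade of Figure 10. The function name `rd32` is illustrative.

```python
from itertools import product


def rd32(a1: int, a2: int, a3: int, a4: int = 0):
    """Reversible one-bit full adder with A4 fixed at 0 (assumed cascade).

    Returns (Z1, Z2, Z3, Z4), where Z3 is the Sum, Z4 is the Carry and
    Z1, Z2 are garbage outputs.
    """
    a4 ^= a1 & a2  # Toffoli: controls A1, A2, target A4
    a2 ^= a1       # CNOT: control A1, target A2
    a4 ^= a2 & a3  # Toffoli: controls A2, A3, target A4
    a3 ^= a2       # CNOT: control A2, target A3
    return a1, a2, a3, a4


# Exhaustive check against ordinary binary addition.
for a1, a2, a3 in product((0, 1), repeat=3):
    z1, z2, z3, z4 = rd32(a1, a2, a3)
    total = a1 + a2 + a3
    print(a1, a2, a3, "-> Sum:", z3, "Carry:", z4, "ok:", (z3, z4) == (total % 2, total // 2))
```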

The QCA layout of the rd-32 benchmark
The requirement of an extra upper layer above the main cell layer makes the rd-32 QCA layout a multilayer circuit. Figure 11 shows the QCA realization of the rd-32 benchmark circuit. The QCA layout of rd-32, which has 3.75 clock-delays, evaluates its outputs at the negative edge of clock 2. The QCA design summary of the proposed rd-32 benchmark is reported in Table 4.

Result analysis
The proposed hybridization methodology requires the Layered T Ex-OR Gate along with the majority voter employed as an AND Gate. The 2×2 reversible element has one Layered T Ex-OR Gate, whereas the 3×3 reversible unit uses a majority voter together with a Layered T Ex-OR module. Figure 3(a) and (b) show the 2×2 and 3×3 reversible elements respectively. Optimal designs of the CNTS Gates in terms of O-cost, effective area and delay have been presented in this work to build the reversible library. The analysis of Figure 12 shows the dependency of AUF on effective area and O-cost. The NOT Gate shows the smallest effective area and O-cost together with a higher value of AUF. The requirement of three successive Ex-OR Gates gives the SWAP Gate a higher O-cost, effective area and delay, as shown in Figure 12. The SWAP Gate nevertheless achieves the best area utilization, as its AUF of 2.75 is the lowest among the four CNTS Gates.
The analysis of Table 2 reports 18.61% less effective area and 8.33% fewer QCA cells compared to the best previously reported design (Moustafa et al., 2015). The improvements in effective area and O-cost are shown in Figures 13(a) and (b). The proposed 3×3 Toffoli Gate of Figure 6 possesses the lowest value of AUF among the previous 3×3 Toffoli Gate designs (Abdullah-Al-Shafi et al., 2015;Bahar et al., 2015;Chandra & Netam, 2012;Chaves et al., 2015;Cvetkovska et al., 2013;Iqbal & Banday, 2015;Kunalan et al., 2014;Mahalakshmi et al., 2016;Moustafa et al., 2015;Rolih, 2013;Shabeena & Pathak, 2015), as shown in Figure 13(d). A lower value of AUF means a higher utilization of the effective area in QCA layouts. The best previous design (Moustafa et al., 2015) has an AUF of 3.32, whereas in the proposed design the AUF reduces to 2.95. Hence the Toffoli Gate of Figure 6 is proposed for further implementation of Reversible Logic Synthesis (Shende et al., 2002; http://www.revlib.org), as lower AUF QCA layouts are also desirable (Mukherjee et al., 2016). Section 3.2 discusses the 4×4, 5×5 and 6×6 Toffoli Gates built from the optimal design of the 3×3 Toffoli Gate. Building on the 3×3 Toffoli Gate supports the generality of the approach, as this work deals with higher order Toffoli Gates with up to n = 10 inputs. The curves for several QCA design metrics shown in Figure 8 have been generated from Lemmas 1-5. The O-cost and delay curves are linear in n. The AUF curve of the generic Toffoli Gate shows that higher order Toffoli Gates have lower AUFs than lower order Toffoli units. The AUF values of the Toffoli Gates are listed in Table 3. The lower AUF of higher order Toffoli Gates is useful because a multi-control Toffoli unit instantiates higher order Toffoli Gates.
The layout of the rd-32 benchmark circuit, which reports an AUF value of 4.802, has been generated using the majority-layered T hybridization methodology, as shown in Figure 11. To the best of the authors' knowledge, the QCA layout of the rd-32 benchmark is first reported in this work.

Conclusion
A novel methodology, named majority-layered T hybridization, is proposed in this work to convert reversible circuits to QCA layouts. The validity of the proposed hybridization methodology has been established by creating the reversible library of CNTS Gates. Higher order Toffoli Gates have been generated to confirm the scalability of the proposed methodology in the multi-control Toffoli realization of reversible circuits. This work presents a detailed analysis of the 3×3, 4×4, 5×5 and 6×6 Toffoli Gates with the corresponding output graphs. The mathematical formulations of design metrics such as the effective area, O-cost and delay of generic Toffoli units help researchers to implement MCT (multi-control Toffoli) reversible circuits. The formulations are accompanied by the effective area, O-cost, delay and AUF graphs of generic Toffoli Gates. To the best of the authors' knowledge, the QCA layout of the rd-32 benchmark function generated using the proposed methodology is reported here for the first time. As the rd-32 circuit has a low AUF value, implementations of the remaining reversible benchmarks and other reversible circuits are expected to perform well in terms of effective area, O-cost, AUF and delay. The advancement of quantum information processing and reversible logic implementation would reach another milestone if the majority-layered T hybridization methodology is adopted.