Design of normalised and simplified FAs in quantum-dot cellular automata

Quantum-dot cellular automata (QCA), an emerging nanoscale technology that may take the place of complementary metal–oxide–semiconductor technology, has been studied for several years. Recent research has focused mainly on circuit design, for instance, full adders (FAs) and multiplexers. The authors present a new five-input majority gate to construct FAs. Three QCA normalised FAs based on different logical expressions are then selected and designed, denoted as NFA1–NFA3, respectively. To allow convenient comparison with existing adders, three simplified FAs are also designed, named SFA1–SFA3. Analysis results indicate that SFA3 presents better performance to some extent.


Introduction
Since the introduction of quantum-dot cellular automata (QCA), the theory in this domain has developed rapidly, including both pure research and experimental verification of the practicability of this technology. In the theoretical area, most studies focus on circuit design and simulation, such as QCA basic logic units and field programmable gate arrays, among which full adders (FAs) dominate, while physical realisation mostly focuses on the demonstration of QCA cells. The cell can be seen as a square charge container with two free electrons and four dots, one located at each corner, as shown in Fig. 1a. Owing to the Coulomb interaction, the electrons settle into the configuration of lowest electrostatic potential, so two polarisation patterns are used to encode binary information, logic '1' ($P = +1$) and logic '0' ($P = -1$), as shown in Figs. 1b and c, respectively [1].
A method to obtain the optimal majority-voter-based logical expression for a circuit with three inputs was presented in [2]. Using this method, an FA consisting of three three-input majority voters and two inverters was obtained and implemented in QCA in one layer. With the same logical expression, a three-layer FA was also designed [3, 4]. In addition, several FAs with different layouts have been implemented with only three-input majority voters and inverters [5–13]. Given the variety of logical expressions for FAs, the circuits can also be designed with other logic units, for instance, five-input majority voters [14–27], the three-input exclusive-OR (TIEO) gate [28] and exclusive-OR gates [29]. Pudi optimised the logical expression to remove one inverter and design a new QCA FA [30]. The principle of these methods is to replace the three-input majority voter in FAs.
This paper is organised as follows: Section 2 provides the circuit design guidelines for QCA technology. The analytical methods are provided in Section 3. Section 4 illustrates the designed FAs and the simulation results. Finally, we conclude this paper in Section 5.

Majority voter
The primitive device in QCA, the three-input majority voter (denoted as M3), is shown in Fig. 2a. The functionality of the majority voter is represented by the Boolean function $F = MV(A, B, C) = AB + AC + BC$, which means that the output F of a majority voter equals the value held by the majority of the inputs A, B and C, as shown in Fig. 2b. Fig. 2c illustrates the simulation results of the voter obtained with QCADesigner. The cell F receives its signal from the device cell at the centre of the voter, which means that the input signals arriving at the device cell should be kept synchronous; otherwise one input signal will reach the device cell first, resulting in erroneous output values. The five-input majority voter (denoted as M5) plays the same role, but with five inputs. Its design principle is the same as that of M3.
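The majority function described above can be checked with a few lines of Python. This is only a logic-level sketch: the function name `majority` is chosen here for illustration, and nothing about cell layout, clocking or input synchronisation is modelled.

```python
from itertools import product


def majority(*inputs: int) -> int:
    """Return 1 if more than half of the binary inputs are 1, else 0."""
    if len(inputs) % 2 == 0:
        raise ValueError("a majority voter needs an odd number of inputs")
    return int(sum(inputs) > len(inputs) // 2)


# M3 agrees with the Boolean form F = AB + AC + BC on every input combination.
for a, b, c in product((0, 1), repeat=3):
    assert majority(a, b, c) == ((a & b) | (a & c) | (b & c))

# Fixing one input turns M3 into a two-input gate: 0 gives AND, 1 gives OR.
pairs = list(product((0, 1), repeat=2))
assert [majority(a, b, 0) for a, b in pairs] == [0, 0, 0, 1]
assert [majority(a, b, 1) for a, b in pairs] == [0, 1, 1, 1]

# M5 follows the same rule with five inputs.
assert majority(1, 1, 1, 0, 0) == 1
assert majority(1, 1, 0, 0, 0) == 0
```

The same function serves for M3 and M5 because only the number of arguments changes, which mirrors the statement that M5 follows the design principle of M3.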

Crossover
Owing to the electrostatic interaction between each pair of electrons in two cells, QCA circuits can be integrated in one layer. These circuits can certainly also be built with a multilayered approach, as in complementary metal-oxide-semiconductor-based integrated circuits, but this structure has not yet been realised and suffers from a noise problem [31, 32]. To date, there are two kinds of techniques to implement a one-layer crossover. Kim studied the robustness of a simple wire crossover consisting of normal cells and cells rotated by 45° [33]. The rotated cells, however, are more sensitive than normal cells to noise from other cells, which easily leads to faulty results [34]. Shin proposed a new wire crossing that uses the relation between the hold and relax phases of clocked cells [35]. If two adjacent cells are placed in clock0 and clock2, there is no Coulomb interaction between these cells, which means that a signal cannot propagate between them. That is the main principle of the coplanar crossover with a clocking mechanism. This crossover can also be realised with the clock1 and clock3 pair.

Inverter
Besides the majority voter, the inverter is also a key circuit element, used to produce the complement of the input signal [1]. Fig. 3a illustrates a simple inverter. Input cell A and output cell F are placed in the same clock zone. As its simulation results show, however, the output cell in this design cannot be fully saturated because the kink energy decays rapidly with the distance between cells. By adding one more cell before the output cell, the amplitude of the polarisation can be strengthened, as shown in Fig. 3b.

Physical properties
The apparent physical properties used to evaluate a QCA circuit are circuit area, cell count, latency, layer count and cell types. The area of a circuit is defined as the rectangular area occupied by the circuit on the Cartesian plane [36]. This figure of merit directly reflects the circuit size, which for a one-layer circuit relates to the number of cells and also determines the manufacturing cost of a die. The latency, or clock frequency, often accounts for the performance of a system and is determined by the longest path along which a signal propagates. In a QCA circuit, the latency relates to the number of majority voters that a signal passes through from input cell to output cell. As mentioned earlier, the multi-layer crossover can hardly be realised physically. The rotated cells are easily affected by noise, leading to faulty results, so a one-layer design with normal cells is the best choice.

Power dissipation
It is meaningful to characterise the maximum power dissipation by using sharp transitions in the clocks. The instantaneous total power of a single QCA cell is written as the sum of two terms, $P_1$ and $P_2$, expressed through $\Gamma$, a real three-dimensional (3D) energy vector, and $\lambda$, the coherence vector. The first term $P_1$ represents the difference between the power input ($P_{in}$) and the power output ($P_{out}$). The second term $P_2$ gives the dissipated power ($P_{diss}$), which is exactly our concern. The worst case for power dissipation can be obtained during one clock cycle $T_c = [-T_D, T_D]$. It is worth noting that the greater the rate of change of $\Gamma$, the larger the amount of energy dissipated. The rate is therefore modelled with a δ function, and $\Gamma(-T_D)$ and $\Gamma(T_D)$ are represented by $\Gamma_-$ and $\Gamma_+$.
The upper-bound power dissipation model given in [37] depends on $T$, the temperature in kelvin, and $k_B$, the Boltzmann constant.
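The displayed equations of this subsection are not shown in the text. As a hedged reconstruction, the forms commonly given in the QCA power-dissipation literature, which match the symbols defined here, can be written as follows. The exact notation of the original equations and of [37] is an assumption; $\vec{\Gamma}$ denotes the energy vector, $\vec{\lambda}$ the coherence vector and $\hbar$ the reduced Planck constant.

```latex
% Instantaneous total power of one cell, split into the two terms P_1 and P_2
P_t = \frac{\mathrm{d}E}{\mathrm{d}t}
    = \frac{\hbar}{2}\left[\frac{\mathrm{d}\vec{\Gamma}}{\mathrm{d}t}\cdot\vec{\lambda}\right]
    + \frac{\hbar}{2}\left[\vec{\Gamma}\cdot\frac{\mathrm{d}\vec{\lambda}}{\mathrm{d}t}\right]
    = P_1 + P_2

% Energy dissipated over one clock cycle T_c = [-T_D, T_D] (integral of P_2)
E_{\mathrm{diss}} = \frac{\hbar}{2}\int_{-T_D}^{T_D}
    \vec{\Gamma}\cdot\frac{\mathrm{d}\vec{\lambda}}{\mathrm{d}t}\,\mathrm{d}t

% Upper bound obtained when the change of \vec{\Gamma} is modelled by a delta function
P_{\mathrm{diss}} = \frac{E_{\mathrm{diss}}}{T_c}
    \le \frac{\hbar}{2T_c}\,\vec{\Gamma}_{+}\cdot\left[
      -\frac{\vec{\Gamma}_{+}}{|\vec{\Gamma}_{+}|}
       \tanh\!\left(\frac{\hbar|\vec{\Gamma}_{+}|}{k_B T}\right)
      +\frac{\vec{\Gamma}_{-}}{|\vec{\Gamma}_{-}|}
       \tanh\!\left(\frac{\hbar|\vec{\Gamma}_{-}|}{k_B T}\right)\right]
```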
With the power dissipation for a QCA cell, we can calculate the total power dissipation for a system by accumulating the dissipated power of all cells because the presented method for each QCA cell is identical.

Probabilistic transfer matrix
As a method for computing the reliability of a combinational circuit, the probabilistic transfer matrix (PTM) forms an algebra to represent circuits with probabilistically failing gates. The matrix of a gate stems from its truth table and is a $2^m \times 2^n$ matrix for a circuit with m inputs and n outputs, as shown in Fig. 4a for a crossover and Fig. 4b for a fanout, respectively. To generate the PTM of a circuit, the circuit must be divided into subcircuits that do not contain components in series. There are three modes for connecting two components: parallel, where the combined PTM of two gates with $PTM_1$ and $PTM_2$ is the tensor product of $PTM_1$ and $PTM_2$; series, where the combined PTM is the product of $PTM_1$ and $PTM_2$; and fanout, where the combined PTM is calculated as the tensor product of $PTM_1$ and $PTM_2$, followed by eliminating the rows that have different input values for one fanout [38].
The new five-input majority gate is illustrated in Fig. 5, where $C_o$ carries a two-fold weight. It has only nine cells and can be used in both multi-layer and coplanar circuits.
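The series and parallel rules above can be illustrated with a small pure-Python sketch. Everything in it is an assumption made for illustration: the symmetric fault model (a gate flips its output with probability `p_fail`), the helper names `gate_ptm`, `series` and `parallel`, the value 0.05 and the example circuit, which is not one of the adders in this paper. The fanout rule is not covered.

```python
from itertools import product

Matrix = list[list[float]]


def gate_ptm(func, n_inputs: int, p_fail: float) -> Matrix:
    """PTM of a one-output gate: one row per input pattern, columns for output 0 and 1."""
    rows = []
    for bits in product((0, 1), repeat=n_inputs):
        row = [p_fail, p_fail]
        row[func(*bits)] = 1.0 - p_fail  # the correct output keeps probability 1 - p_fail
        rows.append(row)
    return rows


def series(first: Matrix, second: Matrix) -> Matrix:
    """Series connection: ordinary matrix product of the two PTMs."""
    assert len(first[0]) == len(second), "outputs of first must match inputs of second"
    return [[sum(row[k] * second[k][j] for k in range(len(second)))
             for j in range(len(second[0]))] for row in first]


def parallel(upper: Matrix, lower: Matrix) -> Matrix:
    """Parallel connection: tensor (Kronecker) product of the two PTMs."""
    return [[a * b for a in row_u for b in row_l]
            for row_u in upper for row_l in lower]


def m3(a, b, c):
    return int(a + b + c >= 2)


def inv(a):
    return 1 - a


def and2(a, b):
    return a & b


# Hypothetical circuit: M3(A, B, C) in parallel with NOT(D), both feeding an AND gate.
P_FAIL = 0.05  # assumed gate failure probability
stage1 = parallel(gate_ptm(m3, 3, P_FAIL), gate_ptm(inv, 1, P_FAIL))  # 16 x 4
circuit = series(stage1, gate_ptm(and2, 2, P_FAIL))                   # 16 x 2

# Average probability of a correct output over all 16 input patterns.
ideal = [and2(m3(a, b, c), inv(d)) for a, b, c, d in product((0, 1), repeat=4)]
reliability = sum(row[out] for row, out in zip(circuit, ideal)) / len(ideal)
print(f"average probability of a correct output: {reliability:.4f}")
```

Rows are ordered like `itertools.product`, so the last input is the least significant bit; the Kronecker product keeps that ordering, which is why the columns of `stage1` line up with the input rows of the AND gate.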

FAs design
The logical function of an FA can be described as (6), where A, B and C are the input signals, $C_o$ is the carry-out and S is the sum. With M3, the sum formula is presented as (7) [2]. It is clear that this FA consists of three M3 and two inverters. By reformulating the formula, another logical expression is presented as (8) [30]. Compared to (7), it is reduced by one inverter. With one M5, the logical function of the sum can also be described as (9). The schematics of the FAs for (7)-(9) are illustrated in Figs. 6a-c, marked as FA1-FA3, respectively. It is worth noting that the input C should be placed so as to facilitate the cascading of n FAs to implement an n-bit carry flow adder (CFA).
The corresponding QCA normalised FAs of these layouts are displayed in Figs. 7a-c, abbreviated as NFA1-NFA3. Each majority voter, inverter and crossover is labelled with M3 or M5, I and C, respectively (similarly hereinafter). The spacing between two wires is two cell widths and each clock zone contains at least two cells. It can be seen that the delay of the sum S for NFA1 and NFA2 is 1.25 clocks, whereas the latency of NFA3 is 1 clock. Figs. 8a-c show the 4 bit normalised CFAs implemented by cascading four corresponding FAs, denoted as NCFA1-NCFA3, respectively. The additional clock zone at the junction between two adjacent adders ensures correct signal transmission.
To allow comparison with previous work, we also design the corresponding simplified FAs (SFA1-SFA3) and 4 bit CFAs (SCFA1-SCFA3), which do not conform to the aforementioned design guidelines, as shown in Figs. 9 and 10, respectively. The main differences between NFAs and SFAs are the spacing between two wires and the minimal cell count in each clock zone. In the wiring of the 4 bit SCFAs, we use corner lines in place of straight lines to save circuit area.
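Expressions (6)-(9) themselves are not displayed in the text. The sketch below checks three majority-based sum expressions that match the gate counts stated above: three M3 with two inverters, three M3 with one inverter, and one M5 that takes the inverted carry twice. These are the forms usually quoted in the QCA adder literature; treating them as the exact content of (7)-(9) is an assumption, and the function names are illustrative.

```python
from itertools import product


def maj(*x: int) -> int:
    """Majority of an odd number of binary inputs."""
    return int(sum(x) > len(x) // 2)


def inv(x: int) -> int:
    return 1 - x


def carry(a: int, b: int, c: int) -> int:
    # Carry-out as one three-input majority: Co = AB + BC + AC
    return maj(a, b, c)


def sum_fa1(a: int, b: int, c: int) -> int:
    # Assumed form of (7): S = M3(~Co, M3(A, B, ~C), C), with two inverters
    return maj(inv(carry(a, b, c)), maj(a, b, inv(c)), c)


def sum_fa2(a: int, b: int, c: int) -> int:
    # Assumed form of (8): S = M3(~Co, M3(A, B, ~Co), C), one inverter reused
    n_co = inv(carry(a, b, c))
    return maj(n_co, maj(a, b, n_co), c)


def sum_fa3(a: int, b: int, c: int) -> int:
    # Assumed form of (9): S = M5(A, B, C, ~Co, ~Co), inverted carry weighted twice
    n_co = inv(carry(a, b, c))
    return maj(a, b, c, n_co, n_co)


for a, b, c in product((0, 1), repeat=3):
    expected = a ^ b ^ c  # sum bit of a full adder
    assert sum_fa1(a, b, c) == expected
    assert sum_fa2(a, b, c) == expected
    assert sum_fa3(a, b, c) == expected
print("all three sum expressions match A xor B xor C")
```

In all three variants the carry is produced by a single three-input majority, so only the way the sum is derived from it differs.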
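The cascading of FAs into an n-bit CFA described in this section can be sketched at the logic level as follows. The function names are illustrative assumptions, and the sketch models only the Boolean behaviour, not clock zones, wire spacing or latency.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """One-bit FA: returns (sum, carry-out) for inputs A, B and carry-in C."""
    carry_out = int(a + b + c >= 2)  # three-input majority of A, B and C
    return a ^ b ^ c, carry_out


def carry_flow_add(x: int, y: int, n_bits: int = 4, carry_in: int = 0) -> tuple[int, int]:
    """Add two n-bit numbers by cascading n FAs; returns (sum, final carry-out)."""
    total, carry = 0, carry_in
    for i in range(n_bits):  # least-significant bit first
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i      # place the sum bit of stage i
    return total, carry


total, carry = carry_flow_add(0b1011, 0b0110)
print(f"{total:04b} carry={carry}")  # prints "0001 carry=1" (11 + 6 = 17)
```

Each stage passes its carry-out to the carry input C of the next stage, which is why the placement of input C matters for cascading.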

Simulation results
To validate the devices and circuits designed in this paper, the bistable approximation engine of QCADesigner version 2.0.3, with the parameters summarised in Table 1, is employed to verify the functionality of the proposed circuits [31].
Fig. 11 illustrates the correct simulation results for the proposed five-input majority gate in Fig. 5, where $C_o$ carries a two-fold weight. Fig. 12 shows the simulation results for NFA1-NFA3, respectively. Apparently, all three circuits perform the correct functions. Fig. 13 displays the correct waveforms for the proposed NCFA1-NCFA3, respectively, where each colour pair corresponds to one input/output pair. We add one more input/output pair in red for the 4 bit NCFA2 and NCFA3 to indicate that they take 0.75 clock zones less than NCFA1. Fig. 14 depicts the simulation waveforms used to test the correctness of SFA1-SFA3, which confirms the validity of the SFAs. The waveforms of the three corresponding SCFAs are shown in Fig. 15. Each colour pair again corresponds to one input/output pair.

Performance comparison
In Fig. 6, we displayed three QCA FAs with different logic units, FA1-FA3, corresponding to various Boolean logic expressions. As mentioned above, PTM is used to analyse circuit reliability. The overall probability of failure for the FAs is shown in Fig. 16a. It is clear that FA3 has better reliability than both FA1 and FA2. Fig. 16b illustrates the probability of failure for S, which shows that FA2 and FA3 have equal robustness for S. Since all three carry-outs are generated by a three-input majority voter, the reliability of the three $C_o$ outputs is equal, as shown in Fig. 16c. The above analysis shows that the FA based on M5 is more robust than the other two adders. Table 2 lists the physical properties of the QCA FAs, all of which are coplanar structures; adders whose inputs or outputs are surrounded by the circuit are excluded. As noted before, adders NFA1-NFA3 conform to the aforementioned circuit design guidelines, resulting in more cells than SFA1-SFA3, which are proposed to obtain optimal FAs. The proposed SFA3 has a smaller area and fewer cells than all other adders, but its latency exceeds that of the adder in [25] by 0.25 clocks and that of the adder in [28] by 0.5 clocks.
As displayed in Table 3, the average, leakage and switching energy dissipations are compared with previous works at three tunnelling energy levels ($0.5E_k$, $E_k$, $1.5E_k$) and a temperature of 2.0 K. Figs. 17-19 illustrate considerable optimisation in the three energy dissipations with bar graphs according to the achieved results. It can be seen that SFA3 leads to almost 46.4, 19.4, 34.6, 40.5, 60.5 and 69.1% improvements in average energy dissipation compared with the referenced schemes. Fig. 20 shows the power dissipation maps of these FAs at the $0.5E_k$ tunnelling energy level and a temperature of 2.0 K. In this figure, the darker a cell is, the more energy it dissipates. However, the cell colours are not comparable from one FA to another.
Table 4 summarises the physical properties of the 4 bit QCA CFAs in terms of area, cell count and latency. The larger area and cell count of the NCFAs derive from the straight input/output wires and from the minimal cell count in each clock zone; the SCFAs, designed with bent wires, have a smaller area. Although SCFA3 has slightly more latency, it has the smallest area and the lowest cell count.

Conclusions
In this paper, we detailed circuit design guidelines, together with simulation and evaluation methods, that constrain the building blocks used to construct a normalised circuit system. To explain how these guidelines and methods work, three normalised FAs named NFA1-NFA3, based on different logical expressions, are designed. Moreover, three corresponding simplified FAs, denoted as SFA1-SFA3, are also proposed and compared with the NFAs and previous FAs. These FAs are then used to design 4 bit CFAs.
Reliability analysis with PTM shows that FA3, with its five-input majority gate, has a more robust structure than FA1 and FA2. The analysis of the FAs demonstrates that SFA3 achieves better properties than other FAs in some respects, i.e. its average energy dissipation is reduced by 46.4, 19.4, 34.6, 40.5, 60.5 and 69.1% compared with the schemes in the references. SCFA3, which is composed of SFA3, also ranks first in terms of area and cell count.

Acknowledgment
This work was supported by the National Natural Science Foundation of China (no. 61271122).

Fig. 3 QCA inverters: (a) two cells positioned diagonally in one clock zone, (b) one more cell added before the output

Fig. 9 QCA simplified FAs: (a) SFA1, (b) SFA2, (c) SFA3

Fig. 17 Average energy dissipation of FAs at various tunnelling energy levels

Fig. 19 Average switching energy dissipation of FAs at various tunnelling energy levels

Table 1
Bistable approximation parameters

Table 2
Physical properties for FAs

Table 4
Physical properties for 4 bit CFAs