BDD-Based Topology Optimization for Low-Power DTIG FinFET Circuits

This paper proposes a logic synthesis method based on the binary decision diagram (BDD) representation and optimized for dual-threshold independent-gate (DTIG) FinFET circuits. The BDD-based topology optimization algorithm is described in detail. An extraction algorithm identifies several kinds of feature subgraph structures in a BDD, and a mapping algorithm then maps them to predefined DTIG FinFET logic gates to obtain the final optimized circuit. Several MCNC benchmark circuits are synthesized with the proposed method and compared against the ABC and DC tools. The simulations show that the proposed method improves the performance of DTIG FinFET circuits.


Introduction
As a 3D transistor, the FinFET is more efficient than traditional devices because it suppresses the short-channel effect (SCE) and drain-induced barrier lowering (DIBL) [1][2][3][4][5][6]. A FinFET can operate in common-gate (CG) mode, in which the two gates are tied together and the device is used like a traditional single-gate device with improved performance, or in independent-gate (IG) mode, in which the two gates are driven separately and, under certain conditions, the device can be used as two single-gate devices connected in parallel or in series [2].
Today's VLSI circuits are usually designed with the standard-cell method [7]. In this design flow, synthesis is an important step that transforms the high-level design into a low-level netlist composed of standard cells. Currently, standard cell libraries are mostly built from CMOS or CG FinFET devices. Synthesis tools, such as the commercial Synopsys Design Compiler (DC), public-domain tools like ABC [8], and synthesis algorithms like factorization-based methods, usually rely on these single-gate standard cell libraries to optimize the circuit topology. However, according to our research, DTIG FinFET-based circuits perform well and can be used in modern VLSI circuits [9][10][11]. There is therefore an emerging need for a comprehensive synthesis method based on DTIG FinFETs.
As a powerful representation of logic functions, the binary decision diagram (BDD) is widely used in computer science for graph algorithms, matrix operations, and even artificial intelligence problems. It can also be applied in VLSI design to construct the topology of circuits and to check and optimize them [12][13][14]. The efficiency of a BDD is determined by the variable order of the function. Since the variable ordering problem is NP-complete [15,16], many heuristic methods have been developed to solve it approximately but efficiently [12][17][18][19].
This paper describes a BDD-based circuit topology optimization method that composes DTIG FinFET circuits from predefined DTIG FinFET basic logic cells. It is organized as follows. Section 2 briefly introduces the predefined DTIG FinFET logic gates. Section 3 states the theorems and the algorithm for extracting feature modules from a BDD, together with the mapping algorithm. Section 4 implements the algorithms and verifies the effectiveness of the method on the MCNC benchmarks. Section 5 concludes the paper.

DTIG FinFET Cell Library
Compared with single-gate devices such as CMOS or CG FinFETs, DTIG FinFETs allow more flexible circuit design by combining low-threshold (low-$V_{th}$) and high-threshold (high-$V_{th}$) devices [2,3,9,11,20]. We have built a small DTIG FinFET logic cell library for this work; Figure 1 shows two example cells and their CG counterparts. The DTIG FinFET logic cells have more compact structures than the CG FinFET logic cells and therefore offer advantages in transistor count, delay, and power dissipation, as the examples in Table 1 show. All gates in the library are composed of low-$V_{th}$ and high-$V_{th}$ transistors with parameters extracted from TCAD simulations and have been verified with Hspice using the BSIMIMG model from UC Berkeley [21].

The Algorithms of Synthesis DTIG Logic Circuits
First, we give some definitions related to BDDs and feature structures. Next, we state and prove some theorems on BDD subgraph extraction. Finally, we describe the implementation of the BDD-based extraction algorithm and the mapping algorithm.
Definitions. The binary decision diagram (BDD) was proposed by Akers [22] as a method for representing logic functions.
BDD-based methods have been widely used in the representation and design of VLSI circuits [23]. For the convenience of further discussion, we first introduce several definitions related to BDDs and subgraph extraction.
Definition 1 (binary decision diagram (BDD)). The BDD is defined in detail in [22,24]; here we briefly describe some concepts. A BDD, an example of which is shown in Figure 2(a), is a rooted directed acyclic graph that represents a Boolean function such as the one in (1). The BDD can be written formally as $G = \langle V, E \rangle$, where $V$ and $E$ are the node set and the edge set, respectively. $V$ contains two types of nodes: nonterminal nodes (circles in the figure) and terminal nodes (boxes in the figure). Every nonterminal node, labelled with an input variable $v_i \in V$ ($i = 1, 2, \ldots, n$), has two children, $\mathrm{low}(v_i) \in V$ and $\mathrm{high}(v_i) \in V$, and two corresponding edges, $\mathrm{else}(v_i) \in E$ and $\mathrm{then}(v_i) \in E$, connecting to the two children, respectively. In this paper, the $\mathrm{else}(v_i)$ and $\mathrm{then}(v_i)$ edges are drawn as dotted and solid lines in the figures, respectively. A terminal node $v_t \in V$ has no children or outgoing edges and is labelled only with a value $\mathrm{value}(v_t) \in \{0, 1\}$.
When a BDD is used to compute a logic function $f(v_1, v_2, \ldots, v_n)$, we recursively compute the function from the root node (here assumed to be $v_i$) down to the terminal node 0 or 1:
$$f(v_i) = \bar{v}_i \cdot f(\mathrm{low}(v_i)) + v_i \cdot f(\mathrm{high}(v_i)), \qquad (2)$$
where $\bar{v}_i$ is the complement of $v_i$. Because the product of 0 and any logic value is 0, we can omit the paths from the root to the 0-terminal and keep only the computations along the paths from the root to the 1-terminal.
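The recursive evaluation described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Node` class, the `evaluate` function, and the two-variable example are hypothetical names introduced here, not part of the original implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical BDD node: terminals carry `value`, nonterminals carry `var`."""
    var: Optional[str] = None
    low: Optional["Node"] = None    # child along the else edge (variable = 0)
    high: Optional["Node"] = None   # child along the then edge (variable = 1)
    value: Optional[int] = None     # 0 or 1 for terminal nodes


ZERO = Node(value=0)
ONE = Node(value=1)


def evaluate(node: Node, assignment: Dict[str, int]) -> int:
    """Apply the Shannon expansion f = (not x) * f(low) + x * f(high) recursively."""
    if node.value is not None:          # terminal node: return its label
        return node.value
    x = assignment[node.var]
    low_term = (1 - x) & evaluate(node.low, assignment)
    high_term = x & evaluate(node.high, assignment)
    return low_term | high_term


# Example BDD for f = a AND b (an assumed toy function).
b_node = Node(var="b", low=ZERO, high=ONE)
and_root = Node(var="a", low=ZERO, high=b_node)
```

For instance, `evaluate(and_root, {"a": 1, "b": 1})` follows both then edges to the 1-terminal. The sketch evaluates both branches at every node to mirror the formula literally; a practical evaluator would follow only the selected edge.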
The BDD referred to in this paper is actually the reduced ordered binary decision diagram (ROBDD) [25][26][27], a BDD variant in which the input variables follow a fixed order and the structure is simplified so that all nodes are distinct. Many algorithms in the literature obtain the ROBDD from a BDD by ordering the variables efficiently and removing redundant nodes [15][16][17][18][19].
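As a minimal sketch of how a reduced ordered BDD can be obtained, the following Python function applies the two standard reduction rules (skip a node whose children coincide, and share identical nodes through a unique table). The function name `build_robdd`, the node encoding, and the exhaustive construction are assumptions made for illustration; the construction is exponential in the number of variables and is only meant for small examples.

```python
def build_robdd(func, order):
    """Build a reduced ordered BDD of `func` for the variable list `order`.

    Returns (root_id, nodes), where nodes maps node id -> (var, low_id, high_id)
    and the ids 0 and 1 denote the terminal nodes.
    """
    unique = {}   # (var, low_id, high_id) -> node id, used to share equal nodes
    nodes = {}    # node id -> (var, low_id, high_id)

    def make_node(var, low, high):
        if low == high:                 # both edges agree: the test is redundant
            return low
        key = (var, low, high)
        if key not in unique:           # create the node only once
            unique[key] = len(unique) + 2
            nodes[unique[key]] = key
        return unique[key]

    def build(level, partial):
        if level == len(order):         # all variables assigned: terminal value
            return int(bool(func(**partial)))
        var = order[level]
        low = build(level + 1, {**partial, var: 0})
        high = build(level + 1, {**partial, var: 1})
        return make_node(var, low, high)

    return build(0, {}), nodes


# Assumed example function: f = (a AND b) OR c.
root, nodes = build_robdd(lambda a, b, c: (a and b) or c, ["a", "b", "c"])
```

For this example the result has three nonterminal nodes, one per variable, because the subfunction `c` is shared between the two branches of `a`.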

Definition 2 (feature structure). A feature structure is a special subgraph of a BDD, $G_{v_i} \subseteq G$ ($i = 1, 2, 3, \ldots$), that can be mapped to a DTIG FinFET basic logic gate such as AND/NAND, OR/NOR, AOI, XOR, or MUX. Here $G_{v_i}$ is the subgraph of $G$ rooted at $v_i$, together with its children and descendants.
From Figure 2(a), we can extract several BDD subgraphs, as shown in Figures 2(b)-2(f). Among them, the subgraphs in Figures 2(b)-2(e) can be realized with the DTIG FinFET logic gates AOI, OAI, AND, and NOR, respectively, so all of them are feature structures. For example, Figure 2(b) is a subgraph extracted from Figure 2(a) that can be realized by an AOI cell, and Figure 2(c) can be realized by an OAI cell, as shown in Figure 2(h). The structure in Figure 2(f) cannot be mapped to any logic cell and is therefore not a feature structure.
On the other hand, any nonterminal node in a BDD can always be realized directly by a MUX cell [22,28]. We can therefore map the BDD of the function in (1) directly to a MUX-based netlist, as shown in Figure 2(g).
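The direct MUX mapping can be sketched as follows: each nonterminal node becomes one 2:1 MUX whose select input is the node variable. The node encoding (id -> (var, low_id, high_id), with ids 0 and 1 as terminals), the net names, and the function name `bdd_to_mux_netlist` are assumptions chosen for this illustration.

```python
def bdd_to_mux_netlist(nodes, root):
    """Emit one ("MUX", output, select, in0, in1) tuple per reachable BDD node."""
    def net(node_id):
        # Terminals become constant nets; other nodes get a net named after their id.
        return f"const{node_id}" if node_id in (0, 1) else f"n{node_id}"

    netlist, seen, stack = [], set(), [root]
    while stack:
        node_id = stack.pop()
        if node_id in (0, 1) or node_id in seen:
            continue
        seen.add(node_id)
        var, low, high = nodes[node_id]
        # in0 is chosen when var = 0 (else edge), in1 when var = 1 (then edge).
        netlist.append(("MUX", net(node_id), var, net(low), net(high)))
        stack.extend((low, high))
    return netlist


# Assumed example: ROBDD of f = (a AND b) OR c with variable order a, b, c.
example_nodes = {2: ("c", 0, 1), 3: ("b", 2, 1), 4: ("a", 2, 3)}
example_netlist = bdd_to_mux_netlist(example_nodes, 4)
```

Because shared nodes are visited only once, the netlist size equals the number of nonterminal nodes, which is why a smaller BDD gives a smaller MUX-only circuit before any feature structures are substituted.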

Theorems of Extraction Algorithm
Theorem 3. For a nonterminal node $v_i$ of a BDD, assume that its two children are $v_x$ with node function $f(v_x)$ and $v_y$ with node function $f(v_y)$. If a nonterminal node $v_j$ with function $f(v_j)$ has one of its two outgoing edges connected to $v_i$ and the other connected to $v_x$ or $v_y$, then the node group $(v_i, v_j, v_x, v_y)$ forms a feature structure of AND/NAND/OR/NOR.
According to Definition 1, the subgraph in Figure 3(a) has the function shown in (3), from which (4) and (5) are derived.
From (4) and (5), we can map the BDD subgraph to one of the four feature structures NAND, AND, OR, and NOR, as shown in Figures 3(b)-3(e). The other cases satisfying the condition of Theorem 3 give similar results and can be proven in the same way.
Theorem 4. For a node group $(v_i, v_j, v_x, v_y)$ satisfying the condition of Theorem 3, assume that $v_x$ is a common child of $v_i$ and $v_j$, that $v_y$ is a child of $v_i$ only, and that there exists a nonterminal node $v_k$ with function $f(v_k)$. When one outgoing edge of $v_k$ connects to node $v_j$ and the other connects to node $v_x$, the nodes $v_i, v_j, v_x, v_y, v_k$ and their edges form a feature structure of three-input AND/NAND/OR/NOR logic.
Otherwise, when one outgoing edge of $v_k$ connects to node $v_j$ and the other connects to node $v_i$ or $v_y$, the nodes $v_i, v_j, v_x, v_y, v_k$ and their edges form a feature structure of three-input AOI/OAI logic.
Proof. We first consider the first case, shown in Figure 4(a). Assuming that $v_x = \mathrm{high}(v_i) = \mathrm{low}(v_j) = \mathrm{low}(v_k)$ and $v_y = \mathrm{low}(v_i)$, (2) in Definition 1 gives (6) and its derivations (7) and (8). From (7) and (8), we obtain the results shown in Figures 4(b)-4(e). For the other case in Theorem 4, we obtain (9) and its derivations (10) and (11) in the same way. From (10) and (11), we obtain the results of Theorem 4.
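The structure of Theorems 3 and 4 can be checked by brute force for one concrete edge assignment. The sketch below assumes the wiring $\mathrm{high}(v_i) = v_x$, $\mathrm{low}(v_i) = v_y$, $\mathrm{low}(v_j) = v_x$, $\mathrm{high}(v_j) = v_i$, and $\mathrm{high}(v_k) = v_j$; the helper names and the variable names `xi`, `xj`, `xk` (the variables labelling the three nodes) are introduced here for illustration only. It confirms that the cascaded node functions collapse into a single selection between $f(v_x)$ and $f(v_y)$ controlled by an AND-type, AND-OR-type, or OR-AND-type expression.

```python
from itertools import product


def mux(sel, low, high):
    """One BDD node as a Shannon expansion: `high` if sel is true, else `low`."""
    return high if sel else low


def check_theorems_3_and_4():
    """Exhaustively compare the node cascades with single-gate selections."""
    for xi, xj, xk, fx, fy in product((0, 1), repeat=5):
        f_i = mux(xi, low=fy, high=fx)     # v_i: then -> v_x, else -> v_y
        f_j = mux(xj, low=fx, high=f_i)    # v_j: then -> v_i, else -> v_x
        # Theorem 3: two-input AND-type selection (xi appears inverted here).
        ok3 = f_j == mux(xj and not xi, low=fx, high=fy)
        # Theorem 4, case 1: v_k's other edge goes to v_x -> three-input AND type.
        f_k1 = mux(xk, low=fx, high=f_j)
        ok4a = f_k1 == mux(xk and xj and not xi, low=fx, high=fy)
        # Theorem 4, case 2a: other edge goes to v_y -> AND-OR (AOI-type) selection.
        f_k2 = mux(xk, low=fy, high=f_j)
        ok4b = f_k2 == mux((not xk) or (xj and not xi), low=fx, high=fy)
        # Theorem 4, case 2b: other edge goes to v_i -> OR-AND (OAI-type) selection.
        f_k3 = mux(xk, low=f_i, high=f_j)
        ok4c = f_k3 == mux((not xi) and ((not xk) or xj), low=fx, high=fy)
        if not (ok3 and ok4a and ok4b and ok4c):
            return False
    return True
```

Other edge polarities give the complementary or dual forms (NAND/OR/NOR and their three-input variants) by the same argument.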
Theorem 5. For two nodes $v_i$, $v_j$ in a BDD labelled with the same variable, assume that they are the low child and the high child of the same node $v_k$, respectively. If the low child $v_i$ has two outgoing edges, then and else, connecting to $v_x$ and $v_y$, respectively, while the high child $v_j$ has two outgoing edges, then and else, connecting to $v_y$ and $v_x$, respectively, then the subgraph including $v_i, v_j, v_x, v_y$ can be constructed as a feature structure of XOR/XNOR logic.
Proof. The condition of Theorem 5 is illustrated in Figure 5.
According to (2) in Definition 1, we obtain the derivation leading to (13). From (13), the subgraph can be constructed as XOR/XNOR logic, as shown in Figures 5(b) and 5(c).
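The XOR structure can also be confirmed by enumeration. The sketch below assumes the wiring stated in Theorem 5 (the low child sends its then edge to $v_x$ and its else edge to $v_y$, and the high child does the opposite); `x` denotes the variable shared by the two children and `xk` the variable of their father, both names being illustrative assumptions.

```python
from itertools import product


def mux(sel, low, high):
    """One BDD node as a Shannon expansion: `high` if sel is true, else `low`."""
    return high if sel else low


def check_theorem_5():
    """Check that the three-node pattern selects f(v_x) exactly when xk XOR x is 1."""
    for xk, x, fx, fy in product((0, 1), repeat=4):
        f_i = mux(x, low=fy, high=fx)     # low child: then -> v_x, else -> v_y
        f_j = mux(x, low=fx, high=fy)     # high child: then -> v_y, else -> v_x
        f_k = mux(xk, low=f_i, high=f_j)  # father node selects between the children
        if f_k != mux(xk ^ x, low=fy, high=fx):
            return False
    return True
```

Swapping the roles of $v_x$ and $v_y$ gives the XNOR form, which is why the theorem covers both.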
BDD-Based Node Extraction Algorithm. Based on the definitions and theorems above, we have developed a feature structure extraction algorithm for a given BDD. It consists of four subroutines. Subroutine 1 outputs the reduced ordered form of the BDD from an input logic function f. Subroutines 2 to 4 then extract the BDD subgraphs of the feature structures in succession; in each subroutine, parts of the BDD are marked as feature structures of a particular kind. The order of the subroutines is chosen carefully. As shown in Table 1, in the proposed DTIG FinFET cell library, NAND/NOR cells save the most transistors relative to their single-gate counterparts, AOI/OAI/NAND3/NOR3 cells the second most, and XOR cells the fewest. The feature structures are extracted in this order to obtain the greatest improvement. When the whole procedure is finished, we obtain the optimized BDD and the feature structures, which are passed to the mapping algorithm for further processing.
Subroutine 1, shown in Algorithm 1, generates an initial reduced ordered BDD from the input logic function by using a sorting algorithm similar to the traditional bubble sort. When subroutine 1 is finished, an optimal ordered BDD form is obtained.
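The listing of Algorithm 1 is not reproduced here, so the following Python sketch is only one plausible reading of a bubble-sort-like ordering pass: adjacent variables are swapped, and a swap is kept only if it reduces the ROBDD node count. The cost function, the stopping rule, and the names `robdd_size` and `bubble_order` are all assumptions, and the exhaustive size computation is practical only for small functions.

```python
def robdd_size(func, order):
    """Count the nonterminal nodes of the ROBDD of `func` under `order`."""
    unique = {}   # (var, low_id, high_id) -> node id

    def build(level, partial):
        if level == len(order):
            return int(bool(func(**partial)))
        var = order[level]
        low = build(level + 1, {**partial, var: 0})
        high = build(level + 1, {**partial, var: 1})
        if low == high:                     # redundant node is skipped
            return low
        return unique.setdefault((var, low, high), len(unique) + 2)

    build(0, {})
    return len(unique)


def bubble_order(func, order):
    """Greedy adjacent-swap pass, repeated until no swap shrinks the BDD."""
    order = list(order)
    best = robdd_size(func, order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            order[i], order[i + 1] = order[i + 1], order[i]
            size = robdd_size(func, order)
            if size < best:
                best, improved = size, True
            else:                           # undo a swap that does not help
                order[i], order[i + 1] = order[i + 1], order[i]
    return order, best


def example_function(a1, b1, a2, b2):
    """Assumed test function: f = a1*b1 + a2*b2."""
    return (a1 and b1) or (a2 and b2)
```

Starting from the interleaved order `["a1", "a2", "b1", "b2"]`, the pass moves `b1` next to `a1` and reduces the node count. Like any local heuristic, it can stop at a local minimum, so the result should be read as a good order rather than a guaranteed optimum.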
Subroutine 2, shown in Algorithm 2, searches for a father $v_j$ of each node $v_i$ in the optimal ordered BDD from subroutine 1.
We mark $v_j$ as a father of $v_i$, and $v_x$, $v_y$ as the two children of $v_i$. If the node group $(v_i, v_j, v_x, v_y)$ meets the condition of Theorem 3, i.e., both $v_i$ and $v_x$ (or $v_y$) are children of $v_j$, we extract the subgraph containing the nodes $v_i, v_j, v_x, v_y$ and their edges and mark it as a feature structure of OR/NOR ($f_{\text{OR-NOR}}$) or AND/NAND ($f_{\text{AND-NAND}}$). When this subroutine is finished, all feature structures of OR/NOR and AND/NAND have been extracted and stored in the sets Gso and Gsa, respectively.
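A compact sketch of this search is given below. The node encoding (id -> (var, low_id, high_id), ids 0 and 1 for terminals) and the function name are assumptions. The classification rule is also an assumption derived from the Shannon expansion rather than taken from the listing: when the shared child hangs on the then edge of the father, the selection is OR-type, and when it hangs on the else edge, the selection is AND-type.

```python
def extract_two_input_features(nodes):
    """Return (gso, gsa): OR/NOR-type and AND/NAND-type groups (v_j, v_i, v_x, v_y)."""
    gso, gsa = [], []
    for vi, (_, vx, vy) in nodes.items():          # v_x, v_y: children of v_i
        for vj, (_, low_j, high_j) in nodes.items():
            if vj == vi:
                continue
            if low_j == vi and high_j in (vx, vy):
                # v_j reaches the shared child directly when its variable is 1.
                gso.append((vj, vi, vx, vy))
            elif high_j == vi and low_j in (vx, vy):
                # v_j reaches the shared child directly when its variable is 0.
                gsa.append((vj, vi, vx, vy))
    return gso, gsa


# Assumed example: ROBDD of f = (a AND b) OR c with variable order a, b, c.
example_nodes = {2: ("c", 0, 1), 3: ("b", 2, 1), 4: ("a", 2, 3)}
example_gso, example_gsa = extract_two_input_features(example_nodes)
```

In this example, nodes 3 and 2 form an OR-type group (b OR c), and nodes 4 and 3 form an AND-type group (a AND b selecting between the c node and the 1-terminal). The two groups overlap in node 3, which is the kind of covered subgraph the mapping step has to resolve.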
Subroutine 3, shown in Algorithm 3, searches for a node group $(v_i, v_j, v_x, v_y, v_k)$ that meets the condition of Theorem 4. We check each subgraph in the result set Gsa from subroutine 2 against the original BDD node set $G$ from subroutine 1. If a node $v_k$ in $G$ satisfies one of the cases of Theorem 4, we extract a new subgraph containing $v_k$, the corresponding subgraph in Gsa, and their edges, and mark it as a new feature structure of AND3/NAND3 or AOI. We then store this feature structure in the set Gsand3 for AND3/NAND3 or in the set Gsaoi for AOI. Finally, the corresponding subgraph in Gsa is deleted because it is covered by the newly generated feature structure.
According to Theorem 5, subroutine 4, shown in Algorithm 4, searches the optimal ordered BDD from subroutine 1 for special groups of nodes and constructs them as new feature structures. A group of nodes satisfies the conditions of Theorem 5 if (a) a node (denoted $v_k$) is the common father of two other nodes (denoted $v_i$ and $v_j$) labelled with the same variable, (b) two nodes are common children of $v_i$ and $v_j$, and (c) $\mathrm{low}(v_i) = \mathrm{high}(v_j)$ and $\mathrm{low}(v_j) = \mathrm{high}(v_i)$. In that case, we extract the group and its edges as a feature structure of XOR logic and store it in the set Gsxor.
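Conditions (a) to (c) translate into a short search. The sketch below uses an assumed node encoding (id -> (var, low_id, high_id), ids 0 and 1 for terminals) and a hypothetical function name; it is an illustration of the conditions, not the original MATLAB code.

```python
def extract_xor_features(nodes):
    """Return a list of XOR groups (v_k, v_i, v_j, v_x, v_y) found in the BDD."""
    gsxor = []
    for vk, (_, vi, vj) in nodes.items():          # v_i = low(v_k), v_j = high(v_k)
        if vi not in nodes or vj not in nodes:
            continue                               # a child is a terminal node
        var_i, low_i, high_i = nodes[vi]
        var_j, low_j, high_j = nodes[vj]
        same_variable = var_i == var_j             # condition (a)
        crossed = low_i == high_j and low_j == high_i   # conditions (b) and (c)
        if same_variable and crossed:
            gsxor.append((vk, vi, vj, high_i, low_i))
    return gsxor


# Assumed example: ROBDD of f = a XOR b with variable order a, b.
xor_nodes = {2: ("b", 0, 1), 3: ("b", 1, 0), 4: ("a", 2, 3)}
xor_groups = extract_xor_features(xor_nodes)
```

Here the root 4 and its two children 2 and 3 are reported as one XOR group whose shared children are the two terminals.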
When these subroutines are all finished, the proposed extraction algorithm generates the optimal ordered BDD form of a given logic function and obtains all the feature structures in the BDD.
Algorithm 2 (excerpt):
  …
  if high(v_j) = v_x OR high(v_j) = v_y then
    …
    Gsa[] ← f_AND-NAND
  end if
  end if
  end for
  end for

Algorithm 3 (excerpt):
  …
  if low(v_k) = v_j OR high(v_k) = v_j then
    if v_k satisfies Theorem 4 case 1 then
      …
      Gsaoi[] ← f_AOIOAI
    end if
  end if
  end for
  end for

The Mapping Algorithm. The extraction algorithm described above aims only at simplifying Boolean functions. In the subsequent logic synthesis steps, the logic gates must be replaced with physical cells from the cell library by a mapping algorithm. We propose a feature structure mapping algorithm with four steps. In step 1, after reading the BDD and its subgraphs from the extraction algorithm, we exclude the redundant (covered) subgraphs. In step 2, we map the source BDD completely to a circuit composed of MUXs and map the feature structures to the IG FinFET logic gates. In step 3, we obtain the final optimized circuit by replacing some MUX subcircuits with logic cells.

Algorithm Implementation
The synthesis algorithms, including extraction and mapping, are implemented in MATLAB. To compare the effectiveness of the circuit optimization, the ABC and DC synthesis tools are also applied to the same circuits. For a fair comparison, all methods use the same DTIG FinFET cell library we built in Section 2. Finally, all the circuits from ABC, DC, and the proposed method are simulated using Hspice with the BSIMIMG model from UC Berkeley [21]. Figure 6 presents the simulation results for the MCNC benchmark circuits.

Algorithm 4 (excerpt):
  …
  for each v_i in G[] do
    …
    Gsxor …
As shown in Figure 6(a), for almost all tested circuits, the circuit optimized by this work has the lowest transistor count among the compared methods. Since area is determined by the number of transistors, the proposed method achieves the smallest area in most cases.
The average power dissipation can be expressed by
$$P_{AV} = N \cdot P_{av} = N \cdot (p_{low} P_{low} + p_{high} P_{high}), \qquad (14)$$
where $P_{AV}$ is the average power of a circuit, $N$ is the transistor count of the circuit, $P_{av}$ is the average power of one transistor, $P_{low}$ and $P_{high}$ are the average power of one low-$V_{th}$ and one high-$V_{th}$ IG FinFET, respectively, and $p_{low}$ and $p_{high}$ are the proportions of low-$V_{th}$ and high-$V_{th}$ IG FinFETs in the circuit, respectively. From (14), if the proportions of the two transistor types are similar across circuits, the average power of a circuit is determined by its transistor count. Figure 6(b) shows that the trend of the power dissipation follows the transistor count trend in Figure 6(a), and almost all the circuits synthesized by this work have the lowest power dissipation, which agrees well with the prediction of (14).
The delay analysis is more complicated than the power analysis. The maximum delay of a circuit depends on its critical path: the more transistors on the critical path, the greater the delay. In a DTIG FinFET circuit, a high-threshold device adds more delay because its low on-current reduces the switching speed. As Figure 6(c) shows, the circuits synthesized by this work therefore have no obvious advantage in delay.
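The power relation described above (circuit power as the transistor count times a proportion-weighted per-transistor power) can be written as a small helper. The function names and all numbers in the example are hypothetical and are not taken from the simulations.

```python
import math


def average_circuit_power(n_transistors, p_low, p_high, power_low, power_high):
    """Transistor count times the weighted average power of one transistor."""
    if not math.isclose(p_low + p_high, 1.0):
        raise ValueError("p_low and p_high must sum to 1")
    per_transistor = p_low * power_low + p_high * power_high
    return n_transistors * per_transistor


def power_delay_product(power, delay):
    """PDP as the product of average power and delay."""
    return power * delay


# Hypothetical comparison: same device mix, different transistor counts.
power_small = average_circuit_power(80, 0.5, 0.5, 2.0, 1.0)
power_large = average_circuit_power(100, 0.5, 0.5, 2.0, 1.0)
```

With an identical mix of low- and high-threshold devices, the ratio of the two power values equals the ratio of the transistor counts, which is the behavior the text reads off Figure 6(b).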
The power-delay product (PDP) evaluates a circuit more comprehensively because it accounts for both delay and power consumption. As Figure 6(d) shows, all of the circuits synthesized by this work have an obvious PDP advantage over those from ABC and DC.

Conclusion
In this paper, we have presented a BDD-based synthesis method to optimize DTIG FinFET circuits. We search the BDD of an input logic function for feature structures and map them to DTIG FinFET basic logic gates. The algorithms are implemented in MATLAB and compared with ABC and DC by simulating the MCNC benchmark circuits. The results show that the proposed method can significantly improve DTIG FinFET circuits in area, power consumption, and PDP.

Figure 3: One case of Theorem 3 and relative feature structures.(a) One case of BDD subgraph, (b) NAND structure, (c) AND structure, (d) OR structure, and (e) NOR structure.

Algorithm 2: The algorithm of subroutine 2 in the BDD-based extraction algorithm.

Algorithm of Subroutine 3
Input: Gsa[]: set of AND/NAND2 sub-graphs from subroutine 2
Input: G[]: node set of BDD from subroutine 1
Output: Gsaoi[]: set of AOI/OAI sub-graphs
Output: Gsand3[]: set of AND/NAND3 sub-graphs
for each gs in Gsa do
  v_j ← root(gs)
  for each v_k in BDD do
    …

Algorithm 3: The algorithm of subroutine 3 in the BDD-based extraction algorithm.
Steps 1 to 3 of the mapping algorithm are illustrated in Figures 2(a)-2(h), and the result is shown in Figure 2(i).

Algorithm of Subroutine 4
Input: G[]: node set of source BDD from subroutine 1
Output: Gsxor[]: set of XOR/XNOR sub-graphs
…
for each v_k in G[] do
  …

Figure 6: Comparison between the simulation results of ABC, DC, and this work.
The study is funded by the National Natural Science Foundation of China [61671259] and the Zhejiang Provincial Natural Science Foundation [LY19F010005].

Table 1: Performance comparison of some example cells.
Algorithm 1: The algorithm of subroutine 1 in the BDD-based extraction algorithm.