research-article
Open Access

Construction of All Multilayer Monolithic RSMTs and Its Application to Monolithic 3D IC Routing

Published: 18 December 2023


Abstract

Monolithic three-dimensional (M3D) integration allows ultra-thin silicon tier stacking in a single package. This high-density stacking is attracting growing interest because it offers smaller footprint area, shorter wirelength, higher performance, and lower power consumption than conventional planar fabrication technologies. The physical design of M3D integrated circuits requires several design steps, such as three-dimensional (3D) placement, 3D clock-tree synthesis, 3D routing, and 3D optimization. Among these, 3D routing is significantly time consuming due to countless routing blockages. Therefore, 3D routers proposed in the literature split routing into two substeps: they first insert monolithic interlayer vias (MIVs) and then perform tier-by-tier routing. In this article, we propose an algorithm to build a routing topology database (DB) used to construct all multilayer monolithic rectilinear Steiner minimum trees on the 3D Hanan grid. To demonstrate the effectiveness of the DB in various applications, we use the DB to construct timing-driven 3D routing topologies and perform congestion-aware global routing on 3D designs. We anticipate that the algorithm and the DB will help 3D routers reduce the runtime of the MIV insertion step and improve the quality of the 3D routing.


1 INTRODUCTION

Monolithic Three-Dimensional (M3D) integration stacks very thin silicon tiers and electrically connects transistors in different tiers by Monolithic Interlayer Vias (MIVs). Unlike through-silicon vias, MIVs are tiny in width, shorter in vertical length (z-directional), and presumed to have insignificant parasitic Resistance and Capacitance (RC). The dimensions of MIVs are even smaller than those of top-level interlayer vias and comparable with those of lower-level interlayer vias (even smaller than 100 nm in diameter). As a result, the use of MIVs is considered to have almost negligible area and capacitance overheads in the M3D Integrated Circuit (IC) layout design. Moreover, M3D ICs are favorable choices in terms of power, performance, and area for advanced technology nodes and future transistor architectures over their through-silicon-via-based counterparts [22]. In addition, designing vertical processors using M3D showed power, performance, and thermal efficiency [3]. However, profusely inserting MIVs into an M3D IC layout increases routing congestion since planar wires are connected to the MIVs, and routing of the planar wires requires much larger area than the MIV area. Therefore, M3D IC layout design generally tries to minimize the number of MIVs inserted into a layout [1, 11, 20], which necessitates the Three-Dimensional (3D) routing to reduce the number of MIVs and routing congestion.

Algorithms for 3D routing can route the Two-Dimensional (2D) and 3D nets of a design either separately or concurrently. For example, the routing methodology used in previous work [4] routes 3D nets first and then routes 2D nets. In contrast, Panth et al. [19] route 2D and 3D nets simultaneously by feeding modified library files into a commercial tool. However, the latter has some drawbacks compared to the former. According to our routing simulations using modified library files, the runtime of the simultaneous routing of 2D and 3D nets increases significantly as the complexity of a design (the average net degree, the net count, the number of tiers, the number of instances, and most importantly the number of routing blockages representing MIVs) goes up. By contrast, if the 3D nets are routed first, 2D nets can be routed separately in each tier (either one tier after another or all tiers at once but independently of each other). Thus, the 3D-net-first routing methodology has been used extensively in the literature [4, 6, 20, 21].

Figure 1 illustrates the 3D-net-first routing methodology, which first finds MIV locations for each 3D net, then inserts MIVs at those locations and decomposes the 3D net into multiple 2D nets, and finally routes all the 2D nets separately in each tier. In Figure 1(a), eight pins are spread out over two tiers, with one 3D net connecting all four a pins and two 2D nets connecting the two b pins and the two c pins. In Figure 1(b), a 3D routing topology using two z-directional edges is constructed for the 3D net. The z-directional edges are replaced by MIVs, and the 3D net is decomposed into three 2D nets, \(n_1\), \(n_2\), and \(n_3\) in Figure 1(c). Once the MIV locations are fixed, routing the 3D net reduces to routing these three 2D nets, two in the bottom tier and one in the top tier. Finally, the 2D nets are routed in each tier in Figure 1(d).

Fig. 1. 3D-net-first routing. (a) Three nets to route. (b) 3D routing topology generation for the 3D net. (c) MIV insertion. (d) Tier-by-tier routing.

As mentioned earlier, 3D routing should minimize the number of MIVs used to route 3D nets and evenly distribute the MIVs as well as the planar wires over the entire layout area for routing congestion minimization. The MIV insertion methodologies used in the literature, however, do not control the MIV count, MIV locations, and planar wires of 3D nets effectively. For example, the 3D Rectilinear Steiner Tree (RST) algorithms used in other works [5, 6, 20, 21] do not guarantee the minimization of the MIV count. The MIV insertion algorithm used in previous work [4] minimizes the MIV count but fails to minimize the planar wirelength. Multilayer Obstacle-Avoiding Rectilinear Steiner Tree (MLOARST) construction algorithms can minimize both the MIV count and the planar wirelength [8, 14]. However, they do not generate multiple routing topologies that have different MIV locations and planar wire distributions.

In our preliminary work [10], we proposed an algorithm to build a routing topology Database (DB) for the construction of All Multilayer Monolithic Rectilinear Steiner Minimum Trees (AMM-RSMTs) on the 3D Hanan grid for a set of pin locations for up to six-pin nets and four tiers. Multilayer Monolithic Rectilinear Steiner Minimum Trees (MMRSMTs) have the shortest planar wirelength with the minimum number of MIVs, so MIV insertion algorithms can use the DB to effectively optimize MIV locations and planar wires of 3D nets. We also proposed DB size reduction techniques for practical use of the DB in previous work [10]. This article is an extension of that previous work [10]. In this article, we include the algorithm to construct all the 3D Potentially Optimal Steiner Trees (POSTs) and show the generation and techniques to reduce the size of the DB from our previous work [10]. We present two applications using the DB: timing-driven 3D routing topology generation and congestion-aware 3D global routing. We also propose a 3D optimal net-breaking technique for the congestion-aware 3D global routing. Our core contributions in this article are listed as follows:

We apply the AMM-RSMT DB from the previous work to construct timing-driven MMRSMTs considering several objectives and compare the outcomes with a FLUTE-like Brute-Force (BF) approach.

In addition, we apply the AMM-RSMT DB (ARD) to congestion-aware global routing of 3D designs aiming to minimize the total overflow of both planar and 3D global routing edges.

We propose a hybrid 3D net-breaking technique for the higher-degree nets and introduce an optimal 3D net-breaking technique as a part of that for the congestion-aware 3D global routing.

We also present Position Sequence (PS) algebra to aid readers in applying congruent rules to generate 2D and 3D POSTs at the very end.

The rest of this article is organized as follows. In Section 2, we discuss the terminologies used in this work, review the Rectilinear Steiner Minimum Tree (RSMT) construction of FLUTE and the necessity of generating POSTs in the 2D Hanan grid, and discuss the concept of MMRSMT. In Sections 3 and 4, we present the algorithm to construct the ARD and show the outcomes obtained from the construction of all the 3D POSTs in the 3D Hanan grid, and details of the DB generation and size reduction from previous work [10]. Sections 5 and 6 demonstrate the applications of our ARD to timing-driven 3D routing topology generation and congestion-aware global routing of 3D designs, respectively, and compare a FLUTE-like BF approach with ours showing the detailed results for several two-, three-, and four-tier 3D designs. Finally, we summarize and conclude in Section 7.


2 PRELIMINARIES

In this section, we explain terminologies used in this article, review two papers on the construction of RSMTs [2, 9], and formulate the problem we solve in this work.

2.1 Terminologies

2.1.1 2D Hanan Grid.

Suppose a finite set S of points is given in the 2D plane. Let \(X_S = \lbrace x_1,\ldots ,x_L\rbrace ~(x_1 \le \cdots \le x_L)\) and \(Y_S = \lbrace y_1,\ldots ,y_M\rbrace ~(y_1 \le \cdots \le y_M)\) be the sets of the x- and y-coordinates of the points in S, respectively. Then, the 2D Hanan grid constructed for S is a graph \(G_S=(V_S,E_S),\) where \(V_S\) and \(E_S\) are defined as follows: \(\begin{eqnarray*} V_S=\lbrace (x, y) | x \in X_S, y \in Y_S\rbrace , E_S = E_{S,X} \cup E_{S,Y}, \\ E_{S,X} = \lbrace (v_1,v_2) | v_1=(x_i,y_j) \in V_S, v_2=(x_{i+1},y_j) \in V_S\rbrace , \\ E_{S,Y} = \lbrace (v_1,v_2) | v_1=(x_i,y_j) \in V_S, v_2=(x_i,y_{j+1}) \in V_S\rbrace . \end{eqnarray*}\) Figure 2(a) shows the 2D Hanan grid constructed for the five points, \(\lbrace p_1,\ldots , p_5\rbrace\).

Fig. 2. 2D and 3D Hanan grids.

2.1.2 3D Hanan Grid.

Suppose a finite set T of points is given in the 3D space. Let \(X_T=\lbrace x_1,\ldots ,x_L\rbrace ~(x_1 \le \cdots \le x_L)\), \(Y_T=\lbrace y_1,\ldots ,y_M\rbrace ~(y_1 \le \cdots \le y_M)\), and \(Z_T=\lbrace z_1,\ldots ,z_N\rbrace ~(z_1 \le \cdots \le z_N)\) be the sets of the x-, y-, and z-coordinates of the points in T, respectively. Then, the 3D Hanan grid constructed for T is a graph \(G_T=(V_T,E_T),\) where \(V_T\) and \(E_T\) are defined as follows: \(\begin{eqnarray*} V_T=\lbrace (x, y, z) | x \in X_T, y \in Y_T, z \in Z_T\rbrace , E_T = E_{T,X} \cup E_{T,Y} \cup E_{T,Z}, \\ E_{T,X} = \lbrace (v_1,v_2) | v_1=(x_i,y_j,z_k) \in V_T, v_2=(x_{i+1},y_j,z_k) \in V_T\rbrace , \\ E_{T,Y} = \lbrace (v_1,v_2) | v_1=(x_i,y_j,z_k) \in V_T, v_2=(x_i,y_{j+1},z_k) \in V_T\rbrace , \\ E_{T,Z} = \lbrace (v_1,v_2) | v_1=(x_i,y_j,z_k) \in V_T, v_2=(x_i,y_j,z_{k+1}) \in V_T\rbrace . \end{eqnarray*}\) Figure 2(b) shows the 3D Hanan grid constructed for the five points, \(\lbrace p_6,\ldots , p_{10}\rbrace\).
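The definition of the 3D Hanan grid can be illustrated with a minimal Python sketch. The function name `hanan_grid_3d` and the sample pins are illustrative assumptions, not taken from the article. The sketch also assumes that repeated coordinates are merged before sorting.

```python
from itertools import product

def hanan_grid_3d(points):
    """Return (vertices, edges) of the 3D Hanan grid for (x, y, z) points."""
    # Distinct, sorted coordinates along each axis (X_T, Y_T, Z_T).
    xs = sorted({p[0] for p in points})
    ys = sorted({p[1] for p in points})
    zs = sorted({p[2] for p in points})
    vertices = set(product(xs, ys, zs))
    edges = set()
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            for k, z in enumerate(zs):
                # Connect each vertex to its next neighbor along x, y, and z.
                if i + 1 < len(xs):
                    edges.add(((x, y, z), (xs[i + 1], y, z)))
                if j + 1 < len(ys):
                    edges.add(((x, y, z), (x, ys[j + 1], z)))
                if k + 1 < len(zs):
                    edges.add(((x, y, z), (x, y, zs[k + 1])))
    return vertices, edges

# Hypothetical pins: three distinct x, three distinct y, two tiers.
pins = [(0, 0, 0), (4, 7, 1), (9, 3, 1)]
vertices, edges = hanan_grid_3d(pins)
print(len(vertices), len(edges))  # 18 vertices and 33 edges
```

For L, M, and N distinct x-, y-, and z-coordinates, the grid has L·M·N vertices and (L-1)·M·N + L·(M-1)·N + L·M·(N-1) edges, which matches the counts printed above.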

2.1.3 Position Sequence.

\(x_+\)- and \(x_-\)-directions are the directions along which x-coordinates increase and decrease, respectively. \(y_{\pm }\)- and \(z_{\pm }\)-directions are defined similarly.

Suppose a finite set \(P=\lbrace p_1,\ldots ,p_n\rbrace\) of n distinct pins is given. Let the x-coordinates of the y-directional edges of the 2D Hanan grid \(G_P\) be \(x_1\) to \(x_n\) from the left and the y-coordinates of the x-directional edges of \(G_P\) be \(y_1\) to \(y_n\) from the bottom as shown in Figure 2(a). Then, we denote sorting the pins in the increasing and decreasing order of their c-coordinates (c is x or y) by \(c_+\) and \(c_-\), respectively. In Figure 2(a), for example, \(y_+\) sorting leads to the ordered list \(L_1=(p_3, p_1, p_5, p_4, p_2)\).

Suppose we obtain an ordered list \(L = (l_1,\ldots , l_n)\) from \(c_+\) or \(c_-\) sorting. Then, we can obtain the indices of the \(\bar{c}\)-coordinates (if c is x (or y), \(\bar{c}\) is y (or x)) of the pins in the \(\bar{c}_+\)- or \(\bar{c}_-\)-direction from L. For example, we obtain \((3 1 5 4 2)\) and \((3 5 1 2 4)\) if we extract the x-coordinates of the pins in \(L_1\) in the \(x_+\) and \(x_-\) directions, respectively. The PS \(\Gamma _{(s,r)}\) for P is a sequence \((k_1 k_2 ... k_n)\) where \(s \in \lbrace c_+,c_-\rbrace\), \(r \in \lbrace \bar{c}_+, \bar{c}_-\rbrace\), and \(k_i\) is the index of the \(\bar{c}\)-coordinate of the i-th pin in the r-direction in the list of the pins sorted by s sorting. For example, assume that \((s,r)\) is \((y_+,x_+)\) and P is the set of pins in Figure 2(a). Then, we first sort the pins along the \(y_+\)-direction, which leads to the ordered list \((p_3,p_1,p_5,p_4,p_2)\), and obtain \(\Gamma _{(y_+,x_+)} = (3 1 5 4 2)\). Similarly, \(\Gamma _{(y_+,x_-)}\) is \((3 5 1 2 4)\), \(\Gamma _{(y_-,x_+)}\) is \((2 4 5 1 3)\), \(\Gamma _{(y_-,x_-)}\) is \((4 2 1 5 3)\), \(\Gamma _{(x_+,y_+)}\) is \((2 5 1 4 3)\), \(\Gamma _{(x_+,y_-)}\) is \((4 1 5 2 3)\), \(\Gamma _{(x_-,y_+)}\) is \((3 4 1 5 2)\), and \(\Gamma _{(x_-,y_-)}\) is \((3 2 5 1 4)\).
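The PS definition can be checked with a short Python sketch. The function name `position_sequence`, the string encoding of directions (such as `"y+"`), and the pin coordinates are assumptions made for illustration. The coordinates are hypothetical values chosen only to have the same relative order as the pins in Figure 2(a).

```python
def position_sequence(pins, s, r):
    """Compute the PS Gamma_(s, r) for 2D pins with distinct x and y coordinates.

    s and r are strings such as "y+" or "x-": an axis letter followed by a sign.
    """
    axis = {"x": 0, "y": 1}
    s_axis, r_axis = axis[s[0]], axis[r[0]]
    # Sort the pins along the s-direction.
    ordered = sorted(pins, key=lambda p: p[s_axis], reverse=(s[1] == "-"))
    # Rank the other coordinate along the r-direction (1-based indices).
    ranked = sorted((p[r_axis] for p in pins), reverse=(r[1] == "-"))
    rank = {c: i + 1 for i, c in enumerate(ranked)}
    return tuple(rank[p[r_axis]] for p in ordered)

# Hypothetical coordinates for p1..p5: y+ sorting gives (p3, p1, p5, p4, p2)
# and p_i has the i-th smallest x-coordinate, as in the example above.
pins = [(10, 20), (20, 50), (30, 10), (40, 40), (50, 30)]

print(position_sequence(pins, "y+", "x+"))  # (3, 1, 5, 4, 2)
print(position_sequence(pins, "x-", "y-"))  # (3, 2, 5, 1, 4)
```

With these coordinates, the remaining six direction pairs reproduce the other sequences listed in the text as well.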

2.1.4 Potentially Optimal Wirelength Vector and POST.

The wirelength of an RST on the 2D Hanan grid can be expressed as a linear combination of the x- and y-directional edge vectors representing the tree as explained in the work of Chu and Wong [2]. For example, the wirelength of the tree in Figure 3(a) is (1) \(\begin{equation} L = 1 \cdot h_1 + 2 \cdot h_2 + 2 \cdot h_3 + 1 \cdot h_4 + 1 \cdot v_1 + 1 \cdot v_2 + 1 \cdot v_3 + 1 \cdot v_4, \end{equation}\) which can also be expressed as \(L = C \cdot E,\) where \(C = (1,2,2,1,1,1,1,1)\) and \(E = (h_1,h_2,h_3,h_4,v_1,v_2,v_3,v_4)\). C is called a coefficient vector and E is called an edge length vector. The edge length vector is a constant vector for given pins. However, the coefficient vector is dependent on the RST. For example, the coefficient vector for the tree in Figure 3(b) is \((1,2,1,1,1,1,2,1)\). Thus, the two trees in Figure 3 have the same edge length vector but different coefficient vectors.

Fig. 3. Two RSTs constructed on the 2D Hanan grid.

For given pin locations, a coefficient vector \(C=(c_1,\ldots , c_k)\) becomes a Potentially Optimal Wirelength Vector (POWV) if it satisfies the following conditions [2]:

There exists an RST that connects all the pins and uses the edges specified in the coefficient vector C on the Hanan grid constructed for the pins.

There is no other coefficient vector \(C^{\prime }=(c_1^{\prime },\ldots , c_k^{\prime })\) satisfying \(c_i^{\prime } \le c_i\) for all \(i=1,\ldots ,k\).

An RST corresponding to a POWV is called a potentially optimal Steiner tree (POST) [2]. The two RSTs shown in Figure 3(a) and (b) are POSTs.

2.2 Construction of All RSMTs on the Hanan Grid

FLUTE constructs an RSMT by a lookup table [2]. The lookup table consists of all PSs, all POWVs belonging to each PS, and one POST for each POWV. Whenever a set of pin locations is given, FLUTE first finds the PS of the pin locations, compares the wirelengths of all the POWVs belonging to the PS, and returns the POST of the POWV having the minimum wirelength. If multiple POWVs have the same wirelength, FLUTE can return their POSTs. The returned POSTs are RSMTs for the pin locations. However, FLUTE finds only one POST for each POWV, although a POWV can have multiple POSTs. Thus, Lin generated a DB (called ARSMT DB) storing all POSTs for each POWV in previous work [9, 12]. Figure 4 shows an overview of the ARSMT DB. The algorithm finding all POSTs for each POWV uses a binary decision tree with several speedup techniques to reduce the DB construction time.
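The lookup step amounts to evaluating \(L = C \cdot E\) for every stored coefficient vector and keeping the minimum. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, the edge lengths are made-up values, and the candidate list contains only the two coefficient vectors of Figure 3 rather than the complete POWV list of a PS. It is not FLUTE's actual implementation.

```python
def wirelength(coeff, edge_lengths):
    """Wirelength L = C . E of a tree with coefficient vector C."""
    return sum(c * e for c, e in zip(coeff, edge_lengths))

def min_wirelength_powvs(powvs, edge_lengths):
    """Return the minimum wirelength and every POWV that attains it."""
    best = min(wirelength(c, edge_lengths) for c in powvs)
    return best, [c for c in powvs if wirelength(c, edge_lengths) == best]

# Coefficient vectors of the two trees in Figure 3.
powvs = [(1, 2, 2, 1, 1, 1, 1, 1), (1, 2, 1, 1, 1, 1, 2, 1)]
# Hypothetical edge length vector E = (h1, h2, h3, h4, v1, v2, v3, v4).
edge_lengths = (2, 3, 5, 1, 4, 2, 1, 6)

best, winners = min_wirelength_powvs(powvs, edge_lengths)
print(best, winners)  # 28 [(1, 2, 1, 1, 1, 1, 2, 1)]
```

Returning every vector that attains the minimum mirrors the case in which multiple POWVs have the same wirelength.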

Fig. 4. An overview of the lookup table of all POSTs in other works [9, 12].

2.3 Multilayer Monolithic Rectilinear Steiner Minimum Trees

We first define three terminologies.

Definition 1.

A 3D rectilinear tree is a tree having only x-, y-, and z-directional edges and connecting all given pins.

Definition 2.

A 3D rectilinear Steiner tree (3D RST) is a 3D rectilinear tree with Steiner points. A Steiner point is a nonpin vertex with more than two edges.

Definition 3.

A 3D rectilinear Steiner minimum tree (3D RSMT) is a 3D RST having the minimum wirelength.

The wirelength of a 3D rectilinear tree is the sum of the lengths of all the edges in the tree. In the M3D IC layout design, however, the area and capacitance overhead of an MIV is negligible, so we can set the length of an MIV (the length of a z-directional edge) to zero during 3D routing. Nevertheless, minimizing the number of MIVs inserted in the layout is still crucial. Thus, we define an MMRSMT as follows.

Definition 4.

A multilayer monolithic rectilinear Steiner minimum tree (MMRSMT) is a 3D RST satisfying the following:

Its planar wirelength is equal to the wirelength of a 2D RSMT constructed for the pins projected onto the 2D plane.

The number of z-directional edges is minimal.

If we project all the edges in an MMRSMT onto the xy plane, the projection becomes a 2D RSMT. Thus, an MMRSMT can be constructed from a 2D RSMT by properly placing the x- and y-directional edges of the 2D RSMT in a 3D grid and inserting z-directional edges. In addition, we obtain 2D RSMTs from POSTs as mentioned in the previous section. Thus, we can construct an MMRSMT from a POST. We define a 3D POST as follows.

Definition 5.

Suppose a set of xy-distinct pin locations is given. Let the set be \(P = \lbrace (x_1,y_1,z_1),\ldots ,(x_n,y_n,z_n)\rbrace\). Let the set of the projections of the pins onto the xy plane be \(P^{\prime }=\lbrace (x_1,y_1),\ldots ,(x_n,y_n)\rbrace\). Let a POST constructed for \(P^{\prime }\) be \(G^{\prime }=(V^{\prime },E^{\prime })\). Let the coordinate of \(e^{\prime }\) in \(E^{\prime }\) be \(e^{\prime }(i,j)\). A 3D potentially optimal Steiner tree (3D POST) is a tree T that connects all the pins in P, uses the minimum number of z-directional edges in the 3D Hanan grid \(G=(V,E)\) constructed from P, and uses one of the edges among \(e(i,j,k=0,\ldots ,t-1) \in E\) for each \(e^{\prime }(i,j) \in E^{\prime }\). Here, t is the number of tiers. From now on, we denote the POSTs in the ARSMT DB as 2D POSTs to distinguish them from 3D POSTs.

In summary, if we have a DB of all 3D POSTs, we can build all MMRSMTs for given pin locations quickly. In this work, therefore, we build a DB of all 3D POSTs for all possible relative pin locations in two, three, and four tiers. Note that when pins share an x- or y-coordinate, we can treat them as having distinct coordinates by slightly shifting them to the left/right (for x-coordinates) or up/down (for y-coordinates). We then generate PSs from the resulting relative positions, knowing that the lengths of the newly introduced edges (the tiny extensions) are zero. Because these edges have zero length, the shifted coordinates effectively revert to the actual nondistinct coordinates.


3 CONSTRUCTION OF ALL 3D POSTS

In this section, we present an algorithm to construct all 3D POSTs on the 3D Hanan grid. Figure 5 shows a 3D grid, pin and nonpin vertices, x-, y-, and z-directional edges, and notations used in this article.

Fig. 5. An \(n \times n \times t\) 3D grid, pin and nonpin vertices, and indices for x-, y-, and z-directional edges.

3.1 Construction of All 3D POSTs

The input to the algorithm is a set P of pin locations and a 2D POST, \(G_2 = (V_2, E_2)\), constructed from the projection of the pins onto the xy plane. For example, Figure 6(a) shows three pins in a 3D grid, the projection of the pins, and a 2D POST for them.

Fig. 6. Construction of all 3D POSTs in three tiers for pins \((0,0,0)\), \((1,2,2)\), \((2,1,1)\). (a) A 2D POST is given. (b) The construct_3D_grid function creates a 3D grid structure. (c)–(n) 3D POST construction. The red edges are used planar edges and the blue edges are used z-directional edges.

Algorithm 1 shows the proposed algorithm for constructing all 3D POSTs. We first set the visited variables of all the edges in \(E_2\) to false (line 1). Then, we sort the edges and store the result in an ordered set \(E_2^{\prime }\) (line 2). The sort_edges function chooses a pin vertex in \(G_2\) and performs the breadth-first search starting from the vertex until all the pin vertices are reached. Whenever it goes through an edge, the function inserts the edge into \(E_2^{\prime }\). This order reduces the runtime of the algorithm. For example, the sort_edges function starts from the pin vertex \(p_1\) in Figure 6(a). Then, \(E_2^{\prime }\) becomes \((e_x(0,0), e_y(0,1), e_x(1,1), e_y(1,1))\). Then, we construct a 3D grid \(G_3=(V_3,E_3)\) from \(G_2\) and P (line 3). The construct_3D_grid function expands \(G_2\) to \(G_3\) as shown in Figure 6(b). Then, we set the used variables of all the edges in \(E_3\) to false (line 4). T is a set of graphs storing all the 3D POSTs, nr_MIVs is a variable storing the number of MIVs used in \(G_3\), and min_nr_MIVs is a variable storing the minimum number of MIVs used in the 3D POSTs (line 5). Then, we call the recur_con function to recursively construct all 3D POSTs (line 6). Once the algorithm ends, we return T (line 7).

The recur_con function starts from checking the given index, which is used to access the edges in \(E_2^{\prime }\). The edge index is greater than the number of edges in \(E_2^{\prime }\) when there is no more edge to process in \(G_3\) (line 1), which means that \(G_3\) is a 3D graph connecting all the pins. In this case, if the total number of MIVs used in \(G_3\) is equal to the minimum number of MIVs used in the best graphs found until now, we add it to T (line 3). However, if the total number of MIVs used in \(G_3\) is less than the minimum number of MIVs used in the best graphs found until now, all the graphs in T use more MIVs than \(G_3\), so we empty T, add \(G_3\) to T, and update min_nr_MIVs (line 5).

If the edge index is less than the size of \(E_2^{\prime }\) (line 9), we visit the edge in \(E_2^{\prime }\) indexed by the edge index variable (line 10) and try using edges in \(G_3\) corresponding to the indexed edge (lines 11–31). First, suppose \(e_d^{\prime }(i,j)\) is \(E_2^{\prime }\)[index], where d is either x or y. Then, we try using \(e_d(i, j, k)\) in \(G_3\) for each \(k=0,\ldots ,t-1\) (line 13). In Figure 6(c), for example, we try using \(e_x(0,0,0)\) in \(G_3\). Then, if e is x-directional, we obtain its left vertex in \(G_2\), and otherwise we obtain its bottom vertex in \(G_2\) and assign it to v (line 14). Then, we find the bottommost and topmost tiers that should be connected along the z-axis through v in \(G_3\) by the get_min_max_tier function (lines 15–18). The function finds all the visited edges connected to e in \(G_2\), obtains the tiers of the edges in \(G_3\) corresponding to the visited edges, and finds the bottommost and topmost tiers. In addition, if v is a pin vertex, its z-coordinate should be included in the computation of the range of the tiers. We repeat the same process for the right vertex of e (or the top vertex if e is y-directional) (lines 19–23).

If we have visited all the edges connected to the left and right (or the bottom and top) vertices of e, we can find the z-directional edges required to connect to the pin and the edges along the z-axis at the vertices. From the z-directional edges, we obtain the number of MIVs (line 24). If the total number of MIVs currently used in \(G_3\) is less than or equal to the minimum number of MIVs used in the best graphs found until now, we move on to the next edge in \(E_2^{\prime }\) (line 27). Otherwise, the current graph uses more MIVs than the best graphs, so we do not need to proceed to the next edge. Once the recursive function call ends (line 28), we re-adjust the number of MIVs used in \(G_3\) (line 29) and try using the edge above e (lines 30 and 31).

In Figure 6(c), for example, \(e_x(0,0,0)\) is in Tier 0 and its left vertex is a pin vertex, so both the bottommost and topmost tiers for the vertex are Tier 0. Then, we move on to \(e_y(0,1)\) in \(E_2^{\prime }\) and try using \(e_y(0,1,0)\) in \(G_3\) in Figure 6(d). The used variables of all the edges connected to the bottom vertex of \(e_y(0,1)\) are true at this point and all the edges are placed in Tier 0. Thus, we do not need to add any z-directional edges above the vertex. Then, we process the next edge \(e_x(1,1)\) in Figure 6(e). The right vertex of \(e_x(1,1)\) is connected to the pin located at \((2,1)\) in \(G_2\), which corresponds to the pin located at \((2,1,1)\) in \(G_3\), so the bottommost and topmost tiers at the vertex are Tier 0 and Tier 1, respectively. Thus, we use \(e_z(2,1,0)\) in \(G_3\), which amounts to inserting an MIV at that location. When we also try using \(e_y(1,1,0)\) in Figure 6(f), we finally construct a 3D graph connecting all the pins, and the total number of MIVs is 3. Similarly, the total numbers of MIVs in the 3D graphs in Figure 6(g) through (i) are all 3. However, the 3D graph in Figure 6(j) uses two MIVs. At this time, T contains all the 3D graphs found in Figure 6(f) through (i), so we delete all of them from T and add the 3D graph found in Figure 6(j) to T. There are four more 3D graphs using two MIVs as shown in Figure 6(k) through (n). Thus, when the algorithm finishes, T contains all five 3D graphs, which become 3D POSTs for the given pin locations and 2D POST.
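The core idea of assigning each 2D POST edge to a tier and keeping only the assignments with the fewest MIVs can be sketched in Python as follows. This is a simplified exhaustive enumeration under stated assumptions, not Algorithm 1 itself: it omits the breadth-first edge ordering, the recursion, and the pruning. It also assumes that the number of MIVs at a vertex equals the span between the lowest and highest tiers that must be joined there. The function name and the two-pin example are hypothetical.

```python
from itertools import product

def enumerate_min_miv_trees(pins, edges_2d, num_tiers):
    """pins: list of (x, y, z); edges_2d: list of ((x1, y1), (x2, y2)).

    Returns (min_mivs, assignments), where each assignment gives the tier of
    every 2D edge, in the order of edges_2d.
    """
    pin_tier = {(x, y): z for x, y, z in pins}
    vertices = {v for e in edges_2d for v in e}
    best, trees = None, []
    for tiers in product(range(num_tiers), repeat=len(edges_2d)):
        mivs = 0
        for v in vertices:
            # Tiers that must be joined along the z-axis at this vertex.
            used = [t for e, t in zip(edges_2d, tiers) if v in e]
            if v in pin_tier:
                used.append(pin_tier[v])
            mivs += max(used) - min(used)
        if best is None or mivs < best:
            best, trees = mivs, [tiers]   # strictly better: restart the set
        elif mivs == best:
            trees.append(tiers)           # equally good: keep it as well
    return best, trees

# Hypothetical two-pin net in two tiers with an L-shaped 2D tree.
pins = [(0, 0, 0), (1, 1, 1)]
edges_2d = [((0, 0), (1, 0)), ((1, 0), (1, 1))]
min_mivs, trees = enumerate_min_miv_trees(pins, edges_2d, num_tiers=2)
print(min_mivs, trees)  # 1 [(0, 0), (0, 1), (1, 1)]
```

The enumeration visits every one of the t^|E| tier assignments, so it is practical only for very small trees. The pruning described above exists precisely to avoid exploring assignments that already exceed the current minimum.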

3.2 Congruence of 3D POSTs

The runtime of the algorithm shown in Algorithm 1 is still long and there are numerous 3D POSTs in the DB, so it is crucial to reduce the runtime and the DB size. In this section, we show congruent properties of the 3D POSTs, which are used to skip generating and storing some 3D POSTs.

3.2.1 Congruence of PSs.

As mentioned in the work of Chu and Wong [2], two PSs are congruent if rotating one of them leads to the other. For example, Figure 7(a) shows the PS \((3 1 5 4 2)\). If we rotate it counterclockwise by 90, 180, and 270 degrees, we obtain PSs \((4 1 5 2 3)\), \((4 2 1 5 3)\), and \((3 4 1 5 2)\) as shown in Figure 7(b), (c), and (d), respectively. If two PSs are congruent to each other, the POSTs constructed for one of them can be used for the other. Thus, we do not need to generate 2D POSTs for some PSs. In addition to the rotation, reflection also generates congruent PSs. Reflecting the pin locations in Figure 7(a) over the y-directional line results in PS \((3 5 1 2 4)\) shown in Figure 7(e). Now, rotating the PS counterclockwise by 90, 180, and 270 degrees leads to PSs \((3 2 5 1 4)\), \((2 4 5 1 3)\), and \((2 5 1 4 3)\) shown in Figure 7(f), (g), and (h), respectively.

Fig. 7. Congruence of eight PSs.

Rotating and reflecting a PS has the same effect as generating PSs by \(\Gamma _{(s,r)}\). For example, generating the PS in Figure 7(a) is the same as generating the PS \(\Gamma _{(y_+,x_+)}\). The PSs obtained by rotating the PS \(\Gamma _{(y_+,x_+)}\) by 90, 180, and 270 degrees are the same as the PSs \(\Gamma _{(x_+,y_-)}\), \(\Gamma _{(y_-,x_-)}\), and \(\Gamma _{(x_-,y_+)}\), respectively. Similarly, reflecting \(\Gamma _{(y_+,x_+)}\) over the y-directional line is the same as generating the PS \(\Gamma _{(y_+,x_-)}\). Then, rotating \(\Gamma _{(y_+,x_-)}\) by 90, 180, and 270 degrees is the same as obtaining the PSs \(\Gamma _{(x_-,y_-)}\), \(\Gamma _{(y_-,x_+)}\), and \(\Gamma _{(x_+,y_+)}\), respectively. Appendix A shows operations defined for PSs, their properties, and a table for the congruence rules.

If multiple PSs are congruent, we store POSTs for only one of them (called a base position sequence). Then, we can obtain POSTs for the other PSs by properly transforming the POSTs stored for their base PS. We use the following rule to determine base PSs. When pin locations are given, we find the eight PSs \(\Gamma _{(y_\pm ,x_\pm)}\) and \(\Gamma _{(x_\pm ,y_\pm)}\) for them and choose the smallest PS as their base PS. In Figure 7, for example, \((2 4 5 1 3)\) in Figure 7(g) is the smallest when each PS is read as a number, so \((2 4 5 1 3)\) becomes the base PS for all the PSs in Figure 7.
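The base PS rule can be sketched in Python by generating the congruent PSs directly from one PS. The sketch rests on an assumption of ours rather than on the article's PS algebra: the eight congruent sequences are obtained by combining reversal, complementing each element (k becomes n + 1 - k), and taking the inverse permutation. For the PS \((3 1 5 4 2)\) this reproduces exactly the eight sequences of Figure 7. The function names are hypothetical, and "smallest" is taken as lexicographic order on tuples.

```python
def congruent_pss(ps):
    """Return the set of PSs congruent to ps under rotation and reflection."""
    n = len(ps)
    # Inverse permutation: corresponds to sorting along the other axis.
    inverse = [0] * n
    for pos, val in enumerate(ps, start=1):
        inverse[val - 1] = pos
    result = set()
    for seq in (tuple(ps), tuple(inverse)):
        comp = tuple(n + 1 - v for v in seq)  # flip the r-direction
        # seq[::-1] and comp[::-1] flip the s-direction.
        result.update({seq, seq[::-1], comp, comp[::-1]})
    return result

def base_ps(ps):
    """Pick the smallest congruent PS as the base PS."""
    return min(congruent_pss(ps))

print(base_ps((3, 1, 5, 4, 2)))  # (2, 4, 5, 1, 3)
```

Symmetric pin arrangements can yield fewer than eight distinct sequences, which the set handles without special cases.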

3.2.2 Congruence of 3D POSTs.

Suppose pin locations are given in the 3D space. Then, we can characterize the pin locations by two sequences: a PS and a Tier Sequence (TS). The PS is based on the projection of the pins onto the xy plane, and the TS is based on the z-coordinates of the pins. Figure 8 shows an example. In Figure 8(a), the z-coordinates of the pins corresponding to the PS elements 3, 1, 5, 4, 2 are 0, 1, 0, 1, 0, respectively. Thus, the TS for the pin locations is \((0 1 0 1 0)\).

Fig. 8. Congruence of 16 PSs and TSs.

If we rotate the two tiers in Figure 8(a) counterclockwise by 90, 180, and 270 degrees around the z-axis, we obtain the PSs and TSs shown in Figure 8(b), (c), and (d), respectively. In addition, if we reflect the two tiers in Figure 8(a) over the yz plane, we obtain the PS and TS in Figure 8(e). Rotating the two tiers in Figure 8(e) counterclockwise by 90, 180, and 270 degrees around the z-axis leads to the PSs and TSs in Figure 8(f), (g), and (h), respectively. Moreover, reflecting the two tiers in Figure 8(a) and (e) over the xy plane generates the PSs and TSs in Figure 8(i) and (m), respectively. Rotating them counterclockwise by 90, 180, and 270 degrees around the z-axis generates the PSs and TSs in Figure 8(j), (k), and (l), and Figure 8(n), (o), and (p), respectively.

To find a congruence between two sets of PSs and TSs, we define a 3D position sequence \(\Lambda _{(s,r,w)}\), which consists of a pair of sequences. The first sequence is the 2D PS \((a_1 ... a_n)\) obtained from \(\Gamma _{(s,r)}\). The second sequence is the TS along the w-direction (\(w \in \lbrace z_+,z_-\rbrace\)) as defined previously. Then, the 3D PS for the pins in Figure 8(a) is denoted by \(\Lambda _{(y_+,x_+,z_+)}\). Similarly, 3D PSs for the pins in Figure 8(b), (c), (d), (e), (f), (g), and (h) are \(\Lambda _{(x_+,y_-,z_+)}\), \(\Lambda _{(y_-,x_-,z_+)}\), \(\Lambda _{(x_-,y_+,z_+)}\), \(\Lambda _{(y_+,x_-,z_+)}\), \(\Lambda _{(x_-,y_-,z_+)}\), \(\Lambda _{(y_-,x_+,z_+)}\), and \(\Lambda _{(x_+,y_+,z_+)}\), respectively. Since the reflection over the xy plane reverses the TS, 3D PSs for the pins in Figure 8(i), (j), (k), (l), (m), (n), (o), and (p) are \(\Lambda _{(y_+,x_+,z_-)}\), \(\Lambda _{(x_+,y_-,z_-)}\), \(\Lambda _{(y_-,x_-,z_-)}\), \(\Lambda _{(x_-,y_+,z_-)}\), \(\Lambda _{(y_+,x_-,z_-)}\), \(\Lambda _{(x_-,y_-,z_-)}\), \(\Lambda _{(y_-,x_+,z_-)}\), and \(\Lambda _{(x_+,y_+,z_-)}\), respectively. If two sets of pin locations are congruent, we can use the 3D POSTs belonging to one of them for the other by properly transforming the 3D POSTs.

We also define a 3D base position sequence as follows. Suppose a set of pin locations is given in the 3D space. Then, we find all the 16 3D PSs \(\Lambda _{(y_\pm , x_\pm , z_\pm)}\) and \(\Lambda _{(x_\pm , y_\pm , z_\pm)}\) for them and choose the smallest 3D PS for their 3D base PS. If multiple 3D PSs have the same 2D PS, the one with the smallest TS becomes the 3D base PS for them. In Figure 8, for example, the smallest 2D PS is \((2 4 5 1 3)\) in Figure 8(g) and (o). Between these two, the TS \((0 1 0 1 0)\) is smaller than the TS \((1 0 1 0 1)\), so the 3D PS of Figure 8(g) becomes the 3D base PS for all the 3D PSs in Figure 8. Appendix A also shows operations defined for 3D PSs, their properties, and a table for the congruence rules.
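The selection of a 3D base PS can be sketched in Python as a minimum over the 16 candidate (PS, TS) pairs. The function names, the direction strings, and the pin coordinates are illustrative assumptions. The coordinates are hypothetical values whose relative order matches the PS \((3 1 5 4 2)\) and TS \((0 1 0 1 0)\) of Figure 8(a). The sketch also assumes that the TS along \(z_-\) maps tier z to t - 1 - z.

```python
def ps_3d(pins, s, r, w, num_tiers):
    """Return the 3D PS (2D PS, TS) for pins given as (x, y, z) tuples."""
    axis = {"x": 0, "y": 1}
    s_axis, r_axis = axis[s[0]], axis[r[0]]
    ordered = sorted(pins, key=lambda p: p[s_axis], reverse=(s[1] == "-"))
    ranked = sorted((p[r_axis] for p in pins), reverse=(r[1] == "-"))
    rank = {c: i + 1 for i, c in enumerate(ranked)}
    ps = tuple(rank[p[r_axis]] for p in ordered)
    # Tier sequence in the same pin order; z- counts tiers from the top.
    ts = tuple(p[2] if w == "z+" else num_tiers - 1 - p[2] for p in ordered)
    return ps, ts

def base_ps_3d(pins, num_tiers):
    """Smallest of the 16 3D PSs: compare the 2D PS first, then the TS."""
    candidates = []
    for s_axis, r_axis in (("x", "y"), ("y", "x")):
        for s_sign in "+-":
            for r_sign in "+-":
                for w in ("z+", "z-"):
                    candidates.append(
                        ps_3d(pins, s_axis + s_sign, r_axis + r_sign, w, num_tiers)
                    )
    return min(candidates)

# Hypothetical (x, y, z) coordinates for p1..p5 in two tiers.
pins = [(10, 20, 1), (20, 50, 0), (30, 10, 0), (40, 40, 1), (50, 30, 0)]
print(ps_3d(pins, "y+", "x+", "z+", 2))  # ((3, 1, 5, 4, 2), (0, 1, 0, 1, 0))
print(base_ps_3d(pins, 2))               # ((2, 4, 5, 1, 3), (0, 1, 0, 1, 0))
```

Python compares tuples element by element, so taking `min` over (PS, TS) pairs applies the tie-breaking rule described above.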


4 DB GENERATION

In this section, we present simulation results obtained from the construction of all 3D POSTs on the 3D Hanan grid. We implemented the proposed algorithm in C/C++ and ran the code on an Intel Core i5-6600K 3.3-GHz CPU system with 64 GB of memory. We used the 2D POST DB from previous work [9]. Table 1 shows some statistics of the construction of all 3D POSTs for two- to six-pin nets and two to four tiers.

Table 1. Statistics of the Construction of all 3D POSTs for Two- to Six-Pin Nets for Two, Three, and Four Tiers

| # pins (n) | # PS (\(n!\)) | # 2D POSTs | # tiers | # all 3D POSTs (A) | # gen. 3D POSTs (B) | r (\(B/A\)) | Con. time (C) | Con. eff. (\(B/C\)) | Table size |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 2 | 4 | 2 | 24 | 12 | 0.5 | 0.0 s | - | \(\lt\) 1 KB |
| 2 | 2 | 4 | 3 | 48 | 24 | 0.5 | 0.0001 s | - | \(\lt\) 1 KB |
| 2 | 2 | 4 | 4 | 80 | 40 | 0.5 | 0.0001 s | - | \(\lt\) 1 KB |
| 3 | 6 | 16 | 2 | 224 | 84 | 0.375 | 0.0001 s | - | 1 KB |
| 3 | 6 | 16 | 3 | 896 | 336 | 0.375 | 0.0003 s | - | 2 KB |
| 3 | 6 | 16 | 4 | 2,352 | 888 | 0.378 | 0.0006 s | - | 4 KB |
| 4 | 24 | 284 | 2 | 20,056 | 5,372 | 0.268 | 0.0043 s | 1,249,302 | 35 KB |
| 4 | 24 | 284 | 3 | 226,800 | 60,120 | 0.265 | 0.0457 s | 1,315,536 | 313 KB |
| 4 | 24 | 284 | 4 | 1,396,944 | 367,424 | 0.263 | 0.3465 s | 1,060,387 | 2 MB |
| 5 | 120 | 4,260 | 2 | 719,864 | 125,360 | 0.174 | 0.1484 s | 844,744 | 850 KB |
| 5 | 120 | 4,260 | 3 | 14,876,928 | 2,575,092 | 0.173 | 5.2478 s | 490,699 | 16 MB |
| 5 | 120 | 4,260 | 4 | 142,195,680 | 24,482,354 | 0.172 | 95.77 s | 255,637 | 167 MB |
| 6 | 720 | 120,212 | 2 | 85,530,040 | 13,831,206 | 0.162 | 20.13 s | 687,094 | 93 MB |
| 6 | 720 | 120,212 | 3 | 4,318,826,472 | 697,355,262 | 0.161 | 42.2 m | 275,417 | 5.1 GB |
| 6 | 720 | 120,212 | 4 | 90,473,628,112 | 14,586,090,890 | 0.161 | 30.2 h | 134,162 | 129 GB |

  • “# PS” is the number of 2D position sequences for the projected pins. “# all 3D POSTs” is the total number of 3D POSTs (A), and “# gen. 3D POSTs” is the number of 3D POSTs (B) generated from the proposed algorithm. r is \(B/A\). “Con. time” is the DB construction time (C). “Con. eff.” is the construction efficiency measured by \(B/C\) (# 3D POSTs generated per second).

Our first observation is that as the tier count goes up from 2 to 4, the total number of 3D POSTs increases rapidly. This is because the number of combinations of placing pins in different tiers grows quickly as the tier count goes up. The recurrence relation for the number of combinations is as follows: (2) \(\begin{equation} f(n,t) = t^n - \sum _{i=1}^{t-1} \lbrace (t-i+1) \cdot f(n,i)\rbrace , \end{equation}\) where \(f(n,t)\) is the number of combinations of placing n pins in t consecutive tiers. A closed-form expression for \(f(n,t)\) with \(t \ge 2\) is as follows: (3) \(\begin{eqnarray} f(n,t) = t^n - 2 \cdot (t-1)^n + (t-2)^n, \end{eqnarray}\) and the base case is (4) \(\begin{eqnarray} f(n,1) = 1. \end{eqnarray}\)
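As a quick consistency check, the recurrence in Equation (2) and the closed form in Equations (3) and (4) can be compared numerically. The Python sketch below is illustrative only; the function names `f_rec` and `f_closed` are hypothetical. It reads \(f(n,t)\) as the number of pin-to-tier assignments that occupy both the lowest and the highest of the t tiers, which is an interpretation inferred from the recurrence rather than stated in the text.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f_rec(n, t):
    """Equation (2), with the base case f(n, 1) = 1 from Equation (4)."""
    if t == 1:
        return 1
    # Remove every assignment that fits in a window of i < t tiers;
    # such a window can be placed at (t - i + 1) positions.
    return t**n - sum((t - i + 1) * f_rec(n, i) for i in range(1, t))


def f_closed(n, t):
    """Equation (3) for t >= 2 and Equation (4) for t == 1."""
    if t == 1:
        return 1
    return t**n - 2 * (t - 1) ** n + (t - 2) ** n


# Compare the two forms for the pin and tier counts used in Table 1.
for n in range(2, 7):
    for t in range(1, 5):
        assert f_rec(n, t) == f_closed(n, t)
    print(n, [f_closed(n, t) for t in range(1, 5)])
```

For a fixed n, the closed form is a polynomial in t, while for a fixed t it grows exponentially in n.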

Thus, for a fixed n, \(f(n,t)\) goes up polynomially as t increases, and as the pin count goes up, the number of 2D POSTs increases exponentially as shown in the table. As a result, the total number of 3D POSTs increases extremely fast as the pin and tier counts go up. The ratio r of generated 3D POSTs to all 3D POSTs falls from 0.5 for two-pin nets to approximately 0.16 for six-pin nets. As explained in Section 3.2, using the congruence properties of PSs and 3D POSTs significantly reduces the number of 3D POSTs generated. Thus, we reduce the construction time and the DB size effectively.

The construction efficiency, measured by the ratio between the number of generated 3D POSTs and the total construction time, decreases almost exponentially as the pin count and the tier count go up. The algorithm can still construct approximately 130,000 3D POSTs per second for the six-pin four-tier case. However, there are almost 15 billion 3D POSTs to generate for this case, so the construction time is about 30 hours. The table size is approximately 129 GB (Table 1), which can be easily handled on server computers.

Figure 9 shows two 3D POSTs constructed for the given six pins and the same 2D POST. The red edges are planar wires, and the blue edges are MIVs. The 3D POST in Figure 9(a) has five planar edges in Tier 0 and eight planar edges in Tier 1. However, the 3D POST in Figure 9(b) has 12 planar edges in Tier 2 and a planar edge in Tier 3. In addition, the planar coordinates of the MIVs in Tier 1 in Figure 9(a) are \((2,1)\) and \((2,3)\), whereas those in Tier 1 in Figure 9(b) are \((0,0)\) and \((1,4)\). Similarly, the planar coordinates of the MIVs in Tier 2 in Figure 9(a) are \((3,1)\), \((2,2)\), and \((4,5)\), whereas those in Figure 9(b) are \((0,0)\), \((1,4)\), and \((5,3)\). The planar coordinates of the MIVs in Tier 3 in Figure 9(a) are \((3,1)\) and \((4,5)\), whereas those in Figure 9(b) are \((2,1)\) and \((4,5)\). Thus, among the seven MIVs inserted in the two 3D POSTs, only one MIV located at \((4,5,3)\) is common and the other MIVs are located at quite different locations. We also found similar trends in many other 3D POSTs. Thus, we expect that the DB of the 3D POSTs can be used for 3D routing to evenly distribute planar wires and MIVs across the tiers.

Fig. 9. Comparison of two 3D POSTs constructed for pin locations \((0,0,0)\), \((3,1,3)\), \((2,2,2)\), \((5,3,1)\), \((1,4,0)\), \((4,5,3)\). 3D PS \(\Lambda _{(y_+,x_+,z_+)}\): \(PS=(1 4 3 6 2 5)\), \(TS=(0 3 2 1 0 3)\).

5 APPLICATION I: 3D ROUTING TOPOLOGY GENERATION

In this section and the next section, we present two applications for the practical use of the ARD. The first application is constructing 3D routing topologies for the optimization of given metrics. The two metrics we optimize are the Source-to-Critical-Sink Length (SCSL) used for wirelength minimization and the Source-to-Critical-Sink Delay (SCSD) used for timing optimization. The second application is congestion-aware 3D global routing for the minimization of routing congestion in M3D IC layouts.

5.1 Motivation

Figure 10 shows a two-tier 3D placement result for a five-pin net to be routed. The 3D PS is \(\Lambda _{(y_+,x_+,z_+)} = ((31542), (00011)),\) and the POWV is \((12211111)\). If we assume the length of each planar edge is l and the length of an MIV is \(l_m\), then all 3D POSTs in the figure have the same length of \((10l+l_m)\).

Fig. 10. Four MMRSMTs for the routing of a five-pin net in a two-tier design. 3D PS \(\Lambda _{(y_+,x_+,z_+)}\): \(PS=(31542)\), \(TS=(00011)\), \(POWV=(12211111)\).

Suppose the source of the net is \(p_5\) and the critical sink (the sink that has the smallest slack) is \(p_4\). Then, the SCSL is \((2l+l_m)\) in Figure 10(a), but that in Figure 10(b) is \((4l+l_m)\), so the former has a shorter SCSL. Similarly, suppose the source is \(p_2\) and the critical sink is \(p_4\). Then, all the topologies in Figure 10 have the same SCSL of 3l. We can also compute the SCSD using the PI model for the edges and the Elmore delay model for the delay estimation. Suppose the output resistance of the source is \(R_D\), the input capacitance of each sink is \(C_L\), the resistance and capacitance of an edge are r and c, respectively, and the resistance and capacitance of an MIV are \(r_m\) and \(c_m\), respectively. Then, if the source is \(p_5\) and the critical sink is \(p_4\), the SCSD in Figure 10(a) is smaller than that in Figure 10(b) by \(r(5C_L + 9c + c_m)\). Moreover, the two topologies in Figure 10(a) have the same 2D projection with different MIV locations. Their SCSDs differ by \(r c_m - r_m c\) (Table 2), so the topology on the left has less SCSD than the one on the right if \((\frac{r}{c} \lt \frac{r_m}{c_m})\), whereas the right one has less SCSD if \((\frac{r}{c} \gt \frac{r_m}{c_m})\). In addition, if the source is \(p_2\) and the critical sink is \(p_4\), the SCSD of the topologies in Figure 10(b) is smaller than that in Figure 10(a) by \(r(3C_L + 7c + c_m)\) as shown in Table 2. In conclusion, the effectiveness of a particular topology for a specific metric depends on the locations of the source and sinks and can be maximized only after examining all the MMRSMTs.

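| Topology | SCSL (source \(p_5 \leadsto\) sink \(p_4\)) | SCSD (source \(p_5 \leadsto\) sink \(p_4\)) | SCSL (source \(p_2 \leadsto\) sink \(p_4\)) | SCSD (source \(p_2 \leadsto\) sink \(p_4\)) |
|---|---|---|---|---|
| (a) left | \(2l+l_m\) | \(T_D + r(6C_L+13c+2c_m) + r_m(2C_L+3c+0.5c_m)\) | \(3l\) | \(T_D + r(12C_L+25.5c+3c_m)\) |
| (a) right | \(2l+l_m\) | \(T_D + r(6C_L+13c+c_m) + r_m(2C_L+4c+0.5c_m)\) | \(3l\) | \(T_D + r(12C_L+25.5c+3c_m)\) |
| (b) left | \(4l+l_m\) | \(T_D + r(11C_L+22c+3c_m) + r_m(2C_L+3c+0.5c_m)\) | \(3l\) | \(T_D + r(9C_L+18.5c+2c_m)\) |
| (b) right | \(4l+l_m\) | \(T_D + r(11C_L+22c+2c_m) + r_m(2C_L+4c+0.5c_m)\) | \(3l\) | \(T_D + r(9C_L+18.5c+2c_m)\) |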

Table 2. Comparison of the SCSL and SCSD for the Topologies in Figure 10

  • \(T_D = R_D(4C_L+10c+c_m)\).
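The SCSD expressions of Table 2 for source \(p_5\) and critical sink \(p_4\) can be evaluated numerically to see which topology wins for a given parameter set. The following Python sketch is illustrative only. The function name `table2_scsd` is hypothetical, and the sample values assume that every planar edge has unit length, so that the per-unit wire RC quoted in Section 5.2 (\(2 \Omega\), 0.4 fF) can be used directly as r and c. Resistances are in \(\Omega\) and capacitances in fF, so delays come out in femtoseconds.

```python
import math


def table2_scsd(R_D, C_L, r, c, r_m, c_m):
    """SCSD expressions of Table 2 for source p5 -> critical sink p4."""
    T_D = R_D * (4 * C_L + 10 * c + c_m)
    return {
        "a_left": T_D + r * (6 * C_L + 13 * c + 2 * c_m) + r_m * (2 * C_L + 3 * c + 0.5 * c_m),
        "a_right": T_D + r * (6 * C_L + 13 * c + c_m) + r_m * (2 * C_L + 4 * c + 0.5 * c_m),
        "b_left": T_D + r * (11 * C_L + 22 * c + 3 * c_m) + r_m * (2 * C_L + 3 * c + 0.5 * c_m),
        "b_right": T_D + r * (11 * C_L + 22 * c + 2 * c_m) + r_m * (2 * C_L + 4 * c + 0.5 * c_m),
    }


# Assumed sample parameters (unit-length edges); not taken from Figure 10.
R_D, C_L, r, c, r_m, c_m = 100.0, 5.0, 2.0, 0.4, 4.0, 1.0
scsd = table2_scsd(R_D, C_L, r, c, r_m, c_m)
best = min(scsd, key=scsd.get)

# (b) left exceeds (a) left by r(5C_L + 9c + c_m), as stated in Section 5.1.
gap_ab = scsd["b_left"] - scsd["a_left"]
# (a) left minus (a) right equals r*c_m - r_m*c, which decides left vs. right.
gap_lr = scsd["a_left"] - scsd["a_right"]
print(best, gap_ab, gap_lr)
```

With these sample values, r/c is larger than r_m/c_m, so the right-hand topology of Figure 10(a) has the smaller SCSD, in line with the condition given in Section 5.1.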

5.2 Simulation Methodology

To show the effectiveness of the ARD, we compare two 3D routing topology generation approaches. The first is a so-called BF approach that selects one MMRSMT for each 3D net: given a 3D net, we select the first MMRSMT found in the ARD for the pin locations of the net. The second approach, which we call the ARD approach, searches the ARD for a given 3D net, finds all its MMRSMTs, and selects the best one for a given metric (SCSL or SCSD). We used the ISPD 2005 and 2006 benchmarks [16, 17] and ePlace-3D [15] to generate 3D placement results in two, three, and four tiers. For the SCSD computation, we assume that the output resistance of a source (driver) is \(100 \Omega\), the wire resistance and capacitance per unit length are \(2 \Omega\)/unit and 0.4 fF/unit, respectively, and the load capacitance of a sink pin is 5 fF [12]. The MIV height, resistance, and capacitance are 140 nm, \(4 \Omega\), and 1 fF, respectively [7, 18]. For each 3D net, we set the source pin to the driver node of the net from the benchmark suites, randomly selected a pin for the critical sink, constructed two 3D routing topologies, one by the BF approach and the other by the ARD approach, and compared their SCSLs and SCSDs.
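The difference between the two approaches comes down to how a topology is picked from the list of MMRSMTs returned for a net. The Python sketch below illustrates this under stated assumptions: the candidate records, their field names, and their metric values are hypothetical placeholders, not data from the ARD.

```python
def select_bf(candidates):
    """BF approach: take the first MMRSMT found for the net."""
    return candidates[0]


def select_ard(candidates, metric):
    """ARD approach: examine all MMRSMTs and keep the best for the metric."""
    return min(candidates, key=lambda topo: topo[metric])


# Hypothetical MMRSMT records for one 3D net (values are made up).
candidates = [
    {"name": "topo0", "scsl": 40.0, "scsd": 310.0},
    {"name": "topo1", "scsl": 20.0, "scsd": 295.0},
    {"name": "topo2", "scsl": 20.0, "scsd": 280.0},
]

bf = select_bf(candidates)
ard_len = select_ard(candidates, "scsl")
ard_del = select_ard(candidates, "scsd")
print(bf["name"], ard_len["name"], ard_del["name"])
```

Note that two candidates can tie on SCSL and still differ in SCSD, which is why the metric is passed explicitly.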

5.3 Simulation Results

Table 3 shows the comparison of the SCSLs of the 3D routing topologies constructed by the BF- and ARD-based approaches for all the 3D nets of the benchmarks with net degrees 4 to 6. Notice that all 2D POSTs for each two- or three-pin net have the same SCSL regardless of the selection of the source and critical-sink pins. Therefore, the 3D routing topologies constructed for these nets by BF and ARD have the same SCSL. The BF and ARD have different SCSLs for approximately \(20.70\%\) to \(23.54\%\) of the 3D nets (\(N_L/N\)). For these nets, the average SCSL differences (\(L_T/N_L\)) between BF and ARD are approximately 103, 92, and 86 for the two-, three-, and four-tier designs, respectively. Moreover, we compute the sum of the ratios of each SCSL difference to its average to obtain the average difference ratios, which are 0.09, 0.11, and 0.14, respectively, for the two-, three-, and four-tier designs. The maximum SCSL differences are also very large (840 to 7,080), so the comparison shows that ARD can effectively minimize the SCSL for each 3D net.

# TBench.SCSLSCSD
Avg. Diff.# d. nets (%)Avg. Diff. (d.)Max. Diff.Avg. Diff. ratioAvg. Diff. (ps)# d. nets (%)Avg. Diff. (d.) (ps)Max. Diff. (ps)Avg. Diff. ratio
\(L_T \over N\)\({N_L \over N} \times {100}\)\(L_T \over N_L\)\(R_L \over N_L\)\(D_T \over N\)\({N_D \over N} \times {100}\)\(D_T \over N_D\)\(R_D \over N_D\)
2adaptec111.7219.8958.959600.1333.9458.0958.433,6980.16
adaptec220.6318.78109.852,4000.1569.7856.35123.838,530.10.16
adaptec319.3115.52124.452,4000.03218.3737.36584.5614,452.80.09
adaptec452.9132.91160.783,6000.06233.4743.13541.315,454.80.13
adaptec536.825.37145.024,5600.10243.9445.81532.4552,503.80.13
bigblue116.223.9767.61,0800.1533.2558.4956.852,964.40.18
bigblue261.7134.291801,3200.07711.7477.14922.6310,225.80.18
bigblue319.8321.6691.597,0800.09172.7758.86293.578,146.40.12
newblue122.5121.43105.042,5200.1582.8558.04142.767,469.30.15
newblue216.2821.0777.261,4400.1437.7958.4464.674,499.60.17
newblue413.125.0352.341,6800.1155.1558.893.796,0110.16
newblue532.7322.51145.383,2400.09153.7442.91358.3114,3390.11
newblue631.2427.3114.421,4400.04271.0550.79533.6312,109.60.10
Geo. mean24.0223.31103.060.09118.0753.28221.600.14
3adaptec116.9825.267.381,6800.0977.264.77119.28,4460.14
adaptec230.125.29119.015,040.30.08220.9256.54390.7332,344.40.13
adaptec32026.376.032,1600.08153.4455.06278.6721,513.40.15
adaptec411.8120.9856.281,6800.1160.1244.86134.016,831.30.15
adaptec516.4222.2773.742,4000.1366.7858.05115.0518,7010.18
bigblue115.1521.7769.599600.1343.385874.83,217.80.18
bigblue220.0117.79112.451,6800.2064.8556.98113.814,254.60.16
bigblue341.5923.75175.096,6000.11524.8160.61865.8787,378.80.15
newblue116.8623.6571.311,6800.1068.8857.39120.015,297.70.16
newblue219.5323.2883.884,8000.1274.3760.22123.521,886.90.16
newblue435.5625.49139.523,9600.10140.2954.93255.3913,062.60.17
newblue526.2125.75101.82,6400.08218.5561.56354.9917,204.60.13
newblue631.1126.23118.571,5600.10172.2559.69288.614,646.80.16
Geo. mean21.7123.5492.230.11111.6657.40194.550.15
4adaptec115.820.776.342,2800.1468.9560.15114.648,235.60.17
adaptec231.0322.71136.672,6400.1584.261.79136.2611,941.90.15
adaptec318.2720.5888.763,8400.1493.7853.5175.2927,445.70.18
adaptec418.2218.996.41,9200.1571.8854133.19,576.50.17
adaptec514.0421.4365.52,1600.1247.6454.7786.987,849.30.17
bigblue112.5721.8557.521,0800.1522.7260.337.681,831.40.18
bigblue29.5612.7375.058400.2817.2153.4632.191,272.10.20
bigblue321.3721.3100.353,4800.13168.4859.02285.4533,586.70.16
newblue116.9321.9477.162,1600.1269.1554.65126.525,406.80.19
newblue21924.0878.936,9600.10113.9757.07199.7183,711.10.15
newblue417.0721.9577.792,2800.1457.2156.27101.676,777.10.17
newblue522.6122.57100.142,4000.13104.2556.96183.0310,346.50.16
newblue624.521.02116.552,6400.14144.7157.07253.5515,780.20.17
Geo. mean17.7920.7085.970.1469.2156.79121.870.17
Table 3. Comparison of the SCSL and SCSD of 3D Routing Topologies Constructed by BF- and ARD-Based Routing

  • N, # four- to six-pin 3D nets; \(L_T\) (\(D_T\)), the sum of the SCSL (or SCSD) differences; \(R_L\) (\(R_D\)), the sum of the ratios of each SCSL (or SCSD) difference to its average; \(N_L\) (\(N_D\)), # 3D nets with nonzero SCSL (or SCSD) differences. “# T” denotes the number of tiers, “# d. nets (%)” denotes how many of the 3D nets have nonzero SCSL (or SCSD) differences. “Max. Diff.” denotes the maximum SCSL (or SCSD) differences.

We also observe that the average SCSL difference (\(L_T/N_L\)) between BF and ARD generally goes down as the tier count goes up from 2 to 4. This is because stacking more tiers helps reduce the wirelength of each net. For example, the uniform-scaling-based 3D placement [1, 13] ideally scales the wirelength of a net by \(1/\sqrt {t}\), where t is the number of tiers. As a result, the average SCSL difference also goes down as the tier count increases. However, the maximum SCSL difference depends not only on the 3D placement result but also on whether the 3D routing topologies constructed by BF happen to minimize the SCSLs by accident. Thus, the maximum SCSL difference does not go down even if the tier count goes up as shown in the table.

We observe similar trends in the SCSD simulation results. First of all, BF and ARD have different SCSDs for approximately \(53.28\%\) to \(57.40\%\) of the 3D nets (\(N_D/N\)). The reason that \(N_D/N\) is greater than \(N_L/N\) is that two 3D routing topologies with the same SCSL can have different SCSDs as shown in Figure 10 and Table 2. The average SCSD differences (\(D_T/N_D\)) between BF and ARD are approximately 222 ps, 195 ps, and 122 ps for the two-, three-, and four-tier designs, respectively. The maximum SCSD differences are also large as shown in the table. Moreover, the SCSD average difference ratios are 0.14, 0.15, and 0.17, respectively, for the two-, three-, and four-tier designs. Although several interconnect optimization techniques such as buffer insertion would help reduce the SCSD, finding a good 3D routing topology would still be one of the most important interconnect optimization techniques. As shown previously, the ARD can provide multiple 3D routing topologies optimal for different metrics such as SCSL and SCSD.

Note that in some cases, minimizing the SCSL (or SCSD) of a net may not be compatible with constructing its RSMT. As the papers related to this work [2, 12] aimed at minimizing the planar wirelength (and then minimizing # vertical edges), we do not construct minimum-SCSL (or minimum-SCSD) topologies either. Instead, we find topologies minimizing the SCSL (or SCSD) among the MMRSMTs for a given net.

6 APPLICATION II: CONGESTION-AWARE 3D ROUTING

In this section, we present the use of ARD for the minimization of routing congestion in M3D IC layouts. As shown in Figure 9, MMRSMTs might use very different 3D routing topologies. Thus, we can minimize routing congestion by selecting a good MMRSMT after an exhaustive search of all MMRSMTs for each 3D net.

6.1 Simulation Methodology

We used ePlace-3D [15] for 3D placement and bin-based 3D global routing for the congestion-aware 3D routing. Each x- or y-directional edge \(e_{c,i,j,k}\) (\(c \in \lbrace x, y\rbrace\)) has a predetermined maximum capacity \(m_{c,i,j,k}\) and the # nets \(s_{c,i,j,k}\) crossing the edge. Similarly, each bin \(bin_{z,i,j,k}\) has a predetermined maximum MIV capacity \(m_{bin,i,j,k}\) and the # MIVs \(s_{bin,i,j,k}\) located on that bin. We compute the overflow \(OF_{c,i,j,k}\) (or \(OF_{bin,i,j,k}\)) of edge \(e_{c,i,j,k}\) (or bin \(bin_{z,i,j,k}\)) as follows: (5) \(\begin{equation} OF_{w,i,j,k} = {\left\lbrace \begin{array}{ll} (s_{w,i,j,k} - m_{w,i,j,k}), & \text{if $s_{w,i,j,k} \gt m_{w,i,j,k}$} \\ 0, & \text{otherwise} \end{array}\right.} \end{equation}\) where \(w \in \lbrace c, bin\rbrace\). The objective is to minimize the total overflow in the following equation while routing all the 2D and 3D nets sequentially. (6) \(\begin{equation} OF_{Combined} = \alpha \cdot OF_{c,i,j,k} + \beta \cdot OF_{bin,i,j,k} \end{equation}\)

We chose \(\alpha\) and \(\beta\) suitably. We estimate the maximum MIV capacity of a bin by deducting the total area occupied by all instances in that bin from the total bin area and dividing the resulting area by the MIV pitch area. Moreover, an MIV violation occurs when the number of MIVs placed in a bin exceeds its maximum MIV capacity.
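The overflow bookkeeping of Equations (5) and (6) and the MIV capacity estimate can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the combined overflow is assumed to be summed over all planar edges and all bins, and the capacity is assumed to be rounded down to an integer.

```python
def overflow(s, m):
    """Equation (5): demand s above capacity m, or 0 if within capacity."""
    return s - m if s > m else 0


def combined_overflow(edges, bins, alpha, beta):
    """Equation (6), assumed here to be summed over all edges and bins.

    edges: (s, m) pairs for x- and y-directional edges.
    bins:  (s, m) pairs for the MIV usage and MIV capacity of each bin.
    """
    planar = sum(overflow(s, m) for s, m in edges)
    miv = sum(overflow(s, m) for s, m in bins)
    return alpha * planar + beta * miv


def miv_capacity(bin_area, instance_area, miv_pitch_area):
    """Free bin area divided by the MIV pitch area (floor is an assumption)."""
    free_area = max(bin_area - instance_area, 0.0)
    return int(free_area // miv_pitch_area)


# Hypothetical numbers: two planar edges and one bin.
total = combined_overflow(edges=[(5, 3), (2, 4)], bins=[(7, 4)], alpha=1.0, beta=2.0)
cap = miv_capacity(bin_area=100.0, instance_area=40.0, miv_pitch_area=8.0)
print(total, cap)
```

A router that selects among several MMRSMTs for a net would evaluate this combined overflow for each candidate and keep the one with the smallest value.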

We route all the 2D and 3D nets of each design using two routing methodologies similar to the BF- and ARD-based routing methodologies used in Section 5. We route each net as follows:

2D nets (\(\le\) 8 pins): BF uses the FLUTE DB, so it finds only one RSMT for a 2D net. ARD uses the ARSMT DB, so it finds all RSMTs for a 2D net and selects the best one minimizing the overflow.

2D nets (\(\gt\) 8 pins): Both BF and ARD use the net-breaking technique proposed in FLUTE [2]. The net-breaking decomposes a high-degree net into multiple low-degree nets, uses the FLUTE DB to find an RSMT for each low-degree net, and inserts some additional (Steiner) points to connect the low-degree nets. Thus, BF and ARD use only one RSMT for a 2D net in this case.

3D nets (\(\le\) 6 pins): BF uses the ARD, but it finds only one MMRSMT in the DB and uses it for a 3D net. ARD also uses the ARD and finds all MMRSMTs for a 3D net and selects the best one minimizing the overflow.

3D nets (\(\gt\) 6 pins): Both BF and ARD use the net-breaking technique described in Section 6.2.

However, we made an exception for the six-pin four-tier case and used the net-breaking technique described in Section 6.2 for both BF and ARD due to memory limitations. Moreover, we also routed all the nets of each design using MLOARST construction algorithms [8] to assess the efficacy of the ARD-based routing approach.

Note that global routing is a coarse-level bin-based routing step focusing on constructing routing topologies for a given design on a single routing layer under maximum capacity constraints. On the contrary, detailed routing that includes track assignment is a more fine-grained routing step based on the routing topologies obtained in the global routing step. In brief, since a single routing layer is often used for global routing purposes in the literature, we also consider similar conventions and problem definitions in this research.

6.2 3D Net-Breaking Techniques

Suppose N pins of a 3D net, \(P = \lbrace p_1, p_2,\ldots , p_N\rbrace\), are given and its 3D PS is \(((s_1 s_2 ... s_N), (t_1 t_2 ... t_N))\). If a group of the pins belongs to an octant and the others belong to its opposite octant, we can break the pins into two groups, construct an MMRSMT for each group, and connect the two MMRSMTs using an additional point. Figure 11(a) illustrates the octant pairs geometrically opposite in the 3D space. For example, if \(P_{G1} = \lbrace p_1,\ldots , p_r\rbrace\) belongs to the octant \(x_+y_+z_+\) and \(P_{G2} = \lbrace p_{r+1},\ldots , p_N\rbrace\) belongs to the opposite octant \(x_-y_-z_-\), we can find \(p_i \in P_{G1}\) and \(p_j \in P_{G2}\) closest to the origin. Then, we can insert a point \(p_h\) in the hexahedron constructed with \(p_i\) and \(p_j\) as the two endpoints of the hexahedron. Then, the union of the two MMRSMTs constructed for \(P_{G1} \cup \lbrace p_h\rbrace\) and \(P_{G2} \cup \lbrace p_h\rbrace\) is an MMRSMT for P. Figure 11(b) through (e) demonstrate the four octant pairs, (\(x_{-}y_{-}z_{-}\rightarrow x_{+}y_{+}z_{+}\)), (\(x_{-}y_{-}z_{+}\rightarrow x_{+}y_{+}z_{-}\)), (\(x_{+}y_{-}z_{-}\rightarrow x_{-}y_{+}z_{+}\)), and (\(x_{+}y_{-}z_{+}\rightarrow x_{-}y_{+}z_{-}\)), that can be used for 3D optimal net breaking.

Fig. 11. 3D optimal net-breaking techniques. (a) Four octant pairs opposite to each other in 3D space. The octant pairs in brown, green, cyan, and red are detailed through (b) \(x_{-}y_{-}z_{-}\rightarrow x_{+}y_{+}z_{+}\), (c) \(x_{-}y_{-}z_{+}\rightarrow x_{+}y_{+}z_{-}\), (d) \(x_{+}y_{-}z_{-}\rightarrow x_{-}y_{+}z_{+}\), and (e) \(x_{+}y_{-}z_{+}\rightarrow x_{-}y_{+}z_{-}\), respectively.

Let \(S_{G1}\) and \(S_{G2}\) be the PS values of \(P_{G1}\) and \(P_{G2}\), respectively. Similarly, let \(T_{G1}\) and \(T_{G2}\) be the TS values of \(P_{G1}\) and \(P_{G2}\), respectively. Then, the following inequalities show the conditions for the 3D optimal net breaking: (7) \(\begin{eqnarray} \mathrm{max}(S_{G1}) \le \mathrm{min}(S_{G2}) \ \ \& \ \ \mathrm{max}(T_{G1}) \le \mathrm{min}(T_{G2}), \end{eqnarray}\) (8) \(\begin{eqnarray} \mathrm{max}(S_{G1}) \le \mathrm{min}(S_{G2}) \ \ \& \ \ \mathrm{min}(T_{G1}) \ge \mathrm{max}(T_{G2}), \end{eqnarray}\) (9) \(\begin{eqnarray} \mathrm{min}(S_{G1}) \ge \mathrm{max}(S_{G2}) \ \ \& \ \ \mathrm{max}(T_{G1}) \le \mathrm{min}(T_{G2}), \end{eqnarray}\) (10) \(\begin{eqnarray} \mathrm{min}(S_{G1}) \ge \mathrm{max}(S_{G2}) \ \ \& \ \ \mathrm{min}(T_{G1}) \ge \mathrm{max}(T_{G2}), \end{eqnarray}\) where \(\mathrm{max}(A)\) and \(\mathrm{min}(A)\) find the maximum and minimum elements in A, respectively, and Inequalities (7) through (10) correspond to the cases shown in Figure 11(b) through (e), respectively. The figures also show the new Steiner points inserted for the 3D optimal net breaking. Note that in Figure 11(b) through (e), dots in red, cyan, green, and black indicate nodes on Tiers 3, 2, 1, and 0, respectively.
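The conditions in Inequalities (7) through (10) translate directly into a small test on the PS and TS of a net. The Python sketch below is a minimal illustration. It assumes, following the example \(P_{G1} = \lbrace p_1,\ldots , p_r\rbrace\) and \(P_{G2} = \lbrace p_{r+1},\ldots , p_N\rbrace\), that the two groups are a prefix and a suffix of the sequence, and it simply tries every split index r. The function names are hypothetical.

```python
def break_condition(s1, s2, t1, t2):
    """Return 7, 8, 9, or 10 for the inequality pair that holds, else None."""
    s_le = max(s1) <= min(s2)   # PS part of Inequalities (7) and (8)
    s_ge = min(s1) >= max(s2)   # PS part of Inequalities (9) and (10)
    t_le = max(t1) <= min(t2)   # TS part of Inequalities (7) and (9)
    t_ge = min(t1) >= max(t2)   # TS part of Inequalities (8) and (10)
    if s_le and t_le:
        return 7
    if s_le and t_ge:
        return 8
    if s_ge and t_le:
        return 9
    if s_ge and t_ge:
        return 10
    return None


def find_optimal_break(ps, ts):
    """Try each split index r; return (r, condition) or None if none works.

    Assumption: G1 holds the first r pins of the sequence and G2 the rest.
    """
    for r in range(1, len(ps)):
        cond = break_condition(ps[:r], ps[r:], ts[:r], ts[r:])
        if cond is not None:
            return r, cond
    return None


print(find_optimal_break((1, 2, 3, 4), (0, 0, 1, 1)))
print(find_optimal_break((3, 4, 1, 2), (1, 1, 0, 0)))
print(find_optimal_break((2, 4, 1, 3), (0, 1, 0, 1)))
```

When the search returns `None`, the net cannot be broken by this test, and a fallback such as breaking the 2D projection of the pins is needed.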

Notice that a 3D net cannot be optimally broken if we cannot find \(P_{G1}\) and \(P_{G2}\) satisfying any of the inequality pairs in (7) through (10). In this case, we first project all the pins of the 3D net onto the xy plane and perform 2D optimal net breaking for the projected pins. If this is successful, there will be two groups of the projected pins, so we construct an MMRSMT for each pin group and connect the two MMRSMTs. If this is not successful, however, we use the 2D net-breaking heuristics for the projected pins [2], construct an MMRSMT for each pin group, and connect all the MMRSMTs.

6.3 Simulation Results

Table 4 compares the planar overflows and the number of MIV violations of the congestion-aware 3D global routing by the BF- and ARD-based routing. The number of global nets shows the total number of nets routed optimally (by 2D RSMTs and MMRSMTs) or nonoptimally, whereas the number of routed nets shows the total number of nets routed optimally. For all the designs, most of the nets (around 94% on average) are routed optimally because they are low-degree nets. Table 5 shows the details of the # “k-Pin” nets for each “k.”

# TBench.# Bins per tier# Nets# Planar overflowsAvg. overflowMax. overflow# MIV violationsRT (s)
GlobalRoutedBFARDBFARDBFARDBFARD
2adaptec1140\(\times\)140152.4k142.03k (0.93)48721485 (0.3)0.060.0217158685356 (0.52)4.12
adaptec2184\(\times\)184165.04k153.46k (0.93)200055470 (0.27)0.150.0423961885486 (0.55)5.03
adaptec3289\(\times\)287303.61k286.45k (0.94)444373000 (0.07)0.130.0124070980498 (0.51)10.62
adaptec4282\(\times\)285322.19k308.03k (0.96)114761183 (0.1)0.040145621947621 (0.32)8.68
adaptec5289\(\times\)287517.4k487.8k (0.94)9086118256 (0.2)0.270.063381431544697 (0.45)17.4
bigblue1140\(\times\)140187.19k174.48k (0.93)230069196 (0.4)0.30.12223198962451 (0.47)5.14
bigblue2233\(\times\)232334.71k318.89k (0.95)00 (0)0000729580 (0.8)8.42
bigblue3344\(\times\)343601.03k569.08k (0.95)516833220542 (0.43)1.10.47136049832041564 (0.49)18.29
newblue1148\(\times\)148195.92k187.13k (0.96)101601962 (0.19)0.120.0214968686232 (0.34)4.55
newblue2286\(\times\)344353.82k332.82k (0.94)519503103129 (0.2)1.320.2679762731491563 (0.5)8.68
newblue4226\(\times\)225432.65k410.11k (0.95)3118910346 (0.33)0.150.052189529141392 (0.48)10.89
newblue5309\(\times\)314678.91k622.5k (0.92)12524864738 (0.52)0.320.1753130119001078 (0.57)29.92
newblue6343\(\times\)340836.87k785.75k (0.94)4301120794 (0.48)0.090.04263173881297 (0.34)30.47
Geo. mean0.940.250.47
3adaptec1116\(\times\)116151.89k141.28k (0.93)60081440 (0.24)0.080.02131621140682 (0.6)4.07
adaptec2152\(\times\)152164.93k153.43k (0.93)153612190 (0.14)0.110.02136531402864 (0.62)5.99
adaptec3236\(\times\)235303.25k285.45k (0.94)10541621717 (0.21)0.320.076412441951794 (0.41)9.31
adaptec4231\(\times\)233318.46k304.27k (0.96)9605697 (0.07)0.030110501812678 (0.37)7.44
adaptec5236\(\times\)235515.16k484.16k (0.94)10625523917 (0.23)0.320.0735015030501835 (0.6)16.6
bigblue1116\(\times\)116182.98k170.35k (0.93)226909905 (0.44)0.280.122231311485720 (0.48)5.08
bigblue2190\(\times\)189333.12k317.12k (0.95)183 (0.17)00122951591 (0.62)7.71
bigblue3281\(\times\)280600.61k567.58k (0.95)540436132427 (0.25)1.150.2888030984795248 (0.62)33.26
newblue1123\(\times\)123194.36k185.79k (0.96)148503612 (0.24)0.160.0415689998432 (0.43)3.58
newblue2234\(\times\)281355.91k334.71k (0.94)566695125180 (0.22)1.440.3280972442532628 (0.62)10.3
newblue4185\(\times\)184429.16k406.88k (0.95)3057715436 (0.5)0.150.0824814531591257 (0.4)10.59
newblue5258\(\times\)257694.69k636.79k (0.92)315083123946 (0.39)0.80.3177226224811218 (0.49)29.68
newblue6281\(\times\)278829.96k778.91k (0.94)2815111063 (0.39)0.060.021641182317997 (0.43)31.5
Geo. mean0.940.240.51
4adaptec1101\(\times\)101152.77k142.04k (0.93)42301058 (0.25)0.050.01187811724956 (0.55)3.73
adaptec2133\(\times\)133162.43k150.92k (0.93)121406572 (0.54)0.090.0528716519471202 (0.62)5.04
adaptec3204\(\times\)203302.21k283.98k (0.94)9517119696 (0.21)0.290.0647030734481765 (0.51)7.9
adaptec4204\(\times\)203315.78k301.58k (0.96)182302388 (0.13)0.060.01164761791605 (0.34)7.07
adaptec5204\(\times\)203515.64k483.84k (0.94)9275214116 (0.15)0.280.0440912739572210 (0.56)14.91
bigblue1101\(\times\)101182.69k169.35k (0.93)208335945 (0.29)0.260.0727111722121169 (0.53)4.25
bigblue2165\(\times\)164331.8k315.75k (0.95)227 (0.32)001561400856 (0.61)7.13
bigblue3254\(\times\)253591.64k559.22k (0.95)448410170041 (0.38)0.880.33188067071404310 (0.6)16.5
newblue1112\(\times\)112192.63k183.94k (0.95)136482495 (0.18)0.140.03196711256401 (0.32)3.57
newblue2211\(\times\)254355.85k334.26k (0.94)548915100936 (0.18)1.290.2492765567124145 (0.62)9.67
newblue4160\(\times\)159429.9k406.98k (0.95)3499215425 (0.44)0.170.0820813337271864 (0.5)9.54
newblue5224\(\times\)223682.7k625.42k (0.92)15001379262 (0.53)0.380.253135942682537 (0.59)24.95
newblue6243\(\times\)242829.69k778.15k (0.94)312217641 (0.24)0.070.022598827191188 (0.44)25.49
Geo. mean0.940.270.51
Table 4. Comparison of the Planar Edge Overflows and the # MIV Violations of Congestion-Aware 3D Global Routing by BF- and ARD-Based Routing

  • “# T” denotes the number of tiers, “# global nets” is the total number of nets routed optimally (2D RSMTs and MMRSMTs) or nonoptimally, “# routed nets” is the total number of nets routed optimally, and “RT” denotes the runtime which is the global routing time of the ARD-based routing.

# TBench.# 2D “k-Pin” nets# 3D “k-Pin” nets
“2”“3”“4”“5”“6”“7”“8”“>8”“2”“3”“4”“5”“6”“>6”
2atec189184244071022350823799275721069249178912757224422391125
atec210431420381841353163618262619329488394612868614343342089
atec31951044237219505108536517378028181711042267552032704944
atec421308549957182878239486032212287141257345431229602442
atec532751572655331771822712631893264942809660439905324091941507
bbl11109222701812647780551353115194010510358710545294213102196
bbl22319564758620382741741182738228315672186250816118151
bbl340273674093345751782195786074406224519121973714174414849977437
nbl112901727026113125955397026441921781131581118492302214981
nbl22071756622522466987252993253240310465751150841560115783710509
nbl426600165972281151528295816651510620473975418427996153922065
nbl545813774959333501798012238882873855616865701876698367112239
nbl653349411918156056298431917413514978451033360447230822210086
3atec188329237101006550453531263420318578270114828025214312028
atec2104077196358449537037552750199510142431217156914522331352
atec31905464120718665100366086366725301455076772113136110025613251
atec42100264863917956788447443113216813684817786841917894509
atec53216536930831426173241112684655970249571036336522187168513056040
bbl11072932605612120760451283078195410051460812065644013352582
bbl22299864681320232741941642656214015283242284221414981721
bbl33904587061032805166969424570036172158724089688034212363152011442
nbl112688026436112995940386628042001786445211228424210178706
nbl2207022655522199197915258324023299618810465332354156397211583
nbl426417464030281691489891986671509920880958128599698523801401
nbl546580775718333351802411977880370805253496383390147710405015369
nbl65228281160445458328454184011312096874876710808248812497654792285
4atec187345237509938483035542647201180843809197110836844352627
atec29965618864780149233533260820198929668325479708575022536
atec31876293922718017976457583314233713553103723596181713907694671
atec42086654811117620775945703030232312916703413985932921891283
atec53195166816230759168621100780255758236791262646472955204515108090
bbl110566924802113186785461325561699897862072455130310858634359
bbl22270854647520141731339792488220214403347215255623431631646
bbl33868507076332351164579254564436452050721385631133692075130611725
nbl1124625255871088857133983262418827583551516188284042731104
nbl220141464133211319120515831482128865913750828831691868112312759
nbl426240463259264821408587776408504619106116364096226016408893808
nbl5454105728753220817421117488482693849661131114447197814356927597
nbl652176111419453849284621835512798965646655116473677178312157574879
Table 5. Details of the Benchmarks of the Two-, Three-, and Four-Tier Designs of Bin-Based 3D Global Routing Showing # 2D and 3D “k-Pin” Nets Separately for Each “k”

  • “atec,” “bbl,” and “nbl” denote “adaptec,” “bigblue,” and “newblue,” respectively.

For the two-tier designs, the ARD-to-BF ratios of the total number of planar overflows and of the number of MIV violations are 0.25 and 0.47 on average, respectively, which demonstrates that the use of ARD can reduce the routing congestion effectively. ARD is effective because it minimizes planar overflow while distributing MIVs according to available whitespace. For each net, the minimum planar wirelength is used and then MIVs are minimized. The average planar overflows (# planar overflows per edge capacity) and the maximum planar overflows of ARD are also lower than those of BF by 48% to 93% and 11% to 74%, respectively. Furthermore, ARD has 20% to 68% fewer MIV violations than BF. Note that a higher number of tiers means more bins accepting MIVs, which boosts the overall capacity of MIVs. Consequently, we expect fewer MIV violations. Nevertheless, the # MIVs will also rise due to the increasing number of 3D wires. Thus, as more tiers are added, the # MIV violations may either increase or decrease.

The runtime of the BF approach is less than 1 second for small benchmarks and at most 2 seconds for the largest design, whereas ARD takes 4 to 30 seconds to route all the nets. The runtime overhead is negligible compared with the large overflow reduction, which shows the effectiveness of the ARD for routing congestion minimization. We observe similar trends in the three- and four-tier designs. The planar overflow ratio between the BF and ARD designs is 0.24 for the three-tier designs and 0.27 for the four-tier designs on average, with 49% fewer MIV violations on average in both cases. ARD still achieves 11% to 83% lower maximum planar overflows with 38% to 63% fewer MIV violations for the three-tier designs and 29% to 69% lower maximum planar overflows with 38% to 68% fewer MIV violations for the four-tier designs.

Table 6 compares the planar overflows and the number of MIV violations of congestion-aware 3D global routing by the MLOARST construction algorithms (denoted by MR) and ARD-based routing. For the two-, three-, and four-tier designs, the ratios of the total number of planar overflows between MR and ARD are 0.26, 0.25, and 0.25 on average, respectively, and the ratios of the number of MIV violations are 0.18, 0.16, and 0.16 on average, respectively. ARD's average and maximum planar overflows are consistently lower, with significantly fewer MIV violations, than those of MR. This is because, even with the minimized planar wirelength and MIV count, MR generates only one routing topology per net, whereas ARD can choose among multiple topologies to reduce routing congestion. The only exception is the four-tier bigblue2 design, for which ARD shows slightly higher planar and maximum overflows than MR. This is because nets are routed sequentially in the ARD-based flow: routing topologies are chosen to minimize the total overflow, which ultimately depends on the net routing order. Even in this case, ARD has significantly fewer MIV violations than MR.

Table 6.
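| Benchmark | # Planar overflows, 2 tier MR | 2 tier ARD | 3 tier MR | 3 tier ARD | 4 tier MR | 4 tier ARD | Avg. overflow, 2 tier MR | 2 tier ARD | 3 tier MR | 3 tier ARD | 4 tier MR | 4 tier ARD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| adaptec1 | 4784 | 1485 (0.31) | 5067 | 1440 (0.28) | 7862 | 1058 (0.13) | 0.06 | 0.02 | 0.06 | 0.02 | 0.1 | 0.01 |
| adaptec2 | 10001 | 5470 (0.55) | 13656 | 2190 (0.16) | 11685 | 6572 (0.56) | 0.07 | 0.04 | 0.1 | 0.02 | 0.08 | 0.05 |
| adaptec3 | 63903 | 3000 (0.05) | 106728 | 21717 (0.2) | 93848 | 19696 (0.21) | 0.19 | 0.01 | 0.32 | 0.07 | 0.28 | 0.06 |
| adaptec4 | 14890 | 1183 (0.08) | 13344 | 697 (0.05) | 12913 | 2388 (0.18) | 0.05 | 0 | 0.04 | 0 | 0.04 | 0.01 |
| adaptec5 | 94377 | 18256 (0.19) | 100210 | 23917 (0.24) | 87057 | 14116 (0.16) | 0.29 | 0.06 | 0.3 | 0.07 | 0.26 | 0.04 |
| bigblue1 | 17935 | 9196 (0.51) | 18675 | 9905 (0.53) | 16869 | 5945 (0.35) | 0.23 | 0.12 | 0.23 | 0.12 | 0.21 | 0.07 |
| bigblue2 | 0 | 0 | 25 | 3 (0.12) | 0 | 7 | 0 | 0 | 0 | 0 | 0 | 0 |
| bigblue3 | 535776 | 220542 (0.41) | 585782 | 132427 (0.23) | 435857 | 170041 (0.39) | 1.14 | 0.47 | 1.25 | 0.28 | 0.85 | 0.33 |
| newblue1 | 7536 | 1962 (0.26) | 8722 | 3612 (0.41) | 5441 | 2495 (0.46) | 0.09 | 0.02 | 0.1 | 0.04 | 0.05 | 0.03 |
| newblue2 | 518172 | 103129 (0.2) | 555603 | 125180 (0.23) | 490522 | 100936 (0.21) | 1.32 | 0.26 | 1.41 | 0.32 | 1.15 | 0.24 |
| newblue4 | 34098 | 10346 (0.3) | 32655 | 15436 (0.47) | 29504 | 15425 (0.52) | 0.17 | 0.05 | 0.16 | 0.08 | 0.15 | 0.08 |
| newblue5 | 132664 | 64738 (0.49) | 246577 | 123946 (0.5) | 1316436 | 79262 (0.06) | 0.34 | 0.17 | 0.62 | 0.31 | 3.31 | 0.2 |
| newblue6 | 41979 | 20794 (0.5) | 24846 | 11063 (0.45) | 26456 | 7641 (0.29) | 0.09 | 0.04 | 0.05 | 0.02 | 0.06 | 0.02 |
| Geo. mean | | 0.26 | | 0.25 | | 0.25 | | | | | | |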
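
| Benchmark | # MIV violations, 2 tier MR | 2 tier ARD | 3 tier MR | 3 tier ARD | 4 tier MR | 4 tier ARD | Max. overflow, 2 tier MR | 2 tier ARD | 3 tier MR | 3 tier ARD | 4 tier MR | 4 tier ARD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| adaptec1 | 1975 | 356 (0.18) | 3202 | 682 (0.21) | 4887 | 956 (0.2) | 173 | 58 | 74 | 62 | 255 | 81 |
| adaptec2 | 2857 | 486 (0.17) | 3195 | 864 (0.27) | 5479 | 1202 (0.22) | 183 | 61 | 142 | 53 | 288 | 165 |
| adaptec3 | 2392 | 498 (0.21) | 7215 | 794 (0.11) | 10097 | 1765 (0.17) | 467 | 70 | 617 | 244 | 482 | 307 |
| adaptec4 | 3222 | 621 (0.19) | 5082 | 678 (0.13) | 5674 | 605 (0.11) | 142 | 62 | 106 | 50 | 136 | 76 |
| adaptec5 | 3908 | 697 (0.18) | 10661 | 1835 (0.17) | 13064 | 2210 (0.17) | 359 | 143 | 359 | 150 | 359 | 127 |
| bigblue1 | 3053 | 451 (0.15) | 3881 | 720 (0.19) | 6360 | 1169 (0.18) | 225 | 198 | 241 | 131 | 206 | 117 |
| bigblue2 | 1846 | 580 (0.31) | 3557 | 591 (0.17) | 5769 | 856 (0.15) | 0 | 0 | 14 | 2 | 0 | 6 |
| bigblue3 | 8581 | 1564 (0.18) | 15853 | 5248 (0.33) | 18712 | 4310 (0.23) | 1035 | 498 | 1042 | 309 | 1450 | 670 |
| newblue1 | 2270 | 232 (0.1) | 2790 | 432 (0.15) | 4310 | 401 (0.09) | 107 | 68 | 90 | 89 | 81 | 71 |
| newblue2 | 12278 | 1563 (0.13) | 18429 | 2628 (0.14) | 22293 | 4145 (0.19) | 797 | 627 | 827 | 724 | 794 | 655 |
| newblue4 | 6186 | 1392 (0.23) | 7221 | 1257 (0.17) | 9288 | 1864 (0.2) | 217 | 95 | 249 | 145 | 215 | 133 |
| newblue5 | 4845 | 1078 (0.22) | 11272 | 1218 (0.11) | 16465 | 2537 (0.15) | 553 | 301 | 761 | 262 | 563 | 359 |
| newblue6 | 2173 | 297 (0.14) | 9231 | 997 (0.11) | 11076 | 1188 (0.11) | 283 | 173 | 161 | 118 | 238 | 88 |
| Geo. mean | | 0.18 | | 0.16 | | 0.16 | | | | | | |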

Table 6. Comparison of the Planar Edge Overflows and the # MIV Violations of Congestion-Aware 3D Global Routing by MR- and ARD-Based Routing
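The "Geo. mean" row of Table 6 can be sanity-checked with a few lines of Python. The sketch below is only an illustration: the function name `geometric_mean` is ours, the input is the list of rounded two-tier ARD/MR planar-overflow ratios printed in parentheses in the table, and we assume that bigblue2 (which has no two-tier overflow for either router and therefore no ratio) is left out of the mean.

```python
import math

def geometric_mean(ratios):
    """Geometric mean of strictly positive ratios (non-positive entries are skipped)."""
    positive = [r for r in ratios if r > 0]
    return math.exp(sum(math.log(r) for r in positive) / len(positive))

# Rounded two-tier ARD/MR planar-overflow ratios as printed in Table 6.
# Assumption: bigblue2 (0 overflows for both routers) contributes no ratio.
two_tier_ratios = [0.31, 0.55, 0.05, 0.08, 0.19, 0.51,
                   0.41, 0.26, 0.20, 0.30, 0.49, 0.50]

print(round(geometric_mean(two_tier_ratios), 2))
```

Because the inputs are already rounded to two digits, the result can differ slightly from a mean computed on the raw overflow counts.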

Moreover, Table 7 shows a detailed comparison of the # 3D nets, the total Half-Perimeter Wirelength (HPWL), the MIV distribution on different tiers, and the total # MIVs for the two-, three-, and four-tier designs for all the benchmarks. Except for adaptec4 and bigblue3, the number of 3D nets increases with the number of tiers. The total HPWL is computed as the HPWL of the 2D nets plus the HPWL of the 2D projection of the 3D nets. The HPWL decreases from the two-tier design to the four-tier design for all the benchmarks except for adaptec2, bigblue3, newblue2, and newblue5, which depends purely on the placement locations of the instances. We obtained the placement results from the 3D placer [15], which showed a similar HPWL trend. Note that all the routed designs have the minimum planar wirelength. Furthermore, the number of MIVs increases from two-tier to three-tier designs and from three-tier to four-tier designs except for the adaptec4 and bigblue3 benchmarks. The reason is that, as discussed earlier, the three-tier designs have more 3D nets than the four-tier designs for these benchmarks. Ideally, the # MIVs should exceed the # 3D nets. Moreover, in most cases, two-tier nets dominate the set of 3D nets. The MIV distribution on each tier is therefore related to the distribution of 3D nets between that tier and its lower neighboring tier.
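The HPWL bookkeeping described above can be sketched as follows. This is a minimal illustration under our own assumptions: pins are given as `(x, y)` for 2D nets or `(x, y, tier)` for 3D nets, the function names `hpwl_2d` and `total_hpwl` are hypothetical, and the tier coordinate is simply ignored so that a 3D net contributes the HPWL of its 2D projection.

```python
def hpwl_2d(pins):
    """Half-perimeter of the x-y bounding box of a net.

    Each pin is (x, y) or (x, y, tier); any tier coordinate is ignored,
    which amounts to projecting a 3D net onto a single plane.
    """
    xs = [p[0] for p in pins]
    ys = [p[1] for p in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_hpwl(nets):
    """Sum of the projected HPWL over all 2D and 3D nets."""
    return sum(hpwl_2d(net) for net in nets)

nets = [
    [(0, 0), (3, 4)],                    # a 2D net
    [(1, 1, 0), (5, 2, 1), (2, 6, 2)],   # a 3D net spanning three tiers
]
print(total_hpwl(nets))
```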

Table 7.
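| Benchmark | # 3D nets, 2 tier | 3 tier | 4 tier | HPWL, 2 tier | 3 tier | 4 tier |
|---|---|---|---|---|---|---|
| adaptec1 | 4467 | 5937 | 7968 | 63.74 | 58.41 | 55.63 |
| adaptec2 | 6861 | 7403 | 11520 | 78.25 | 81.41 | 66.92 |
| adaptec3 | 5503 | 12714 | 17938 | 153.46 | 148.83 | 133.92 |
| adaptec4 | 8089 | 9736 | 9506 | 136.93 | 123.29 | 116.78 |
| adaptec5 | 8168 | 19192 | 23751 | 241.89 | 233.62 | 216.19 |
| bigblue1 | 5901 | 7114 | 11911 | 80.34 | 76.15 | 68.69 |
| bigblue2 | 2405 | 3708 | 6065 | 110 | 97.95 | 91.18 |
| bigblue3 | 20136 | 38273 | 34251 | 334.42 | 483.1 | 317.25 |
| newblue1 | 5284 | 6561 | 8637 | 61.97 | 56.63 | 52.99 |
| newblue2 | 16149 | 19526 | 28027 | 186.99 | 217.63 | 237.50 |
| newblue4 | 13402 | 14641 | 20521 | 177.93 | 166.34 | 157.74 |
| newblue5 | 9623 | 16046 | 21647 | 319.97 | 332.30 | 283.11 |
| newblue6 | 4706 | 15789 | 19077 | 373.24 | 338.53 | 317.19 |

| Benchmark | # MIVs, 2 tier T1 (Total) | 3 tier T1 | 3 tier T2 | 3 tier Total | 4 tier T1 | 4 tier T2 | 4 tier T3 | 4 tier Total |
|---|---|---|---|---|---|---|---|---|
| adaptec1 | 4695 | 3428 | 3484 | 6912 (1.47) | 3429 | 4171 | 2781 | 10381 (2.21) |
| adaptec2 | 7144 | 4577 | 3833 | 8410 (1.18) | 3274 | 7380 | 4149 | 14803 (2.07) |
| adaptec3 | 5523 | 10826 | 4108 | 14934 (2.70) | 9735 | 9849 | 3854 | 23438 (4.24) |
| adaptec4 | 8114 | 8684 | 4503 | 13187 (1.63) | 7821 | 3791 | 1305 | 12917 (1.59) |
| adaptec5 | 8250 | 17815 | 5466 | 23281 (2.82) | 18339 | 6779 | 2787 | 27905 (3.38) |
| bigblue1 | 6153 | 5117 | 2749 | 7866 (1.28) | 5987 | 6520 | 2223 | 14730 (2.39) |
| bigblue2 | 2419 | 2798 | 1547 | 4345 (1.80) | 4322 | 1873 | 818 | 7013 (2.90) |
| bigblue3 | 20861 | 23503 | 21512 | 45015 (2.16) | 13380 | 18799 | 11851 | 44030 (2.11) |
| newblue1 | 5499 | 2023 | 4741 | 6764 (1.23) | 1888 | 4834 | 3528 | 10250 (1.86) |
| newblue2 | 16807 | 13539 | 11873 | 25412 (1.51) | 15604 | 15145 | 7079 | 37828 (2.25) |
| newblue4 | 13834 | 9722 | 6425 | 16147 (1.17) | 11455 | 7708 | 4483 | 23646 (1.71) |
| newblue5 | 9823 | 9651 | 8074 | 17725 (1.80) | 11856 | 6881 | 7421 | 26158 (2.66) |
| newblue6 | 4771 | 12626 | 6789 | 19415 (4.07) | 15011 | 4956 | 2431 | 22398 (4.69) |
| Geo. mean \(\left(\# \text{MIVs (X-tier)} \over \# \text{MIVs (2-tier)}\right)\) | | | | 1.77 | | | | 2.48 |

  • The HPWL includes both 2D and 3D nets and is measured in meters.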

Table 7. Comparison of the # 3D Nets, HPWL, the MIV Distribution on Different Tiers, and the Total # MIVs for the Two-, Three-, and Four-Tier Designs for All Benchmarks of Congestion-Aware 3D Global Routing


In conclusion, controlling the MIV density is crucial for reducing the MIV violations, which are not trivial, and it should be considered after minimizing the planar wirelength in congestion-aware 3D global routing. In addition, by integrating our ARD into ePlace-3D, we anticipate that the 3D placement engine will be able to better manage the # 3D nets, the overall HPWL, and the vertical interconnects.

6.4 Proposed Approach: Five Tiers and Above

Suppose we have an ARD constructed for two, three, and four tiers. For more than four tiers, we can construct 3D routing topologies using the ARD as follows (considering a five-tier case):

First, we begin the routing topology construction by projecting all the pins in the tiers above Tier 3 onto Tier 3. Now, all the pins are located in four tiers, so we can use the ARD to find all MMRSMTs for the (projected) pin locations. Then, we move the projected pins back to their original tiers. When we expand them, we insert vertical edges to connect the projected pins and the MMRSMTs. Of course, we can project the pins in many different ways (e.g., project the pins in the tiers below Tier 1 onto Tier 1), which will help generate many different routing topologies.
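The projection and expansion steps can be sketched as follows. This is only an illustration under stated assumptions: pins are `(x, y, tier)` tuples with tiers numbered from 0, the function name `project_pins` and the parameter `top_tier` are hypothetical, and the ARD lookup that would return the MMRSMTs for the projected pins is not shown. The sketch records, for each projected pin, the vertical edge that has to be inserted when the pin is moved back to its original tier.

```python
from typing import List, Tuple

Pin = Tuple[int, int, int]            # (x, y, tier), tiers numbered from 0
VerticalEdge = Tuple[Pin, Pin]        # (projected location, original location)

def project_pins(pins: List[Pin], top_tier: int = 3):
    """Project every pin above `top_tier` down onto `top_tier`.

    Returns the projected pin list (now within tiers 0..top_tier, so a
    four-tier database could be queried) and the vertical edges needed to
    reconnect each projected pin to its original tier afterwards.
    """
    projected: List[Pin] = []
    vertical_edges: List[VerticalEdge] = []
    for x, y, tier in pins:
        if tier > top_tier:
            projected.append((x, y, top_tier))
            vertical_edges.append(((x, y, top_tier), (x, y, tier)))
        else:
            projected.append((x, y, tier))
    return projected, vertical_edges

# Five-tier example (tiers 0..4): two pins sit in Tier 4.
pins = [(1, 5, 0), (4, 2, 2), (3, 3, 4), (6, 1, 4)]
projected, vertical_edges = project_pins(pins)
print(projected)
print(vertical_edges)
```

Projecting in the other direction (pins below Tier 1 onto Tier 1) would follow the same pattern with the comparison reversed.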

One problem with this methodology is that it does not use planar wires in the tiers above Tier 3, which might cause routing congestion in Tier 0 through Tier 3. This problem can be addressed in several ways, such as the following:

  • Reassign planar edges after the vertical expansion step: Moving a planar edge to a neighboring tier and adjusting the locations of the relevant vertical edges could result in a new 3D routing topology without any overhead. For example, if an end point of a planar edge in Tier k is connected to a vertical edge connecting Tier k and Tier (k+1), we can move the planar edge to Tier (k+1) without any overhead.

  • Run the ARD construction algorithm (Algorithm 1) for each 2D POST for a given net: Since Algorithm 1 can be applied to any number of tiers, this methodology can find all MMRSMTs for the given pin locations (as long as the number of pins is less than 10). The drawback of this methodology is the runtime overhead. Although the runtime of Algorithm 1 for a single 3D net could be small, running it for many 3D nets might take a long time. Thus, we can apply this methodology only to timing-critical 3D nets and use the other methodologies for most of the other 3D nets.

7 CONCLUSION

Routing 3D nets in M3D IC layouts that use tiny MIVs requires distributing 2D wires evenly while minimizing the planar wirelength and the number of MIVs simultaneously in each tier. In this article, we proposed an algorithm for building a DB of 3D POSTs that helps generate MMRSMTs swiftly to route M3D ICs, optimizing both 2D and 3D interconnects. The DB size is manageable for up to four-tier six-pin 3D POSTs. We applied the DB to construct timing-driven 3D routing topologies minimizing SCSLs and SCSDs. We also proposed a 3D optimal net-breaking technique and performed congestion-aware global routing on 3D designs, minimizing the congestion cost. Both applications demonstrated the efficacy of using the ARD. We anticipate that the proposed algorithm and the DB of the 3D POSTs will help various VLSI CAD algorithms effectively optimize 3D IC layouts and serve as a baseline for future M3D IC routing algorithms.

APPENDIX

A PS ALGEBRA

A.1 Definition

A sequence of size n is an n-tuple, \((a_1,\ldots , a_n)\). A 2D position sequence (PS) of size n is a sequence \((a_1,\ldots , a_n)\) of natural numbers such that \(1 \le a_k \le n\) and \(a_i \ne a_j\) if \(i \ne j\). A tier sequence (TS) of size n is a sequence \((a_1,\ldots , a_n)\) such that \(a_k \in \lbrace 0, 1,\ldots , t-1\rbrace\) for \(1 \le k \le n\) and there exists at least one \(a_k\) for each \(i \in \lbrace 0, t-1\rbrace\) such that \(a_k = i\) (t is the number of tiers and is given). A 3D position sequence of size n, \(\Lambda = (\Gamma _1, \Gamma _2)\), is a pair of a 2D PS \(\Gamma _1\) of size n and a TS \(\Gamma _2\) of size n. In this article, a PS generally means a 2D PS. We define \(\Gamma _{(y_{\pm },x_{\pm })}\) and \(\Gamma _{(x_{\pm },y_{\pm })}\) as described in Section 2.1.3. We also define \(\Lambda _{(y_{\pm },x_{\pm }, z_{\pm })}\) and \(\Lambda _{(x_{\pm },y_{\pm }, z_{\pm })}\) as described in Section 3.2.2.
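The two definitions translate directly into validity checks. The following is a minimal sketch with hypothetical function names; for the TS check we read the set \(\lbrace 0, t-1\rbrace\) literally, that is, we assume that only the bottom tier 0 and the top tier t-1 are required to appear.

```python
def is_position_sequence(seq):
    """True if seq is a permutation of 1..n, i.e., a 2D PS of size n."""
    n = len(seq)
    return sorted(seq) == list(range(1, n + 1))

def is_tier_sequence(seq, t):
    """True if every entry lies in 0..t-1 and both tier 0 and tier t-1 occur.

    Assumption: only the two extreme tiers must appear, following the
    set {0, t-1} in the definition as written.
    """
    in_range = all(0 <= a <= t - 1 for a in seq)
    return in_range and 0 in seq and (t - 1) in seq

print(is_position_sequence((3, 1, 5, 4, 2)))   # a valid PS of size 5
print(is_tier_sequence((1, 0, 1, 0, 1), t=2))  # a valid two-tier TS
```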

A.2 Operations and Functions

We define the addition operation for two sequences \(\Gamma _1 = (a_1,\ldots , a_n)\) and \(\Gamma _2 = (b_1,\ldots , b_n)\) as follows: (11) \(\begin{equation} \Gamma _1 + \Gamma _2 = (a_1+b_1,\ldots , a_n+b_n). \end{equation}\) We also define the inversion operation for a sequence \(\Gamma = (a_1,\ldots , a_n)\) as follows: (12) \(\begin{equation} \overline{\Gamma } = (a_n,\ldots , a_1). \end{equation}\) Notice that \(\Gamma _1 + \Gamma _2 = \Gamma _2 + \Gamma _1\) and \(\overline{\overline{\Gamma }} = \Gamma\). The following function maps \(a_i\) in \(\Gamma = (a_1,\ldots , a_n)\) to i: (13) \(\begin{equation} \phi (a_i) = i. \end{equation}\) Applying \(\phi\) to a PS \(\Gamma\) results in a new sequence as follows: (14) \(\begin{equation} \phi (\Gamma) = (\phi (1),\ldots , \phi (n)). \end{equation}\) For example, suppose \(\Gamma = (31542)\). Then, \(\phi (a_1) = \phi (3) = 1\), \(\phi (a_2) = \phi (1) = 2\), \(\phi (a_3) = \phi (5) = 3\), \(\phi (a_4) = \phi (4) = 4\), and \(\phi (a_5) = \phi (2) = 5\). Then, \(\phi (\Gamma) = (25143)\). If \(i \ne j\), then \(a_i \ne a_j\) because \(\Gamma\) is a PS. In addition, \(1 \le \phi (a_i) \le n\), so \(\phi (\Gamma)\) is also a PS. For a 3D PS \(\Lambda = (\Gamma _1, \Gamma _2)\), we denote its 2D PS and TS by \(\mathrm{PS}(\Lambda)\) and \(\mathrm{TS}(\Lambda)\), respectively.
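The operations in (11) through (14) can be written in a few lines. The sketch below uses hypothetical function names (`add`, `invert`, `phi`) and represents sequences as Python tuples; `phi` returns, for each value v in 1..n, the 1-based index at which v occurs in the PS, so it is the inverse permutation of the input.

```python
def add(g1, g2):
    """Element-wise addition of two sequences of the same size, as in (11)."""
    return tuple(a + b for a, b in zip(g1, g2))

def invert(g):
    """Inversion of a sequence, as in (12): (a_1, ..., a_n) -> (a_n, ..., a_1)."""
    return tuple(reversed(g))

def phi(g):
    """Apply phi to a PS, as in (13)-(14): entry v of the result is the index of v in g."""
    index_of = {a: i for i, a in enumerate(g, start=1)}
    return tuple(index_of[v] for v in range(1, len(g) + 1))

gamma = (3, 1, 5, 4, 2)
print(phi(gamma))               # the worked example in the text gives (25143)
print(invert(invert(gamma)))    # double inversion returns the original sequence
```

Since `phi` computes an inverse permutation, applying it twice also returns the original PS.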

A.3 Properties of 2D PSs

PSs have the following properties: (15) \(\begin{eqnarray} \Gamma _{(s,x_+)} + \Gamma _{(s,x_-)} = (n+1,\ldots ,n+1), \end{eqnarray}\) (16) \(\begin{eqnarray} \Gamma _{(s,y_+)} + \Gamma _{(s,y_-)} = (n+1,\ldots ,n+1), \end{eqnarray}\) (17) \(\begin{eqnarray} \overline{\Gamma _{(x_+,r)}} = \Gamma _{(x_-,r)}, \end{eqnarray}\) (18) \(\begin{eqnarray} \overline{\Gamma _{(y_+,r)}} = \Gamma _{(y_-,r)}. \end{eqnarray}\) For the pins in Figure 2(a), for example, \(\Gamma _{(y_+,x_+)} + \Gamma _{(y_+,x_-)} = (66666)\) and \(\overline{\Gamma _{(y_+,x_+)}} = \overline{(31542)} = (24513) = \Gamma _{(y_-,x_+)}\). PSs also have the following properties: (19) \(\begin{equation} \phi (\Gamma _{(s,r)}) = \Gamma _{(r,s)}. \end{equation}\) For example, \(\phi (\Gamma _{(y_+,x_+)}=(31542)) = (25143) = \Gamma _{(x_+,y_+)}\), \(\phi (\Gamma _{(y_+,x_-)}=(35124)) = (34152) = \Gamma _{(x_-,y_+)}\), \(\phi (\Gamma _{(y_-,x_+)}=(24513)) = (41523) = \Gamma _{(x_+,y_-)}\), and \(\phi (\Gamma _{(y_-,x_-)}=(42153)) = (32514) = \Gamma _{(x_-,y_-)}\). Using the properties shown previously, we can generate seven PSs from a PS as shown in Table 8.

Table 8.
| From \ To | \((y_+,x_+)\) | \((y_+,x_-)\) | \((y_-,x_+)\) | \((y_-,x_-)\) | \((x_+,y_+)\) | \((x_+,y_-)\) | \((x_-,y_+)\) | \((x_-,y_-)\) |
|---|---|---|---|---|---|---|---|---|
| \((y_+,x_+)\) | - | \((\overline{a_1},\ldots , \overline{a_n})\) | \((a_n,\ldots , a_1)\) | \((\overline{a_n},\ldots , \overline{a_1})\) | \((b_1,\ldots , b_n)\) | \((\overline{b_1},\ldots , \overline{b_n})\) | \((b_n,\ldots , b_1)\) | \((\overline{b_n},\ldots , \overline{b_1})\) |
| \((y_+,x_-)\) | \((\overline{a_1},\ldots , \overline{a_n})\) | - | \((\overline{a_n},\ldots , \overline{a_1})\) | \((a_n,\ldots , a_1)\) | \((b_n,\ldots , b_1)\) | \((\overline{b_n},\ldots , \overline{b_1})\) | \((b_1,\ldots , b_n)\) | \((\overline{b_1},\ldots , \overline{b_n})\) |
| \((y_-,x_+)\) | \((a_n,\ldots , a_1)\) | \((\overline{a_n},\ldots , \overline{a_1})\) | - | \((\overline{a_1},\ldots , \overline{a_n})\) | \((\overline{b_1},\ldots , \overline{b_n})\) | \((b_1,\ldots , b_n)\) | \((\overline{b_n},\ldots , \overline{b_1})\) | \((b_n,\ldots , b_1)\) |
| \((y_-,x_-)\) | \((\overline{a_n},\ldots , \overline{a_1})\) | \((a_n,\ldots , a_1)\) | \((\overline{a_1},\ldots , \overline{a_n})\) | - | \((\overline{b_n},\ldots , \overline{b_1})\) | \((b_n,\ldots , b_1)\) | \((\overline{b_1},\ldots , \overline{b_n})\) | \((b_1,\ldots , b_n)\) |
| \((x_+,y_+)\) | \((b_1,\ldots , b_n)\) | \((\overline{b_1},\ldots , \overline{b_n})\) | \((b_n,\ldots , b_1)\) | \((\overline{b_n},\ldots , \overline{b_1})\) | - | \((\overline{a_1},\ldots , \overline{a_n})\) | \((a_n,\ldots , a_1)\) | \((\overline{a_n},\ldots , \overline{a_1})\) |
| \((x_+,y_-)\) | \((b_n,\ldots , b_1)\) | \((\overline{b_n},\ldots , \overline{b_1})\) | \((b_1,\ldots , b_n)\) | \((\overline{b_1},\ldots , \overline{b_n})\) | \((\overline{a_1},\ldots , \overline{a_n})\) | - | \((\overline{a_n},\ldots , \overline{a_1})\) | \((a_n,\ldots , a_1)\) |
| \((x_-,y_+)\) | \((\overline{b_1},\ldots , \overline{b_n})\) | \((b_1,\ldots , b_n)\) | \((\overline{b_n},\ldots , \overline{b_1})\) | \((b_n,\ldots , b_1)\) | \((a_n,\ldots , a_1)\) | \((\overline{a_n},\ldots , \overline{a_1})\) | - | \((\overline{a_1},\ldots , \overline{a_n})\) |
| \((x_-,y_-)\) | \((\overline{b_n},\ldots , \overline{b_1})\) | \((b_n,\ldots , b_1)\) | \((\overline{b_1},\ldots , \overline{b_n})\) | \((b_1,\ldots , b_n)\) | \((\overline{a_n},\ldots , \overline{a_1})\) | \((a_n,\ldots , a_1)\) | \((\overline{a_1},\ldots , \overline{a_n})\) | - |

  • From: \(\Gamma _{(s,r)} = (a_1,\ldots , a_n)\). \(\phi (\Gamma _{(s,r)}) = (b_1,\ldots , b_n)\). \(\overline{k} = (n+1)-k\).

Table 8. Congruence Rules
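As an illustration, the first row of Table 8 (starting from \(\Gamma _{(y_+,x_+)}\)) can be applied mechanically. The sketch below is not the authors' implementation; the function names and the string keys are hypothetical, and only the \((y_+,x_+)\) row is encoded. The usage example reuses the sequence \((31542)\) from Section A.3.

```python
def phi(g):
    """Entry v of the result is the 1-based index of v in the PS g."""
    index_of = {a: i for i, a in enumerate(g, start=1)}
    return tuple(index_of[v] for v in range(1, len(g) + 1))

def complement(g):
    """Replace every entry k by (n + 1) - k, the overline operation of Table 8."""
    n = len(g)
    return tuple(n + 1 - k for k in g)

def congruent_ps_from_ypxp(g):
    """Apply the (y+,x+) row of Table 8 to obtain all eight PSs."""
    a = tuple(g)
    b = phi(a)
    return {
        "(y+,x+)": a,
        "(y+,x-)": complement(a),
        "(y-,x+)": a[::-1],
        "(y-,x-)": complement(a)[::-1],
        "(x+,y+)": b,
        "(x+,y-)": complement(b),
        "(x-,y+)": b[::-1],
        "(x-,y-)": complement(b)[::-1],
    }

for name, seq in congruent_ps_from_ypxp((3, 1, 5, 4, 2)).items():
    print(name, seq)
```

The other seven rows of the table could be encoded in the same way, or obtained by first mapping back to \((y_+,x_+)\).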

A.4 Properties of 3D PSs

First of all, the 2D PSs of 3D PSs have the properties shown in (15) through (19). The TSs of 3D PSs have the following properties: (20) \(\begin{eqnarray} \mathrm{TS}(\Lambda _{(s,r,z_+)}) + \mathrm{TS}(\Lambda _{(s,r,z_-)}) = (t-1,\ldots ,t-1), \end{eqnarray}\) (21) \(\begin{eqnarray} \mathrm{TS}(\Lambda _{(s,x_+,w)}) = \mathrm{TS}(\Lambda _{(s,x_-,w)}), \end{eqnarray}\) (22) \(\begin{eqnarray} \mathrm{TS}(\Lambda _{(s,y_+,w)}) = \mathrm{TS}(\Lambda _{(s,y_-,w)}), \end{eqnarray}\) (23) \(\begin{eqnarray} \overline{\mathrm{TS}(\Lambda _{(x_+,r,w)})} = \mathrm{TS}(\Lambda _{(x_-,r,w)}), \end{eqnarray}\) (24) \(\begin{eqnarray} \overline{\mathrm{TS}(\Lambda _{(y_+,r,w)})} = \mathrm{TS}(\Lambda _{(y_-,r,w)}). \end{eqnarray}\) For the pins in Figure 8, for example, \(\mathrm{TS}(\Lambda _{(y_+,x_+,z_+)}) + \mathrm{TS}(\Lambda _{(y_+,x_+,z_-)}) = (11111)\), \(\mathrm{TS}(\Lambda _{(y_+,x_+,z_-)}) = (10101) = \mathrm{TS}(\Lambda _{(y_+,x_-,z_-)})\), and \(\overline{\mathrm{TS}(\Lambda _{(x_+,y_-,z_-)})} = \overline{(01101)} = (10110) = \mathrm{TS}(\Lambda _{(x_-,y_-,z_-)})\). Table 8 and the properties of TSs in (20) through (24) can be used for the congruence mapping of 2D and 3D PSs.
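The TS properties can be checked on the example values quoted above. The sketch below uses hypothetical function names: `flip_z` models the change from \(z_+\) to \(z_-\) in (20) by replacing each tier index a with (t-1)-a, and `invert` models (23) and (24), where flipping the direction of the first axis reverses the TS. By (21) and (22), flipping the direction of the second axis leaves the TS unchanged, so no function is needed for it.

```python
def flip_z(ts, t):
    """TS after reversing the z direction: each tier a becomes (t - 1) - a."""
    return tuple(t - 1 - a for a in ts)

def invert(ts):
    """TS after reversing the direction of the first (sorting) axis."""
    return tuple(reversed(ts))

t = 2
ts_z_minus = (1, 0, 1, 0, 1)           # TS for (y+, x+, z-) in the example
ts_z_plus = flip_z(ts_z_minus, t)      # TS for (y+, x+, z+) by property (20)

# Property (20): the two TSs add up to (t-1, ..., t-1).
print(tuple(a + b for a, b in zip(ts_z_plus, ts_z_minus)))

# Property (23): inverting the TS of (x+, y-, z-) gives that of (x-, y-, z-).
print(invert((0, 1, 1, 0, 1)))
```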

Footnotes

  1. A 2D (3D) net is a net connecting instances placed in a single tier (different tiers).
  2. If the coordinate of \(p_i\) is \((x_{p_i},y_{p_i})\), \(x_{p_i} \ne x_{p_j}\) and \(y_{p_i} \ne y_{p_j}\) for any i and j \((i \ne j)\).
  3. If the coordinate of \(p_i\) is \((x_{p_i},y_{p_i},z_{p_i})\), \(x_{p_i} \ne x_{p_j}\) and \(y_{p_i} \ne y_{p_j}\) for any i and j \((i \ne j)\).

REFERENCES

  [1] Chen Yiting and Kim Dae Hyun. 2017. A legalization algorithm for multi-tier gate-level monolithic three-dimensional integrated circuits. In Proceedings of the 2017 18th International Symposium on Quality Electronic Design (ISQED'17). 277-282.
  [2] Chu Chris and Wong Yiu-Chung. 2008. FLUTE: Fast lookup table based rectilinear Steiner minimal tree algorithm for VLSI design. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27, 1 (2008), 70-83.
  [3] Gopireddy Bhargava and Torrellas Josep. 2019. Designing vertical processors in monolithic 3D. In Proceedings of the 46th International Symposium on Computer Architecture (ISCA'19). ACM, New York, NY, 643-656.
  [4] Kim Dae Hyun, Athikulwongse Krit, and Lim Sung Kyu. 2013. Study of through-silicon-via impact on the 3-D stacked IC layout. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 21, 5 (2013), 862-874.
  [5] Kim Dae Hyun, Topaloglu Rasit Onur, and Lim Sung Kyu. 2012. Block-level 3D IC design with through-silicon-via planning. In Proceedings of the 17th Asia and South Pacific Design Automation Conference. 335-340.
  [6] Ku Bon Woong, Chang Kyungwook, and Lim Sung Kyu. 2018. Compact-2D: A physical design methodology to build commercial-quality face-to-face-bonded 3D ICs. In Proceedings of the 2018 International Symposium on Physical Design (ISPD'18). ACM, New York, NY, 90-97.
  [7] Lee Young-Joon and Lim Sung Kyu. 2013. Ultrahigh density logic designs using monolithic 3-D integration. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 32, 12 (2013), 1892-1905.
  [8] Lin Chung-Wei, Huang Shih-Lun, Hsu Kai-Chi, Lee Meng-Xiang, and Chang Yao-Wen. 2008. Multilayer obstacle-avoiding rectilinear Steiner tree construction based on spanning graphs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27, 11 (2008), 2007-2016.
  [9] Lin Sheng-En David and Kim Dae Hyun. 2018. Construction of all rectilinear Steiner minimum trees on the Hanan grid. In Proceedings of the 2018 International Symposium on Physical Design (ISPD'18). ACM, New York, NY, 18-25.
  [10] Lin Sheng-En David and Kim Dae Hyun. 2019. Construction of all multilayer monolithic rectilinear Steiner minimum trees on the 3D Hanan grid for monolithic 3D IC routing. In Proceedings of the 2019 International Symposium on Physical Design (ISPD'19). ACM, New York, NY, 57-64.
  [11] Lin Sheng-En David and Kim Dae Hyun. 2019. Wire length characteristics of multi-tier gate-level monolithic 3D ICs. IEEE Transactions on Emerging Topics in Computing 7, 2 (2019), 301-310.
  [12] Lin Sheng-En David and Kim Dae Hyun. 2020. Construction of all rectilinear Steiner minimum trees on the Hanan grid and its applications to VLSI design. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 39, 6 (2020), 1165-1176.
  [13] Lin Sheng-En David, Pande Partha Pratim, and Kim Dae Hyun. 2016. Optimization of dynamic power consumption in multi-tier gate-level monolithic 3D ICs. In Proceedings of the 2016 17th International Symposium on Quality Electronic Design (ISQED'16). 29-34.
  [14] Liu Chih-Hung, Lin Chun-Xun, Chen I-Che, Lee D. T., and Wang Ting-Chi. 2014. Efficient multilayer obstacle-avoiding rectilinear Steiner tree construction based on geometric reduction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 33, 12 (2014), 1928-1941.
  [15] Lu Jingwei, Zhuang Hao, Kang Ilgweon, Chen Pengwen, and Cheng Chung-Kuan. 2016. EPlace-3D: Electrostatics based placement for 3D-ICs. In Proceedings of the 2016 International Symposium on Physical Design (ISPD'16). ACM, New York, NY, 11-18.
  [16] Nam Gi-Joon. 2006. ISPD 2006 placement contest: Benchmark suite and results. In Proceedings of the 2006 International Symposium on Physical Design (ISPD'06). ACM, New York, NY, 167.
  [17] Nam Gi-Joon, Alpert Charles J., Villarrubia Paul, Winter Bruce, and Yildiz Mehmet. 2005. The ISPD2005 placement contest and benchmark suite. In Proceedings of the 2005 International Symposium on Physical Design (ISPD'05). ACM, New York, NY, 216-220.
  [18] Panth Shreepad, Samadi Kambiz, Du Yang, and Lim Sung Kyu. 2013. High-density integration of functional modules using monolithic 3D-IC technology. In Proceedings of the 2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC'13). 681-686.
  [19] Panth Shreepad, Samadi Kambiz, Du Yang, and Lim Sung Kyu. 2014. Design and CAD methodologies for low power gate-level monolithic 3D ICs. In Proceedings of the 2014 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED'14). 171-176.
  [20] Panth Shreepad, Samadi Kambiz, Du Yang, and Lim Sung Kyu. 2015. Placement-driven partitioning for congestion mitigation in monolithic 3D IC designs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 34, 4 (2015), 540-553.
  [21] Panth Shreepad, Samadi Kambiz, Du Yang, and Lim Sung Kyu. 2017. Shrunk-2-D: A physical design methodology to build commercial-quality monolithic 3-D ICs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 36, 10 (2017), 1716-1724.
  [22] Samal Sandeep Kumar, Nayak Deepak, Ichihashi Motoi, Banna Srinivasa, and Lim Sung Kyu. 2016. Monolithic 3D IC vs. TSV-based 3D IC in 14nm FinFET technology. In Proceedings of the 2016 IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S'16). 1-2.
